So, good afternoon. I often find when I come to library conferences, and in fact many museum conferences, that I'm one of the few people who comes to talk about science. Science often doesn't get much of a look-in at the museum tech conferences I go to. So I'm very pleased that I now have a library project, so I can come to a library conference and sneak a bit of science in.

So, the Biodiversity Heritage Library. At its heart, the Biodiversity Heritage Library is an aggregator that aims to provide full-text access to scanned literature. In these couple of images you can see rare books opened up and out of the compactuses in Museum Victoria's library, and rare books are certainly very well represented in the Biodiversity Heritage Library. But there are also many, many journals, many serial titles. The journals were originally scanned to the extent that copyright law would allow, but increasingly many of the publishers and institutions that we're now going back to are agreeing to provide access to in-copyright literature as well, particularly the institutions that self-publish serials. So, for example, our museum publishes its Memoirs and the Australian Museum publishes its Records, and a number of institutional publishers are more than happy to have their in-copyright runs included. Sometimes you get restrictions: an institution might say, okay, we're still making money out of the last two years' worth of journal issues, so you can have everything up to two years ago. But many institutions are now saying, yes please, take them and put them up.

The site has been running for a number of years now. It was originally started as a consortium of 15 libraries based in museums and herbaria around the United States and in the UK. The figures are down the bottom: 88 thousand titles, and that's the overall
corpus of the whole Biodiversity Heritage Library, so over 45 million pages are now in the Biodiversity Heritage Library. And how did they get there? Getting all those pages has been done pretty much in the way that Brewster was describing this morning: getting a scanner and scanning page by page.

At Museum Victoria we've been really grateful, and very lucky, to have a team of five volunteers who do pretty much all of the scanning for us. They come in three days a week, and they do the scanning, the post-processing and all the upload. At our site, librarians make the choices about what's going to be scanned, the volunteers then do the scanning, and we've got a project team of three people working part-time to keep things running. With that volunteer contribution we've contributed, over a few years, 295 items from 74 titles (so there are a number of journals that we've contributed in part) and nearly 70,000 pages. Given that that's being done by volunteers, I think that's a pretty impressive effort.

Different organisations around the world are doing their scanning in different ways. The Smithsonian, for example, which is one of the very big contributors and one of the original founders of BHL, has what they call cottage-industry scanning like this on site, but they also wrap up whole trolley-loads of books and send them out to one of the Internet Archive scanning centres to be done in bulk.

Most of the public access to the Biodiversity Heritage Library is through a central website, whose name I didn't actually put on the slide, but it's biodiversitylibrary.org. There are also a couple of language-specific nodes that I'll mention briefly later in the talk. It was very important to us that the site be free and open. The organising principle of BHL is that it's public access: no registration would be required, no signing up, and access to the content would be completely free.
One of the principles in originally starting the project was to think about how not everybody can physically access the great libraries of the world, or even the little libraries of the world. They might be in a country that doesn't have great libraries itself, they may not be able to travel, they might be in a rural or remote area and not able to physically access a library, or, for science researchers, they might simply be out in the field and want to get the literature that they need, these days on a tablet or laptop. So one of the very important things that we've been doing with BHL is trying to develop a financially stable future so that it can remain free, and that's been a critical part of the activities undertaken by the secretariat for BHL, which is based over in the US at the Smithsonian.

I'd also like to give a hat tip to a very important partner and friend to the BHL, except that Brewster's left the room and so doesn't get to hear it himself, which is the Internet Archive. The Internet Archive acts as a really vital conduit for literature being uploaded into BHL, so many, although not all, of the volumes in the BHL can also be found in the collection in the Internet Archive. What we've done is develop a methodology where the scanning is done wherever it's done, in libraries right around the world; a custom-built piece of software called Macaw, developed at the Smithsonian, is then used to paginate your scanned images; and then the content goes up into the Internet Archive. The advantage of the Internet Archive is that they offer services such as OCR, which means that individual providers don't have to do that. All we have to do is upload the image files, and then the Internet Archive does the rest for us in terms of making EPUB versions and Kindle versions and doing the OCR.
So this particular screen shows a book that was uploaded by Museum Victoria, called A Naturalist's Miscellany, and here it is in BHL. One of the things that we really value about BHL, and about being able to put these old volumes in, is really opening up access to the images as well, and I'll talk about that a little bit later.

The Internet Archive has also helped the BHL by supplying the machinery for digitisation in the form of Scribe machines, and again Brewster mentioned those machines earlier. In these couple of images, on the left-hand side you can see one of the very first Scribe machines, which was provided by the Internet Archive to the Smithsonian library all the way back in 2007. The other image, on the right, shows Robert Miller from the Internet Archive talking to colleagues in Kenya, and BHL Africa is one of the most recent recipients of scanning rigs. What the Scribe scanning rigs do, which is a little different to just setting up a scanning or digitisation rig of your own, is that they have the software built in to upload straight into the Internet Archive, so you don't have to do so much post-processing. Once the pages are cropped and deskewed and numbered, then all you do, literally, is press a button and off they go, which makes it very convenient for passing that content through.

So, as I said, I was going to put a little bit of science into this. The original aim of the Biodiversity Heritage Library project was to help support scientists, and particularly taxonomists, to do their work of naming species. And so one of the features of BHL is seen over here, where we run a series of algorithms across each page to look for scientific names, or at least for strings of text that look like scientific names. Those scientific names are then cross-linked back where we can; there's a tiny little logo there, which is the Encyclopedia of Life logo.
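As a rough illustration of what "strings of text that look like scientific names" can mean, here is a minimal Python sketch that picks out candidate Latin binomials from OCR text. It is not BHL's actual name-finding algorithm; the regular expression, the stop list and the function name `find_candidate_names` are all assumptions made for the example.

```python
import re

# Assumed pattern: a capitalised genus followed by a lower-case species
# epithet, e.g. "Macropus rufus". Real name finders are far more elaborate.
BINOMIAL = re.compile(r"\b([A-Z][a-z]{2,})\s+([a-z]{3,})\b")

# Assumed stop list: ordinary capitalised words that often start a sentence.
STOP_WORDS = {"The", "This", "That", "These", "Those", "There",
              "When", "And", "But", "For", "With", "From"}


def find_candidate_names(page_text):
    """Return strings on an OCR'd page that look like Latin binomials."""
    candidates = []
    for match in BINOMIAL.finditer(page_text):
        genus, epithet = match.groups()
        if genus in STOP_WORDS:
            continue  # e.g. "The red" is shaped like a name but is not one
        candidates.append(f"{genus} {epithet}")
    return candidates


page = "The red kangaroo, Macropus rufus, was compared with Osphranter antilopinus."
print(find_candidate_names(page))
```

A pattern like this over-matches, since any capitalised word followed by a lower-case word qualifies, which is why candidates would normally be checked against a name index before being linked.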
The Encyclopedia of Life has a catch line of "a web page for every species". If we can find a species name in the Encyclopedia of Life and link it back into the literature, then that gives two-way links from the literature into the Encyclopedia of Life and back again.

Taxonomy is a discipline that's somewhat unusual in the sciences in that heritage literature is just as important as current literature. If you're in the business of naming a species, or looking at a particular group of animals or plants, and the animal that you think is a new species is very similar to something that was described back in the 1890s, well then it's off to the 1890s literature you have to go to make that comparison. So the heritage literature is actually very important to taxonomists.

However, as I've just recently discovered (this is a little tangent), heritage literature is important for other disciplines as well. So it's not only taxonomy that benefits from access to the literature. This paper was published on arXiv, the physics archive, and talks about how heritage papers in the physics discipline are now being referenced. What they're finding is that the more papers come online, the more the old literature is being referenced, and so it's coming back into use simply through being made more available. So that was just a little bit of an aside.

We have, as I mentioned, other important partnerships. I've already talked a little bit about the Encyclopedia of Life. The other project, which is sitting below the Encyclopedia of Life, is called BioStor. Now, BioStor is a project that's run by an individual academic, a guy called Rod Page at the University of Glasgow. What BioStor does is try to address the mismatch between the level of cataloguing that scientists require and the level of cataloguing that's available within a library cataloguing system.
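To make that mismatch concrete, the sketch below resolves an article-level citation (journal, volume, page) to a position in a scanned volume, which a journal-level catalogue record alone cannot do. It is not BioStor's code; `ScannedVolume`, `locate_citation` and the page layout are hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class ScannedVolume:
    journal: str
    volume: int
    # Printed page number for each scanned image, in scan order.
    # None marks an unnumbered image such as a cover or plate.
    page_numbers: list


def locate_citation(volumes, journal, volume, page):
    """Return (volume record, scan index) for a cited page, or None if not held."""
    for record in volumes:
        if record.journal == journal and record.volume == volume:
            if page in record.page_numbers:
                return record, record.page_numbers.index(page)
    return None


# Hypothetical holdings: two unnumbered front-matter scans, then pages 1-200.
holdings = [
    ScannedVolume("Journal of Crustacean Biology", 32, [None, None] + list(range(1, 201))),
]

hit = locate_citation(holdings, "Journal of Crustacean Biology", 32, 60)
miss = locate_citation(holdings, "Journal of Crustacean Biology", 33, 60)
```

Here `hit` points at the scan that carries printed page 60, while `miss` is `None` because volume 33 is not in the hypothetical holdings.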
So a researcher looking for references that might be relevant to the work they're doing is looking at other scientific papers in scientific journals, and is seeing the actual papers. What tends to happen in libraries is that you will get the journal: yes, yes, we've got the Journal of Crustacean Biology, it's on the shelves over there, off you go. That doesn't really provide assistance to the researcher, who doesn't just want to know that you've got the Journal of Crustacean Biology; they want volume 32, page 60. And so there was a mismatch between that and the level of library cataloguing.

What Rod's done is take bibliographies published in scientific papers and try to go back and re-articulise, if you like, or create tables of contents for digitised versions of journals. And so he's managed to put the journals back together again, and where he's managed to find articles he has fed that back into the BHL, so that now researchers can actually search on article titles rather than just on journal volumes. One interesting thing that he notes is that sometimes you get a really good idea of how popular a paper has been or, more to the point, how unpopular various papers have been. If he can't find a single bibliographic reference to a particular paper, it stays blank in BioStor. And so there are certainly gaps within BioStor where nobody's referenced a paper, and so it just doesn't appear.

I'll also just quickly mention the Digital Public Library of America. The BHL has been one of the founding partners of DPLA and has contributed to that service since it went live in 2013.

I sort of alluded to the fact that BHL is a global collaboration.
And although it was started originally in the US and the UK with a consortium of libraries in those countries, it has since expanded, really through very active efforts, particularly by the US team, to get other people on board. So BHL Europe now encompasses most of continental Europe and is run through the library in the Czech Republic. BHL Egypt is run through the Bibliotheca Alexandrina. There's a node in China. SciELO, down in South America, is a BHL node run out of Brazil. BHL Africa is run out of South Africa, but more recently Kenya has also started to contribute digitised volumes. And there's an Australian node as well, which I'm the project lead for. And the most recent member to join is in Singapore. So it really is a global initiative, even though those are very big logos on a very kind of manufactured map of the world, and there are still many places that aren't adequately represented.

One of the great things about being part of a global collaboration is the opportunity to meet with BHL partners. And I think being able to actually meet face to face at conferences and at meetings really is one of the reasons for the success of BHL. The partners have certainly learned an awful lot about each other and about different processes, and we've always been very impressed at how much enthusiasm and passion different partners bring. So on the left-hand side, our fezzes might give you a hint that we met in Morocco. Earlier this year we all met together in Australia, and next year we'll be heading off to Brazil.

As I mentioned, some of the global nodes have particular language-specific aims, challenges and issues. So BHL China is concentrating on digitising texts in Mandarin, and so they have their own website, which is in Mandarin and provides tools for researchers in Mandarin.
One of their particular challenges was doing OCR on Mandarin texts and getting back OCR results in Mandarin, and there's had to be quite a lot of tweaking of the OCR engine to allow it to cope with Mandarin-language texts.

In Egypt, as I said, the node is based at the Bibliotheca Alexandrina and it's focusing on Arabic-language texts. One of the stories that they tell about OCR is that when they were trying to improve their OCR quality, it turned out in the end that the page-cleaning software they were using was the problem. The page-cleaning software was looking over the pages and taking all the dots out, viewing the dots as just dirt on the page, and so completely changing the meaning of the words on the page. Once they kept the page a bit messier, the OCR quality actually got better.

BHL is also trying to increase its audience and reach out to other, non-science potential users. We think that BHL has reached the fairly static number of taxonomists there are in the world, however many that is, and now it would be great if other people started to use it. So there's a very active program for taking the images out of the historic texts and making them available for free on Flickr for artists, illustrators, scientists, educators, whoever wants to use them. And I'd suggest going and having a look at the Flickr site, because the number of images and the variety is just extraordinary. BHL also maintains a very active presence on social media and has a very active blog, with really interesting things like, most recently, a post about sea monsters called Releasing the Krakens.

More recently BHL has also started to look at archival material (we could call it grey material, we could call it archival material), such as scanned copies of primary archives like correspondence and scientific field notes. So they're now starting to go into BHL as well.
The big difficulty with handwritten material is that OCR is of absolutely no help to you. So the BHL, like a number of other services around the place, is crowdsourcing help from volunteers to try and unlock those records. They're putting material up into projects like this one, the DigiVol project in Australia, which is run by the Atlas of Living Australia and the Australian Museum, and crowdsourcing transcriptions. Our museum is doing the same and has started to experiment with uploading some field guides out of our collection. And a third project that the BHL is doing is looking at OCR correction, and how you might be able to crowdsource volunteers to help with that and, in fact, gamify it as well.

So I've mentioned the Atlas of Living Australia a few times. BHL in Australia is the literature service for the Atlas of Living Australia, and the Atlas's aim is to develop an authoritative, freely accessible, distributed, federated biodiversity management system. Within the Atlas there's a page for every species, and within that page you get photographs, descriptions, maps of museum specimen occurrences, names and classifications, and sound files. Soon we'll be including genetic sequences, and as part of the species profile we also have literature. So on the literature tab we find the name references found in the Biodiversity Heritage Library, with links through to the BHL. And to add in the local content we also use Trove, the National Library of Australia's aggregator, which Amanda also talked about. So we use the Trove API to bring back matches into the Atlas of Living Australia.

Species names have a very frustrating habit of changing, or rather the species names don't change, the taxonomists change their mind about species names, and so we've also written some tricky matching in there.
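One simple way to picture that kind of matching is to query once per known name and merge the results. This is a sketch under stated assumptions, not the Atlas's implementation: `search` is a hypothetical stand-in for a call to the BHL service, and the index contents and page identifiers are invented so the example runs without a network.

```python
def search_all_names(names, search):
    """Query the literature service once per name and merge the results.

    `search` is a hypothetical callable (name -> list of page identifiers)
    standing in for a real request to a literature API.
    """
    seen = set()
    merged = []
    for name in names:
        for page_id in search(name):
            if page_id not in seen:  # the same page may match several names
                seen.add(page_id)
                merged.append({"matched_name": name, "page": page_id})
    return merged


# Names that have been applied to the red kangaroo at different times.
RED_KANGAROO_NAMES = ["Osphranter rufus", "Macropus rufus", "Megaleia rufa"]

# Invented index, for demonstration only.
FAKE_INDEX = {
    "Macropus rufus": ["page/101", "page/202"],
    "Megaleia rufa": ["page/202", "page/303"],
}


def fake_search(name):
    return FAKE_INDEX.get(name, [])


results = search_all_names(RED_KANGAROO_NAMES, fake_search)
```

The design choice here is to skip any attempt at resolving which name is currently accepted: every name is searched, and duplicates are dropped by page identifier.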
So for example the red kangaroo, which you would think wouldn't change its name very much over time, has in fact changed its name a number of times over the years. So what we do is look for all of those different species names when we send out a call to BHL (think back to the BHL website, where you had all those names listed). Synonymising those names gets very complex, so what we try to do is just bring back matches to absolutely everything we can find.

So BHL in Australia provides a digital literature service for the Atlas, as I've described. We also do the new scanning projects, as I've described as well, and we are active partners in the global partnership. We've just started a new project over in Australia, and one of the major aims of that project is to assist other institutions who want to start doing their own BHL project. We'd really like to extend the collaboration, both within Australia and to New Zealand, if New Zealand would like that. We've had a number of inquiries from institutions in New Zealand who've talked about the desire to perhaps get a BHL node up in New Zealand. We would not at all presume to say that we would do something for you, but what we would like to extend is the offer of providing any assistance that we could if libraries within New Zealand wanted to band together, or even go separately, and start a BHL node over here. I think there must be an awful lot of valuable scientific literature over here.

I did a quick search in the existing Biodiversity Heritage Library for New Zealand, and what we find is what we found when we started the Australian project, which is that there's already a lot of content in there relating to New Zealand.
This is because the project, when it started, was done by 15 really big libraries, mainly over in the US, and to the extent that copyright would allow them to, they just scanned whatever was on their shelves. So there's Australian literature, there's literature from countries all around the world, including a lot of content from New Zealand, so there's already a good base to start from. And I think I'll finish with that offer: please come and talk to me if you think there might be any opportunities, and we'd be really happy to start a discussion. So thank you very much.

Thank you, Ely. So we have a bit of time for questions. Does anyone want to ask one?

You mentioned that you contributed 74 titles. I believe there's some international master list that dishes out who's doing which titles, and you would put your hand up to do one. How does that coordination and prioritisation work?

One would hope that there was a master list. The Europeans were working on a master list until Europeana funding ran out, and so now the master list has kind of gone by the wayside a little bit. We did actually get a BHL Australia bid list up and running, but then we found that we didn't have enough institutions who were bidding, so that's also fallen out of use. Now it kind of comes down to this: if there's a title that we want to have, we look to see if it's already there, and many times it is, particularly if it's serials, less so for monographs. And then we talk to the US team, who are now kind of managing the master list. There's not so much scanning being done in Europe now; they're still working on how to actually make the aggregator work and how to make it work with Europeana. So the main scanning is still being done in the US, with the specific-language scanning being done in other countries.
So it's Africa, Australia and the US who are doing English-language scanning, and the three of us talking between ourselves is kind of how the bid list operates at the moment.

That's right. Absolutely.

Do you have your own platform in Australia, or are you using the American one and uploading everything there? How many actual separate platforms are there?

Separate platforms. So what ended up happening was that Australia built our own website and our own platform. The Atlas of Living Australia ran out of money at one stage, so it couldn't fund satellite projects. Also, when we put up the Australian platform the US team got a bit of platform envy and all said, oh, this is so nice, it's so nicely designed. So one of the last things we did before the funding ran out was to merge the US and the Australian platforms. The design of the website that you see is actually the design of the Australian platform, now placed on top of the US system. So in terms of how many platforms there are: Africa, Europe, the US and Australia all contribute into the central platform. Brazil maintains its own, Egypt maintains its own, China maintains its own, and Singapore's only just starting, so I'm not quite sure what decision they're going to make. What we are trying to do is encourage people, even if they've got their own platform, to make sure that the content's available in both places, so that people don't have to understand quite so much that, oh, if you want this particular journal you've got to go to this one, and if you want this one you've got to go over here. So no, there's no separate Australian platform any longer.

What information do you have about your users? That could be expected to be a scientific audience, but has it gone wider?

Certainly with the Flickr group it's gone much, much wider than that.
Ever since they started putting the images out, what they've found is that it's really the images that a wider audience is fascinated by. But then that brings in historians as well. Some of the literature goes back to the 18th and 19th centuries, the era of the Wunderkammer and the era of the gentleman naturalist, and just looking at the language of how things were described back then is, I think, quite a rich trove for historians to mine and research.

That's it, so please join me in thanking Ely again.