Thank you very much, and thanks for the invitation. It's a great pleasure to be here. So, as David said, I'm Simon Chaplin. I'm the director of Culture and Society at the Wellcome Trust. I was until about a year ago the director of the Wellcome Library. I'm also an idiot. And this is my guide to how, or in some cases how not, to build a digital library. So, this is my idiot's guide to building a digital library, and how to use a digital library to engage with communities of users, both new and existing, based on the journey that my team and I have had over the last five years, the work that we've done. And I'm going to make clear at this point, not least because some of my team are in the room, that although I am an idiot, they are not idiots. In fact, they are extraordinarily talented, experienced, and imaginative. They make up for my deficiencies. They're also blessed, for the most part, with extraordinary patience. And those who are not blessed with extraordinary patience are blessed, for the most part, with tactful silence. So, I can also say that I'm a lucky idiot. When I joined the Wellcome in 2010, I had about 20 years of experience, but my experience was all in museums. I hadn't been a librarian or an archivist. So, in some cases, I was an odd choice to take over that role. I had some background in the academic history of medicine, which gave me some affiliation with the kinds of users the Wellcome Library had traditionally served. I had also had experience, particularly in my previous role as director of museums at the Royal College of Surgeons, of taking a neglected collection and trying to find new ways of engaging audiences with it. This is the Hunterian Museum at the Royal College of Surgeons, a fantastic collection that went from being a niche academic resource to one that appealed to a much wider range of users. So, I guess that experience of engaging audiences was one of the few things that I could bring to the Wellcome Library.
What was the challenge that we faced, though? Well, for those of you not familiar with the Wellcome Library or the Wellcome Trust, it's worth saying a little bit more about both of them. This is Henry Wellcome, our founder. He was also not an idiot, although sometimes he dressed a bit like an idiot. He created the Burroughs Wellcome Drug Company, later the Wellcome Foundation Limited. He gifted the entire shareholding in that company to create the Wellcome Trust. Over the years, the Wellcome Trust has divested its shareholding in that company and now has an extraordinary range of investments, totalling about £18.5 billion. That allows it to spend around £850 million to £900 million a year to support its philanthropic mission. That mission is to improve health. Wellcome exists to improve health. Beyond that, we believe in the power of research to transform health, particularly research in the basic sciences, and we believe crucially that research should be embedded in its cultural and social landscape. It shouldn't be seen as other to it. We need to understand the cultural context of health as well as the bioscience behind it. That really has shaped what we do, giving us the range of activity from basic research through to translation and innovation, through to public engagement and supporting research across the humanities and social science, not just through research grants, but also by supporting the resources that researchers need. Sue Crossley, who heads up our research resources scheme, will be known to many of you and is here at this conference, and has demonstrated Wellcome's commitment to working with the library and archive community to make collections accessible for use by all kinds of audiences. The scale of what Wellcome created has given us a huge opportunity.
In the context of the Wellcome Library, it's allowed us to build not just a fabulous library, but a fabulous collection, one which spans not just the kinds of things you might expect to find in a library interested in medicine, but a library that covers all aspects of health in its broader cultural context, and not just in the context of European understandings of health but global understandings of health. Our collections are extraordinarily rich in terms of their physical content, and added to that increasingly is a large and rapidly growing body of digital content, currently about 15 million images. Although we had an argument on the train coming up: one of my team thought it was 19 million, one thought it was 15 million. What's 4 million images between friends, we thought? That gives you a sense of the scale of what we're doing in terms of transforming our interest into digital alongside our long-standing commitment to physical collections. One of the things that's changed, though, for the Wellcome Library, and one of the things that created the circumstances for my appointment, was its immediate context. The Wellcome Library is now part of Wellcome Collection, this amazing cultural venue on Euston Road which was created by the Wellcome Trust in 2007 in order to create a new space for engaging audiences with bioscience, health and medicine, trying to tap into a market that we felt wasn't really being served, particularly a young adult audience that was being ill-served by the existing cultural provision in London. When we opened Wellcome Collection, we thought it might get 100,000 visits a year. In fact, within three years it was receiving 300,000 visits a year, and those numbers only kept on going up.
It was a fantastic success for us but also a challenge for us, partly because of the constraints on space, but in the context of the library because it also highlighted the fact that although Wellcome Collection had been through this step change, the library in many ways hadn't. Although we had very healthy visit numbers, about 30,000 readers a year coming in 2010, there was a sense that somehow the library was stuck in ways that the rest of the venue wasn't. Where the rest of the venue was being innovative and disruptive, the Wellcome Library still felt rather traditional in the way in which it approached its mission. So the challenge we had was how we might achieve a step change in the Wellcome Library as well. Perhaps it won't surprise you that we didn't really at that stage think about the physical redevelopment of the library, although we have since physically redeveloped the library; in 2010 it still felt too close to the reopening of Wellcome Collection for us to leap into action in terms of building work. Instead, perhaps naturally, our thoughts gravitated towards the digital. And in particular, like many of you I suspect, we had discussions with our board of governors, our governing board, in which they demonstrated what I guess we could call the Google complex. Why don't you just put all of this stuff online? I think probably some of you have had those conversations. It seems extraordinarily simple, particularly to people who have no experience of putting things online. So that was the context that we started with. Why don't we just digitise all the things? Now the difference, of course, between working at the Wellcome Trust and working in many of your organisations is that not only did our board say, why don't you digitise all the things, they also said, here's some money to go ahead and digitise all the things, which is a fantastic situation to be in, a wonderful opportunity.
And I don't think I've ever stopped being grateful for working in an organisation where I don't have to go out and fight for funding day after day. I've been in that environment, I can't tell you what a pleasure it is not to be in that environment, and I think everyone who works at Wellcome appreciates how lucky we are to be able to do that. So when we started this journey, we not only had this injunction to digitise, we also had a fighting war fund of £20 million to help us achieve a transformation in the Wellcome Library. And the ambition we had was that we would try and digitise 50 million pages of content. Initially we said we might do that within three years. It was a hopelessly ambitious and unrealistic target. But there is something also quite exciting about taking on a hopelessly ambitious and unrealistic target, in the sense of it opening your mind to what you might do. It's interesting, though, that despite the great opportunity we had, and the relative ease we had in terms of being given this big sum of money, how hard it actually was to go about delivering against this ambition. And this perhaps won't surprise you, but I think perhaps it is useful to reflect on all of the obstacles, all of the challenges and all of the mistakes. And in particular what I want to do in the rest of this talk is to reflect on some of the mistakes that I made as the person charged with leading this project. Because you'll hear a lot over the next few days about successes. You'll hear a lot over the next few days about great projects. You'll hear a lot over the next few days about all the things that libraries and archives are doing well. You probably won't hear so much about all the things that don't go well and all of the mistakes that we make, and yet we all do make mistakes.
And some of those mistakes actually turn out to be significant ones, because they change the course of the projects that we lead and they impact on the ability of those projects to disrupt and innovate and change the ways in which we work in more fundamental ways. And so the things I'm going to focus on are the things that I now wish I had done differently at the start of the project. And this is not to criticise any of those involved in delivering the project, because they've done a fantastic job and there is some amazing stuff that we've achieved over the last five years. But it could have been different and it could have been better. I don't think there is any project where we would not, with hand on heart, stand back at the end and say we could have made that better if we hadn't made that mistake. And learning from mistakes, I think, is an important part of this culture that we're entering into, where we are taking on new and experimental ways of working. I should say that we didn't enter this world as complete novices. I may have been a novice, but my team was very experienced in delivering a number of projects on quite a large scale, although nothing quite like the scale that we were about to start. So we had created one of the first digital image libraries, Wellcome Images, using our biomedical image collections. We had been instrumental in creating UK PubMed Central, now Europe PubMed Central, and digitising back files of journals with the National Library of Medicine. We digitised our film collections with funding from JISC, and also created a microsite in partnership with the Bibliotheca Alexandrina in Egypt around our Arabic manuscripts, again with support from JISC. We'd done some really great projects, we thought, but these were all in many ways not joined up and not transformative. They were piecemeal attempts to pick out bits of our collection and put them online, but they weren't creating a vision of a digital library.
So what we were trying to do in this next step was to go beyond that and to create something that was fundamentally transformative in the way that we worked, and that was quite a scary prospect. It was a scary prospect for me, it was a scary prospect for the team and, surprisingly, having committed their funding and their backing to the project, it turned out to be a scary prospect for our Governors too. So having said, yes, go ahead and do this, they then realised quite what they would be taking on, particularly in terms of IT challenges, and they scaled us back. So rather than giving us complete free rein, actually we had to negotiate quite hard around what we could do and what we were trusted to deliver in the initial phase of the programme. That itself was a constraint, and one that I now wish I had pushed back harder against, and found different ways to demonstrate our capacity and ability to deliver these projects, but it did turn out to be a constraint. So what did we do? Well, first of all we fundamentally re-engineered a lot of our architecture, although not as fundamentally, in hindsight, as we might have liked. So alongside our existing systems (CALM, our archives database; Sierra, our library management system; and our discovery platform) we had to create a new workflow system, we had to create a new digital player and the middleware to get stuff out of our repository, and we had to convert a repository, Safety Deposit Box (now Preservica), which was designed for born-digital content, into a repository for digitised content.
This was an extraordinarily complex project. We tried to simplify it by, in many cases, using existing building blocks, things that were there already. So we did a lot of building around our existing databases and around Safety Deposit Box rather than trying to re-engineer. We decided to use Encore, our discovery platform, Innovative Interfaces' system, rather than bringing in something new as a discovery platform. So these were both necessary steps to de-risk the project, but they also introduced a number of constraints which in hindsight we might regret. So in terms of re-engineering Encore, we worked hard with Innovative Interfaces to build into it things like full-text searching as part of the discovery layer. This is now integrated into our implementation of Encore: you can search across the full text of all the items that we've digitised. We also built into it archival hierarchies, so that you can see in a library system what the archival structure would be. This is a necessary consequence of trying to shoehorn your archive content into a system designed for a library. It's still strange that no one out there seems to be designing systems that work equally well across archives and libraries. Maybe that will change in future, but it certainly wasn't available to us at the time. We also had to build a player to show this stuff. This is the open source player that was built for us by the software developers Digirati, who became our lead development partners in all of this. And we had to completely re-engineer our websites, so procuring a new content management system and redesigning a website as a fully responsive site, a huge amount of work, not least because when we started this project we didn't even have a library web manager within our team. Across the Wellcome Trust we had a very limited web team with, for example, no user experience and no information architecture within that team. So a lot of key skills that in hindsight we would recognise as being necessary were missing from the mix when we first started.
We also needed to digitise some stuff, and perversely we started with the harder stuff rather than the easiest. "Always go for the low-hanging fruit" should be the motto of any digitisation project. Instead we went to the top of the tree. We started with a set of archives and books, books that were in copyright and archives that were mostly from the 20th century, all dealing with genetics, all containing some quite dense subject matter. Not the obvious stuff for public engagement, but of strategic interest to the Wellcome Trust and to our governors. We chose this for them and not for us or our users. That was a mistake. Again, in hindsight you think, didn't we push for something else? We did push for something else. I didn't push hard enough. I didn't have enough confidence in the decision that had been made to back a Wellcome digital library to say, actually, if this is going to work, we shouldn't be starting with the hard stuff, we should be starting with the easy stuff. I wish now I had done that. I don't think it would have made a huge difference, in that we built something very successful, but it would have made our lives a lot easier to be able to do that. We did add more content in. So having kind of bitten the bullet on genetics, we then got licence to do so, not least because of funding, again from JISC. I can't say how grateful we have been to JISC over the years for funding into the Wellcome Trust. It is one of the rare organisations that gives money into the Wellcome Trust rather than the Wellcome Trust giving money out, but actually the money from JISC was instrumental at many points in leveraging change, not just within the library team at Wellcome but within our governing board as well. It just shows what external stakeholder support can do to change people's minds, even in an organisation like Wellcome that prides itself on its independence. So with JISC support we digitised the Medical Officer of Health reports for London. We also worked with ProQuest, and are still working with ProQuest, to digitise our early European
printed books, a commercial project from which we are not deriving any revenue, unlike many other partners of ProQuest, but in which we negotiated their commitment to digitisation and, in return, rights for us to make some of that content freely available in the UK and worldwide from the start. All the content is freely available in the UK and in the developing world, some is freely available to the rest of the world, and all of the content will become freely available at the end of the licence term we have agreed with ProQuest. We did that commercial project and, with our commitment to open access, again it was instrumental in buying in some support from our Governors, for the sense that we were open to all kinds of ways of tackling digitisation rather than just spending Wellcome's money to deliver it. It also meant that strategically we could digitise a collection that might otherwise have been thought a low priority for us. If you are thinking about genetics as being a high priority, early European books will be at the other end of the scale. You can argue about whether that sense of priorities is right or wrong, but they were the ones that we were faced with within our organisation. So we were able to add some richness to the content set. More recently we have started work on another collaborative project, this one the UK Medical Heritage Library, working with ten other partners across the UK to digitise 19th century books. This is working in partnership with the Internet Archive, again funded via JISC from HEFCE, so a big injection of funds to produce this collaborative project. And this, I guess, is perhaps more reflective of where we might otherwise have started the project: starting with the easy stuff, out-of-copyright books with no sensitivities, which can be mass digitised at a relatively low unit cost, delivered not just online through the Wellcome Library but through a whole range of different partners, easy to access, and creating a vast body of digitised content on which you can start to build
some interesting engagement tools and applications. So we came late to the thing that we might otherwise have chosen to start with. In terms of engagement, we did produce some interesting ways of trying to make particularly the genetics content interesting to a range of users. We produced an open source timeline for genetics, but a tool that can now be repurposed for any form of content. And with the Medical Officer of Health reports we also developed tools which were designed to suit the needs of different research audiences, particularly around searching and downloading data sets and marking up things like tables, so they could be exported as, for example, CSV files to allow people to run data analysis on tables contained within the original reports. So as we went along we gradually learnt about how we might start to do more sophisticated engagement activities around our collections, both with public audiences and with research audiences. And most recently we've delivered two of these scrolling narratives. These are broadly based on the Snow Fall project from the New York Times. These are scrolling narratives that explore themes using digitised content from our collections. The first one is called Mindcraft; it looks at mesmerism and hypnotism in the 19th century. The second one, called The Collectors, looks at a whole range of different sub-collections within the material that we look after. Both of these were joint developments and went out under the banner of Wellcome Collection, so very specifically aimed at a public audience rather than at a research audience. And that was again a lesson for us about understanding who our core audience was as the Wellcome Library versus who the current audience might be for Wellcome Collection, this broader body that we're part of, addressing a wider public audience. And again I think we made mistakes early on in the process in not being clear about exactly who our users were and who we felt best able to reach through our programmes, and which
audiences we were best to leave to other people to develop programmes for engagement on our behalf. That also impacted on the tools that we were developing. So we didn't always approach this project with the idea that we might create platforms on which other people might create engagement tools. We started with the assumption that the engagement would be things that we would deliver ourselves, and we therefore ended up with a kind of vertical silo, content at the bottom, engagement at the top, but very hard for other people to break into that silo, and very hard for us too to bring other sorts of content into that silo, particularly content from outside the Wellcome Library. So what are my key regrets from all of this? First up, I think my biggest mistake was not creating our digital repository initially as a separate walled garden. We set out, for very good reasons, to try and integrate from the start our digital content with our physical holdings, to have a single point of search across all of our library and archive collections so people could find anything, physical or digital. That is still our long-term goal, and I think it is still the right goal for every library and archive to have, but it does introduce some significant compromises from the start. Because by working with your existing systems, and trying to make sure that your systems can deliver all of the functionality for a physical library or archive alongside the functionality you might want for a digital archive, it makes it very hard to create the kind of innovative, agile development environment that you want if you're to spark true innovation. It's very hard for people to play with things if they might break the library management system at the same time. We didn't create the right sandbox for our team to play in. We're starting to address this now. So over the last year our team have done some really interesting projects, not least an agile sprint over the summer called What's in the Library, working with a team of
external developers to create some really imaginative, creative and very quick and dirty tools to expose what's in our library collections, to work outside of the discovery system and the library management system, and to see what else could be done with the data. But this could have been done four years ago. We did not create the right environment for that to happen. Now there is a risk in creating a walled garden, and that was the risk that we saw at the start, which is that if you create a walled garden, perhaps it only ever stays in the walled garden: digital is always separate from your core business, it's never disruptive, it never transforms what you do. And there is some evidence to support that. So the New York Public Library, for example, whose NYPL Labs has been a fantastic place for stretching the boundaries of what libraries can do, has recently chosen to go with Encore as its discovery platform. So having invested all this time and effort in developing new ways of discovering and surfacing its content, it has gone back to the same system that we've been using for the last four years, and disappointingly they say they're doing that because they think that's the most flexible of the discovery platforms out there. So there's a huge pressure, even if you do start with the walled garden approach, of then being able to force that innovation back into the rest of your work and making sure that digital does become transformative and disruptive in the ways that it can be. Secondly, I don't think we were very good at sharing what we had to begin with, and I don't think we were very good necessarily at thinking about the role of users in our design, and those two things do go together. In terms of sharing, it's making sure that the content that you digitise and expose is available to be used by others and can be linked to other content. So recently we've changed the system so that we can in fact introduce linked open data, expose our collections as linked open data, but we didn't do that
from the start. We didn't look, other than by bringing content into the library repository, at how we might link out to other people's collections. We didn't create a useful disconnect between the repository layer, all the stuff that we had and all the stuff that we brought in, and the engagement layer, all the stuff we wanted to do to target different audiences, that might have allowed other people to play in our space. And had we done that, I think it would have allowed us to focus more specifically on the users that we felt we knew best, and to make sure that the engagement that we designed was very specifically targeted, knowing that other people could target engagement activities at other audiences. So by creating a useful disconnect between the base layer of your system and the top layer, the engagement layer, you allow other people to target audiences in ways that you might not imagine, and you make sure that when you think about content, you're not saying that what you do in the digital sphere simply reflects what you hold physically in your collections, but that it relates to all the stuff that's out there that you might actively curate on the part of your users. So again I feel that that was a mistake that I didn't recognise soon enough: the idea that there might be other people who would deliver the engagement for us, that what we should do would be to create the platform for them, and that crucial to that would be a trust in their understanding of user experience and our creation of the tools to allow them to easily access, share and link to our content. And thirdly, it's around the content choices. I've said that our choice of genetics did constrain our ability to engage people. We later shifted, so as part of the work with Internet Archive we have looked at digitising books that link into the exhibitions that are going on in Wellcome Collection. Again, it seems really obvious in hindsight, but that wasn't baked into the programme from the start; it was something we had to engineer in later, so looking at
how we might digitise particular sets of content knowing there are particular groups of users or particular interest in it, but doing so in ways that don't involve creating lots of microsites with their own mini silos. Again, a mistake that I made early on in not seeing the potential for these collections to reach new audiences and to build the kind of broad engagement that we might want. So how to conclude? I'm not going to say that our project's been a failure, because it hasn't, and I think it would be doing a disservice to everyone who's worked on it over the last five years to imply that we haven't achieved some amazing things. But I think it's also important to reflect on what's not gone right and on the mistakes that we make, because these mistakes will inform the decisions we make in future if we acknowledge them. I don't think any of us are in a position yet where we can say that we've realised, even partially, the potential that digitisation offers us to reach out to new audiences, and part of that is because we are very constrained by our legacy: our existing ways of working, our existing systems, our existing commitment to delivering physical library and archive services. We have to find ways of allowing digitisation to be disruptive and to transform the work that we do. We had hoped to do that in the Wellcome Library. We have done a bit of it in the Wellcome Library, but not nearly enough, and that's because we made some poor decisions. I made some poor decisions early on in the process. But now that we've recognised the potential, we do need to focus on what we can do to create platforms that are open and shareable. How can we avoid the need for organisations to build from scratch, each time, the infrastructure they need to deliver not just digitisation but public engagement? You're going to hear Robert Kiley say a bit more this evening about what we might want to do in that space, but there are others in this room who I think can also play an important part. Why don't we have in the
UK the equivalent of Internet Archive? Why is there no equivalent of the Digital Public Library of America? Why is there no equivalent of Trove? Where are the platforms that allow us as institutions not to have to develop our own infrastructure, but to pour content into a common infrastructure and then, by looking at how we might understand our audiences, to invest our energy into creating really exciting and innovative ways of engaging new communities? I think only by looking at this bigger picture, and realising the potential for shared services to transform the work that we do, will we achieve the potential that digitisation has to offer us as a way of engaging with our communities. Thank you very much.