I'm talking more about metadata and other things. I don't have any nice pictures of content to show, unfortunately, but it's really nice to be at this conference and hear the emotional connection that people have with collections. [inaudible] It's going to be a new national service that we're putting together, and it's probably going to take about two years to fully develop to a level at which it should be able to deliver benefits to the community. It's taken quite a long time to get to what this thing is. This is my best shot at one sentence: it's about aggregating bibliographic data at scale and linking it with some other data sources to turn it into a sort of genuine knowledge base. If I could just run the animation, please. This is what we've just come up with to try and get people excited about what its vision is. It's only 90 seconds long. By harnessing the latest digital technologies and practices, we're designing and building a national bibliographic knowledge base for the future. [inaudible] ...on factors like historic usage and the commonness or rarity of items that are held by libraries right across the UK. The NBK will be a genuine knowledge base combining data from various sources. It will also enable contributed library data to flow into other systems and appear in global search engines.
This exciting collaborative partnership is an important part of building a national digital library. To discuss [inaudible], contact help.copac@jisc.ac.uk. Jisc: more power to you. Right, thanks for that. So, this is the context of what I'm talking about, this type of metadata. This initiative comes out of a report we commissioned a year ago, and these are some of the recommendations. The primary use case really is about supporting collections management. But there's also a resource discovery element within these recommendations as well, which is basically the second set of these six recommendations. And this is the reason that we need to be thinking through this notion of whether we can share this metadata as widely as possible. I can't read through these, but basically one of the questions that I've been asking myself and asking others, and have been convening groups to discuss over the last year or so (there have been many questions in relation to this NBK), is this: can we assert an open licence for all of this bibliographic metadata, and why would we do that? Is it feasible? I have kind of come to some answers in my head from most discussions. They're possibly not the ones that I would like, so I'm still asking the questions. I'm still wondering whether it's feasible, but let's take a step back, in case it's not self-evident why we'd want to assert open licensing on this data. Let's just ask these questions. Why would we do it? Who would benefit? And what is actually stopping us from asserting this level of licensing? So why should we do it? Perhaps this is familiar to people, I don't know. These are the five-star steps towards open linked data. This was very much all the rage a few years ago. I think linked data and that whole notion ebbs and flows a little bit. This was certainly going around the conferences in 2011, 2012 or so.
It's got five steps towards the holy grail of this open linked data. And I think there was some almost religious fervour about this a wee while ago. I think that's calmed down a bit now, and I'm hearing people say, well, let's not completely focus on that whole five-star open linked data; let's just think about linked data in all of its formats and the ways we can do it. But I think it is useful to have these steps, in terms of what they enable us to do. The step down from linking data is using uniform resource identifiers, and I think this is really important. I'll go on to that a little bit later. The open format and the structured data allow various other things to be done with the data. And indeed, we've got that with library MARC data. It's obviously an open format, and it's very structured, so that should stand us in good stead. But obviously what I'm talking about here is that first step down at the bottom there, the open licensing. I'm not sure whether the steps strictly build on each other, but it's obviously going to be the best foundation we could have for achieving the kind of distribution of data that we would presumably prefer, not just for discovery and linking, but for pushing the data in any direction it wishes to go. We took a view, and take a view, about the NBK that really if commercial providers want to get hold of the data, then they should be able to, and this data should surface for users wherever users are. And what's the prize here? Well, this gets to the title of my paper, about linking communities together with metadata. This screen here is a screenshot from some work that Jisc, the Archives Hub and Copac did a few years ago. It's basically bringing data from those two services, from the library side and from the archive side, and it's allowing people to push data together, mix it up and slice and dice it.
And this is a mock-up, a suggestion of how that data could be brought together. So I like this phrase, the recombinant possibilities of data. I think once you've got it in these atomistic parts and you can start linking and doing some clever things with it, it could really drive some novel research, and certainly some novel ways of visualising the data. So this is where we could go if we did carry on this investigation as to how open we can make this data. And this is just an acknowledgement of the question: are we still talking about MARC data? It's been around for decades and decades, obviously, in different formats, and we are aware within Jisc that bibliographic formats are changing and will change, and we would expect the service provider that we want to work with to deliver the NBK to be very alert to these new developments and new formats coming along. So that second question: who would benefit? I think perhaps it's fairly obvious who might benefit if we can, as I said, slice and dice that data in various ways. Well, at the moment there's lots of research, lots of evidence to suggest that end users, perhaps particularly graduates and undergraduates, are looking very much towards Google and web-scale search engines and commercial library discovery services, and they're not really looking much further than that in terms of where they go for their data and what they rely on. And of course, at a large scale, commercial entities are very well equipped to drive users towards other commercial entities. So what we're really hoping to do with the NBK and with this focus on data is to clear some of that opacity in these systems and really try to bring out library data, and by inference archival data perhaps as well, and make it more visible on the web. This is also something that I know OCLC have been talking about a lot recently; they talk about this library-shaped black hole on the web.
But if we can manage to make this data more open and push it around a bit more, we assume, we hope, that we can get users to engage more with library data, to rely less on having to buy things, students having to buy their own copies, and to make more use of what libraries are obviously there for, which is to push resources around and allow borrowing and referencing of resources. Also, if we're looking at open data, we would expect that to have an impact on discovery services, and there's a very interesting initiative at the moment that is being pushed along by EBSCO and other organisations called FOLIO, the future of libraries is open. So that's hopefully going to promote and support this notion of more open discovery services. Along with that, we would hope that a richer ecosystem of library bibliographic data will also include pointing and directing people to open access books, which aren't really that visible in commercial discovery systems at the moment. I would also carry on this theme of linking across communities and allowing the library data to hook up with other forms of open data, such as the data that Europeana and the European Library are interested in making available. So I think that sets out who might benefit and why we might do it. So the question perhaps is, what is stopping us? Why can't we just simply say, okay, it's library data, a lot of it is quite factual, and you can't claim rights over factual information? However, there's a lot of other information that creeps into the data ecosystem, as I'm beginning to call it, that entities have put effort into adding to those records, and there are commercial interests around MARC records. It is a business, and although one could argue that Jisc might be or should be seeking to disrupt business models that don't work, I don't think Jisc is really in the business of breaking businesses. So, yeah, [inaudible].
80%: I know this is true for the British Library, perhaps others as well, that 80% of their records they acquire or purchase from commercial data suppliers. And I believe that figure for records is right in many cases as well. And these are some of the organisations, of course, that supply these records. But it's not just that it's a commercial business. It's also quite hard to really understand and drill down into the provenance of these records. And Jisc and RLUK kind of add to the complexity of that: we make MARC records available to RLUK members. The Library of Congress make a lot of records available to some of these suppliers and to libraries. The British Library have a role in this ecosystem as well. So there's a lot of interchange of data, and records get added to incrementally as they go through these different organisations and are handled by them. And after a while you see records and it's, as I say, very difficult to know how some of these records have really come into being, where they've passed through and what processes they've been through. So I guess if we've come to the conclusion that it's perhaps not easy at the moment to see a way of confidently and decisively asserting an open licence for these records, what can we do to reduce what are often referred to as data quality issues in the bibliographic data? It might also be characterised as a sort of data inefficiency, in as far as there's a great deal of duplicated effort going on across libraries, with cataloguing going on over here and cataloguing going on over here. So yes, how can we perhaps increase efficiency and reduce that duplication?
It seems likely that the most promising way forward, or a slightly different take on how we can make sure that organisations, institutions and libraries are relying on a single canonical source of information, is to focus on this notion of identifiers and the use of identifiers. Some of these frameworks have been around for a long time. This is VIAF, and I put in the search term JISC and got this back, the Joint Industry Stunt Committee, which is quite interesting. I sometimes feel, working at Jisc, like I'm stumbling around on fire like this poor chap down here. But I think it is useful to be able to differentiate between us and these stunt people. So VIAF brings together a variety of identifiers. Oh, and I should talk briefly about URIs, for anybody who's not familiar with them. URIs, which featured in the fourth step of that linked data diagram, are essentially a way of asserting uniqueness for something, or someone: in this case me, a green picture of me. I've taken this URI from OCLC, who were kind enough to uniquely identify me. Or it could be a place: this is Oslo, this is their identifier for Oslo. So it seems to me, and I think to others as well (the British Library are very instrumental in pushing ISNI, the International Standard Name Identifier), that we need to get very engaged, particularly with this scheme. Jisc is doing a lot of work with ORCID as well; that's a researcher identifier scheme. And in fact ORCID is a sort of subset of the ISNI framework that deals with standard names, names for individuals, so names for researchers, and also names for organisations over here. And we will be keen to push the service provider that we're working with on the NBK to very much engage with this and other identifier schemes, so that we can try to build momentum towards reducing duplication as much as possible. So, yes, is this feasible?
Well, perhaps not right now in terms of that licensing question. I haven't given up on it, as I said at the start. It's possibly not the answer I was hoping for when I started asking these questions of various groups. But nonetheless I think there's plenty to be doing in the meantime just to get this system built. Obviously in the first phase it's going to be about reaching out to libraries and trying to aggregate as many different data sets as possible, catalogue sets from libraries. At the moment we just run the Copac service, which brings together data from about 90 libraries. For the NBK, we're looking to increase that dramatically. We want to work with over 200 libraries, and over the next couple of years we'll be reaching out to academic and specialist libraries in the UK and asking you to participate and help us build this system. Whereas Copac in the past has, by necessity, had to throttle the way that it engages with libraries, we want the NBK, working with a large-scale service provider, to open this up and be able to aggregate much more data. Okay, and I think that's me done. Thank you.