First thing, I bring greetings from Carnegie Mellon. Just by way of introduction, the Times Higher Education world rankings place us in the top 25 universities in the world; therefore they are the default university rankings that we should all believe in. We have particular strengths, as many of you may know, in computer science, engineering and technology, and in the fine arts. The university took its roots from its foundation as the Carnegie Institute of Technology, founded by Andrew Carnegie in 1900. Fifty years ago this year, the institute merged with the nearby Mellon Institute of Industrial Research to form the university we know today. It has strong Scottish roots. The photograph, taken outside my office, represents some of the Scottish traditions and sounds that you will encounter on campus most days. There's a requirement in the faculty handbook that anybody from CMU speaking in public does so with a Scottish accent. It's taken me about four years to shake off my Wisconsin roots, but I'm kind of getting there. So, a couple of years ago (you're going to be a good audience), I was asked to contribute a section on libraries to the university's strategic plan. As we talked through what that might look like, we came up with the notion of the 21st century library. I'd been to enough conferences labelled '21st century library' to know it was a thing, even though I couldn't quite describe what it might look like, which is what everybody asks me. What does the 21st century library look like? I truly don't have a clue.
But what I do believe is that it marks a shift from our role as the campus community's primary information provider to something different. The reality, particularly in a university focused on the disciplines I mentioned earlier, is that much of the information content required by our students, our faculty and our researchers exists on the network. They don't come to the library to acquire it. Frankly, I suspect they find Sci-Hub a faster way of getting to things than coming through our channels. Nevertheless, we have really become a rather boutique offshoot of the university's procurement office. So if you take our role as information provider as something that we have on life support rather than as our critical mission, what is the role of the library in the 21st century? That's what we're trying to address through a number of initiatives that are underway, and one of those is what we're going to address in this presentation. I'm glad to see Lorcan Dempsey sitting there. He has written and spoken on a number of occasions about the shift of the workflow in the research enterprise: the recognition that in the 1990s and before, researchers built their information workflow around the library. They came to our buildings to work with librarians, to access content, to keep up to date. Today, most researchers have an information workflow that exists entirely outside the library. One of the challenges for us is to ensure that we integrate with the researcher workflow. We need to make sure that our services, our tools and our technologies fit into the way that today's researchers work in the online, networked information ecosystem. As we think about how we might move forward, we recognise that the skills that made us successful as information providers have immense relevance in today's world.
But we need to keep on top of the changes in the environment and understand where our skills might align, so that we can do things for our researchers more effectively than anybody else in the marketplace. Two of the trends that I observe out there are open science and the evolving scholarly record. Open science has become significant over the past few years, partly because of the growing expectation that those who pay for research will have access to the products of the work they fund. We see a greater expectation of being able to reproduce the findings of research. We want to make researchers accountable for the results that they present to the scientific community. We know that the internet has democratised many aspects of our lives as citizens, and there's no reason why science should be any different. And we know too that open science can increase the visibility and the impact of a university. It's a bit of a stretch to say that it will drive us up the world rankings, but it's certainly an important marker of a university's presence in the scientific world. Another product of the open science environment is the way in which the scholarly record has morphed from being focused primarily on the outcomes (think of the journal article as the outcome of a research project). Instead, we see many of the other artefacts of the research process as being amenable to dissemination, curation and repurposing, whether it's the research process and the protocols, the evidence, the community conversation, all taking place in a digital world, or the community review, reuse and repurposing of the outputs of research. All of that forms a whole that it's important for us, as research stewards, to manage. So as I think about how our libraries move forward, there's a bucket of work around our role as supporters of the student experience.
That might be about repurposing libraries to be more in line with 21st century learning activities and needs, or about helping students navigate the increasingly complex information landscape. We know there's a lot to be done there, but we know too that there's much to be done to support the world of open science and that increasingly distributed, evolving scholarly record. When I look at the commercial world, I see players like Elsevier becoming very adept at populating the research life cycle. With many of the tools that they have built or bought, they are trying to put together a one-stop shop for the researcher, for the research institution and, arguably, for the research funder. But libraries, in my view, have generally been less adventurous. Many libraries could probably start with a life cycle, point to the publication space, and talk about the collections that we make available and perhaps the open access institutional repositories that we provide. But we haven't necessarily been adept at covering a broader sweep of the life cycle. Remember my opening point: if we are to be successful, we need to integrate with the researcher workflow. This is the sort of stuff that the average researcher navigates every day. Increasingly they are turning to a variety of tools and services that they access individually because they are convenient. I'm not going to try to stop them, but we need to understand how we fit in if we are to be accountable to our institutions for curating and showcasing the scholarly work that they produce, and if we are to be accountable to the funders and others who place mandates around the disposition of data and publications emerging from the research they have funded. We spent quite a long time looking at a variety of solutions that might take us forward. We thought about building things in-house.
We road tested a number of commercial products, and earlier this year we announced a partnership with Digital Science in which we agreed to implement four of their main products: Symplectic Elements, to be the campus research information system; Figshare, to be our comprehensive repository for data, publications and anything else digital; Altmetric; and Dimensions. And we've begun to map out what our take on the research workflow will be, in a way that helps our campus community understand fairly simply what we are offering them. You can see from the slide that we are not just in partnership with Digital Science; I don't want this to be a situation where we put all of our eggs in one basket. But what we recognise is that by following this approach we can perhaps drive what is painted (I hope you can see it) on the world's most painted object, the Fence at Carnegie Mellon, which students decorate almost every night: impact. That is what we are trying to help our faculty achieve, and we believe that the most appropriate way we can help them do that today is by helping them showcase their work. And to take us on the journey: David.
Thank you, Keith. So I'll be talking now about how CMU has moved towards what we refer to as our comprehensive institutional repository, which is very pertinent, as Keith pointed out. These are topics that CMU has been dealing with for a number of years. But there was a publication that came out through this organization this past May that highlighted the idea of looking at strategies for institutional repositories and rethinking how institutions view the repository, both as a standalone entity and in terms of how it potentially integrates with the broader needs and capabilities of an institution. In that report, three institutional perspectives were covered, and all three are ones that we'd like to focus on today.
The first is that the repository needs to be thought of not just as a standalone entity but in terms of how it fits into the broader strategic initiatives of an institution, and, in doing so, how the institution uses the repository to showcase its work. Secondly, and this is something that has been discussed a great deal through this organization, there is a path for the institutional repository to be seen as more than just a platform: a repository can go from that to a bundle of services, or eventually to a bundle of related services. One of the things that we're looking at, though, is thinking of the repository both as a bundle of services and as a bundle of interconnected services. And the last point was a highlight from the members of the executive roundtable that one institution was using Figshare as its repository. So we're here today to talk about that process. But before we do so, we should note some of the historical context of repositories at CMU. Prior to the adoption of our new repository, repositories at CMU fell into three categories. There was our archives repository, built on ArchivalWare, which is now known as Knowvation. There was our traditional IR for publications, theses, dissertations, grey literature and technical reports, which was powered by Digital Commons from bepress. And we lacked a data repository. As some may know, Digital Commons can be used both as a repository platform and as a publishing platform. While CMU was very highly focused on the IR side of Digital Commons, there wasn't a lot of work being done on using it as a publishing platform. At the time, it really didn't fit the needs that we were hearing from our campus constituents, and what they really needed was a data repository solution.
So this is really what was driving our need to look at our current repository, and also at what else was in the space, be it open source or another vendor solution, that could help us fulfill that need. As Keith mentioned, we did an environmental scan of the space, in parallel with what was going on in the broader institution in looking at what was needed for research information management. And we went with our partnership with Digital Science, through Figshare. One of the things about that entire process is that it was not done solely by the libraries but involved many different units across the institution, all the way from the administration, with the president, provost and head of research, down to individual faculty members. This was something that we wanted to make sure we carried through the development of the repository, even to the simple thing of giving it a name. So we ran a contest on campus to name the repository, with a small prize, and you can see from some of the numbers that the coverage and involvement we were getting was pretty expansive. We felt this was an early way to develop ownership of the repository by campus. I know many of you work with repositories or work around them. You may have heard from faculty who are very unsure of what this repository is. They are not sure what's going on. You get contacted because they got an email because their materials are in the repository. This was a way to deal with those early issues and have the campus take ownership of what the repository was going to be. Which then led us, to continue with the Scottish theme, to KiltHub. This is our central comprehensive repository. It's powered by Figshare, so you'll see that it looks and feels very much like Figshare. And there's a point to that, which I'll come to in a minute.
And then the card on your right is one of the promotional materials that we've created to engage with campus, with this idea that the repository is there to weave the fabric of your research. We'll talk about why that phrase is important and how it has actually empowered faculty to think about what they may put in the repository. We've also done quite a lot of engagement around how to use the repository, creating guides, tutorials and informational pages, to make sure that we're engaging with the faculty, the students and the researchers so they can better use the repository for their needs. And that gets to the idea of how we close the gap between ease of use and what we would like to see as a picture-perfect deposit. How do we get between those two different concepts? When we talk about the repository and why our faculty should decide to use KiltHub over other resources, we have a few different reasons that we can point to: making things open, simplifying the research workflow, and ensuring that if they do make things open, they get the highest level of impact in return. We provide a DOI for everything that we publish in the repository so that they can track their citations and metrics. This goes along with getting credit for the work and then being able to comply with funders. We still see today that many funders are grappling with the idea of compliance and of using repositories to say where things should go. We see a lot of cases right now where, for publications, it is getting really fleshed out and decided where things should go. But data is another story. Some funders are suggesting the use of Figshare, which my colleague Ole will talk more about in a minute.
But one of the last things is this question we deal with of why faculty should use institutional repositories over the public repositories that are out there in cyberspace, and I think one of the main points is that we are here to help. The libraries can serve as a mediation point and a proxy assistant for the repository, and help with some of the overhead of making deposits and making items available. This leads to what comprises the repository team itself, which is actually a very dynamic team. You'll notice that there are individuals responsible for scholarly communications, research data management and other surrounding topics. But it is also very much based upon the liaison model, where a librarian serves as the conduit between the libraries, the specialists within the libraries, and the disciplinary faculty. So, to reiterate this idea of a comprehensive repository: it combines what we would call a traditional institutional repository with a data repository, and it can accommodate both research data and what we refer to as scholarly outputs. These two categories form the warp and weft of our repository. Again, this is how we're weaving the fabric of research: by trying to accommodate everything one may produce during the research lifecycle. Another point made in the CNI executive roundtable report was this notion of the enterprise repository. Most of the time, when content ends up in a repository, that is a terminal point. Once it's been published, once it's ready for dissemination, that's when it goes to the repository. But there's this other idea of thinking of the repository as the collaboration point where the research can be developed and maintained. Now, we know there are other solutions that institutions or researchers may use, but how could the repository be used for this additional activity?
This may include having a collaboration space, or having a project housed in the repository. There are all sorts of concerns that we have to think about here, such as security and storage allocations. This is something that we're still trying to grapple with, but it gets at the idea that if I can narrow the gap between where things are created and where they're disseminated, I can try to ensure that more content is made available through the repository. This also raises the question of what types of integrations may be possible outside of the repository. One such integration that I'd like to highlight is the one that Figshare has with GitHub, which we know many researchers today are using for software and code. The benefit of this connection is that it is a true integration between GitHub and Figshare: the researcher can authenticate their two accounts to move content from GitHub over into Figshare, allowing for version control and using Figshare as a place to publish as well as preserve the materials that the researcher may be producing within their Git repository. The next of these types of integrations is how a repository integrates with a research information management system, or CRIS. There are a lot of different repositories that talk in different ways with CRISs, and there are many different types of CRIS. So this gives you an idea of the repositories that you may see and the most common CRISs that are out there today. What I'm going to be talking about specifically, though, is the connection between our Figshare for Institutions repository and Symplectic Elements.
Now I should point out that while both of these are within the Digital Science portfolio, Figshare is actually the fourth integration for Symplectic Elements. Preceding it were connections for DSpace and EPrints, and then a data source harvesting connection from Digital Commons to Elements. So there is a wide variety of connections that can happen between repositories and CRISs. And this connection is not a one-way flow but a cyclical flow of information from one to the other, for various reasons and for different activities. The first direction is from the repository to the RIM: finding what content has already been made openly available and harvesting that information to match with publication records in the CRIS. That way, from the perspective of the CRIS, you can look at a publication record and verify whether it has already been made openly available and, if so, in which repository. Then, from the RIM to the IR, we have the ability to make actual deposits. For those deposits we are able to use an additional API, from SHERPA/RoMEO, to verify which version of a publication we can add to the repository and to inform the user of that, while also providing local institutional context. So we don't necessarily have to present the raw SHERPA/RoMEO information, but rather the library's interpretation of that information. Any additional information that we may require for deposit can then go over to the repository and be part of our curation profile for the submission process. With these two things connected together, we're able to monitor open access, to see what has been made open access and what has not. One point I do want to stress is that the repository is not a compliance component.
We are not using it to police what has been made open access or to have any kind of power over that. This is just making faculty aware of what they have in the repository and how it is reflected within their overall research and publication view. So, I'm now going to turn it over to my colleague, Ole, to talk more about interacting with faculty.
Good afternoon, everybody. My name is Ole Villadsen, and I'm a research liaison for cybersecurity and information systems at Carnegie Mellon University Libraries. I'm also a member of our steering committee for Digital Science, helping to implement Elements and KiltHub throughout the campus. I'm going to share with you some of the challenges and lessons learned that my colleagues and I have come across when working with the faculty and informing them about Elements and KiltHub. There are a lot of great reasons why faculty and researchers would use KiltHub for depositing their data sets and their publications. I'll draw your attention first to the two at the bottom: making it open and simplifying the research workflow. We can all agree in this room, based on our positions and where we currently work, that these are very valid and very important reasons why someone might want to use KiltHub. But we've found that the two at the top, compliance and discoverability, tend to resonate a little more with the faculty, especially depending on how they're viewing things at that particular time in their research lifecycle and their careers. On compliance, the word is getting out. A lot of researchers are now well aware of the mandates from the federal government and from publishers about where they need to deposit their data. On discoverability, Figshare, and by extension KiltHub, has a very large footprint on the internet, and Figshare itself acknowledges that 60% of the traffic that comes to Figshare comes from Google. I'm going to talk about why that's really important in just a moment.
For funder and publisher compliance, here are just a couple of examples of where Figshare is mentioned. One is a government agency, NOAA, whose requirements for data deposits list it, along with some other repositories, as an acceptable place to deposit data. The other is PLOS, a publisher, pointing out that Figshare is also a place where data can be deposited for general-purpose studies that don't fall neatly into one of the domain repositories that they recommend. So why is discoverability through a general-purpose search engine important? These are a couple of studies done in the past few years that look at how researchers are finding data sets. They agree on one point: using a general web search engine, which essentially means Google, is one of the top three ways that researchers look for data, with the other two being domain repositories and checking relevant journals. That drives home the importance of making sure that data sets are discoverable through Google, as one of the top ways that researchers are going to look for data if they can't find it in their particular domain repository or through the journals that they read for their domain. So I've done a test here, where I've taken one of the deposits made into KiltHub over the summer. It's a study that included a data set and a number of other artifacts, looking at software ecosystems: culture and breaking change, a survey of values and practices in open source software ecosystems. I went to Google and put in four words that I think anybody looking for anything on that particular topic might use to conduct their search. And the study that's in our KiltHub repository came up as number one. I've done this with a number of other deposits made to Figshare, and they are very discoverable through Google. Now, I'm never sure what the secret sauce is that goes into Google's search engine, so I controlled for location using a VPN.
It thought I was in Kentucky even though I was on CMU's campus. And I used a private browser to control for search history. It still always comes up the same: very high in the search results. So I'm going to switch gears a little now and talk about some of the implementation challenges we have had with our research information management system, or RIM, or CRIS; I might use those terms interchangeably. The first is curating profiles. In Elements, the faculty's publications are drawn, or harvested, from a number of data sources such as Scopus, Web of Science and others. However, they're not always accurate out of the box when the first harvesting pulls are run from those sources, and that's for a number of what I would consider acceptable reasons. There are name variations at play, where researchers may be confused with other researchers out there in the academic ecosystem, and there are also incorrect linkages to identifiers. By that, I mean that the system may not associate a particular researcher with their Scopus ID, or may associate a researcher with the wrong Scopus ID. There may also be more than one Scopus ID for a researcher, as we've come to find, and the system may not be including them all or including the best one. So we've decided that we really need to go through and check all of the faculty publication profiles to make sure that they are reasonably accurate before we turn them over to the faculty members. That has required a lot of time and effort. Rough back-of-the-envelope math says it's probably between four and five hundred man-hours to go through and take at least a quick look at every faculty member and, for some faculty, to dive in much deeper to figure out why their results are not close to being accurate and what the problem might be. Second, we at CMU have a very decentralized campus.
I think the United States around the time of the Articles of Confederation, between the Revolutionary War and the adoption of the Constitution, is a not-bad example of the way things are at CMU. The colleges hold a lot of autonomy. They hold a lot of control over their decision making. And that has created challenges for us when implementing a common RIM or CRIS across the campus. For example, there are different expectations and needs for RIM among the different colleges. There are different annual review processes and forms that each of the colleges uses. And they have different current approaches to tracking their research output. They might be using different systems altogether from one college to the next, which makes things difficult: instead of going from one current system to one new system, we're going from multiple different systems to bringing everybody on board on one common system. And then finally there is technology integration. Taking full advantage of Elements and all of its capabilities would mean being able to harvest information from, for example, our current systems that hold grant information, systems that hold our teaching information and evaluations, and so forth. All of that can be harvested directly into Elements automatically, but there may be integration challenges depending on the type of technology that's currently in use. For example, if a system has been built internally, it might need some modification or a fair amount of work from developers to figure out how we can harvest that information and bring it directly into Elements. Now for some more lessons learned from integrating both KiltHub and Elements onto the campus. First is Google Scholar. I spoke about the great discoverability that we have with KiltHub through Google, but we have not found that to be the case with Google Scholar.
We were expecting better visibility for our research products in KiltHub on Google Scholar, but it's just not there yet. They're not findable through that particular platform, and that's under investigation. We're working with our partners at Digital Science and Figshare to determine why, and it's something that we hope to improve upon. The second lesson, and these are good problems to have, came when we rolled out KiltHub earlier this year in a soft launch. We got interest from some faculty in depositing their data into KiltHub, which was successfully done, but it also drove home the need to come up with a data submission and deposit requirements policy. That way it would be very clear how we were going to handle data deposits in the future and how, for example, we would negotiate a deposit based on the data being submitted and the kind of documentation we'd like to see, such as README files and data dictionaries. Another good problem we found we had was a researcher who was submitting a grant application and needed to prepare a DMP, a data management plan. The faculty member had heard about Figshare and KiltHub, knew that this was something that would help with the DMP, and wanted some language from us about how to include it in the data management plan. We were able to meet that need successfully, but it also drove home the need for some boilerplate language about Figshare and KiltHub for faculty members' DMPs, so that we could provide it to them very quickly. It's not uncommon for us to have only a 24-hour turnaround to look at a DMP before it needs to be submitted to the funder, so having that language ready, on hand and sent out to the liaisons was another lesson learned. Finally, there is the need to balance carrots and sticks when marketing the research information management system and the repository across campus.
This may come as a surprise, but not all faculty members are enthusiastic about the prospect of sitting down in front of a new system and populating it with all of their information, especially information which may be incorrect or which has not been harvested from an automatic source. In those cases, we found that it's very important to partner with the administration at all levels to make sure there's a clear understanding that we are there to implement these systems, that we're there to help the faculty members with these systems, but that we're not there to direct the faculty members to use them. That is within the domain of the administration. To use the analogy that one of my colleagues at CMU uses in this kind of situation: we are there to be H&R Block, we're not there to be the IRS. We really are there to help. With that, I'm going to turn it back over to David for concluding remarks.
Now that we've had a chance to show you what CMU has been working on for the past year or two, we'll talk about next steps and the future expansion of the ecosystem at CMU. Our immediate next steps are to continue our rollout and engagement of KiltHub. We're doing this through our liaisons, department by department, in many cases trying to roll out to entire colleges when appropriate. Part of this rollout is also how we engage with our faculty and students who are using figshare.com. One of the things that we have noted is that we have a number of faculty and graduate students who have used figshare.com to house things that we would very much like to have in the institutional repository version of Figshare. One of the things that we've been able to do is work with the vendor and identify with the user what content would meet our collection development policy and would be applicable for the repository.
That content is then migrated from figshare.com into KiltHub without requiring the user to make the deposit themselves or leaving duplicates in both the .com and the institutional version. We're also now developing our use cases for our deployment of Elements. These use cases include using Elements as a supporting mechanism for faculty profiles and various documents: CVs, biosketches for grant applications, and other types of documentation. That includes documents that may support the annual review and reporting processes, as well as any kind of support that the system can provide for the promotion, review and tenure process. One of the things we're focusing on for future expansion of the ecosystem is completing the research life cycle support loop. Keith mentioned some of the areas we're supporting right now and what we've already ventured into, but those are by no means the only areas where we need to be able to provide assistance and support throughout the life cycle. Some of the areas we're now looking to expand into are how we can support activities such as electronic lab notebooks, protocols, and collaborative writing platforms. Again, that would allow researchers to continue what they're already doing in many of these systems, but with institutional support to do so. So with that, thank you very much for sitting here and listening to our presentation. There are some resources here to give you more information about what we've been doing, as well as access to our repository, KiltHub. So thank you very much.