Hello everyone and welcome to today's webinar. Today's webinar is the first in a series looking at persistent identifiers, PIDs for short, that we've called the PIDs Short Bites webinar series. The first one today is on DOIs to support citation of grey literature. The second one is on identifying and linking physical samples with data using the International Geo Sample Number. And the third one is on linking data and publications, and that's on the Scholix international initiative. So today's webinar is on DOIs to support citation of grey literature. I'm Natasha Simons from ANDS and I'm going to start with a brief introduction to persistent identifiers. I'm then going to hand over to my excellent colleague Dr. Daniel Bangert, the Senior Data Librarian in Library Repository Services at the University of New South Wales.
Okay, first of all, what's the problem that persistent identifiers are trying to address? Well, I'm sure everyone will be familiar with this particular problem: you click on a web link and it takes you either to a 404 page not found error like this one, or to content that's not actually related to the link that you clicked. Both of these things usually happen because the web resource has been moved to another location and you have the old link. The page not found error is frustrating, and in the context of research it's disastrous, because it means that a scholarly resource which may have been cited cannot be found, verified, potentially cited again and so forth. This is the problem which persistent identifiers are trying to address. A persistent identifier is simply a long-lasting reference to a digital resource. Even if the resource moves location on the web, the persistent identifier is there to make sure the link always resolves.
So if a PID is used as a citation link in scholarly literature, it will always resolve to information about the resource: either a descriptive metadata page, the resource itself, or information about the removal of the resource from the web. PIDs are key to facilitating the discovery of scholarly resources and play a key role in linking scholarly resources, for example publications and data, as well as tracking the impact of these resources. It's important to note that PIDs do not guarantee a link will never be broken, but what they do is create a framework which helps to keep it working.
PIDs have evolved quite a lot over the last 20 or so years. This slide is taken from Jonathan Clark's presentation at the THOR webinar last week, and he notes that now we have identifiers for people as well, and we want to know what persistence means and how long the PID will last. Data has grown, so there is as much value in retrieving the metadata as in retrieving the object itself. And that object may no longer be digital, because you can refer to digital information on a physical object, which is a big growth area, and we're looking at that in the IGSN webinar coming up. Last but not least, we want our machines to be able to interpret PIDs. In this webinar series we hope to explore these topics in more detail.
Which PIDs apply to research data? That's a very good question. There are many different types of persistent identifiers that apply to research data. I've put on the screen some common examples that ANDS actively promotes or provides a service for, such as Handles for identifying data, DOIs for citing data and related materials, ORCIDs as identifiers for people, and really so many more. All of these persistent identifier schemes differ in some way. For example, they might have a different purpose. Some apply generally to all scholarly resource types. Some are discipline specific.
The underlying technology differs between persistent identifiers, as does the governance structure. For example, some are non-profit and some are company driven. They differ in the metadata that is collected, as some require more metadata than others, and also in the extent of use, so PIDs vary in their uptake. If you'd like to know more about persistent identifiers, there is a PID guide on the ANDS website, and there's a lot of information about our DOI and Handle services as well. There's also this Short Bites webinar series, and I highly recommend the THOR webinar series on PIDs. The first one happened last week and it was a general introductory one. That's been recorded and they are making the recording available by the end of this week. There are another two coming up in that series, so if you'd like to register you can click on those links. I'd like to finish now and hand over to my colleague, Dr. Daniel Bangert.
Okay, thank you Natasha. Thank you for joining this webinar, and thanks Natasha for the invite. Today I'll be talking through a service that we implemented at UNSW Library to support the citation of grey literature held in our repositories. This is based on a presentation that I gave late last year for the CAUL Research Repositories Community Days, and that presentation, the slides and video, can be found at the link on your screen. I'm going to briefly cover digital object identifiers and say a few words about what they are, then take you through the environmental scan that we did to design our service, and then some of the details of the UNSW DOI service, including the conditions around DOI assignment, the workflows that we're following and integration with ORCID identifiers, and then a few words in conclusion.
A DOI, a digital object identifier, is a type of PID that is optimized for scholarly resources. Importantly, it's the identifier that is digital; the object can be digital or physical.
DOIs are assigned to an object by the publisher or a long-term custodian, and the persistence of that identifier and the resource is managed by the organization and its policies. There are a few facets to a DOI. We can start with the DOI name itself, which is an alphanumeric string, and that can be converted to a URL by adding a DOI resolver like doi.org. When that URL is entered into a browser, it takes you to a landing page with human-readable metadata about the resource, about the object. Basic information about the resource is required to mint a DOI, and that metadata is both human and machine readable.
So why are DOIs important? They've emerged as a relatively simple but powerful piece of technical infrastructure for improving scholarly communication. They make it easier for outputs to be discovered and used by others, and to be cited and measured for impact. A useful way to think about DOIs is as a trusted identifier, which is a term introduced a few years ago by a project called ODIN, the ORCID and DataCite Interoperability Network. That's the predecessor project to THOR that Natasha mentioned at the beginning. This term captures a set of characteristics. Trusted identifiers are unique, so they're unique on a global scale. They resolve as HTTP URIs persistently. They're descriptive, so they come with metadata that describe their most relevant properties. For instance, there's a mandatory set of metadata elements like creators, title, publisher, publication year and resource type. Then you can add recommended or optional elements like alternate identifiers, subjects, dates, rights information, description and so on. And lastly, trusted identifiers are governed, so they're issued and managed by an organization that has a sustainable business model, which is usually a publisher or custodian. You can read more about trusted identifiers at the link below.
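To make the two mechanics just described concrete, here is a minimal Python sketch, not the UNSW or ANDS implementation. It shows how a DOI name becomes a URL by prefixing the doi.org resolver, and how a record could be checked against the mandatory metadata elements named above. The field names, the helper functions and the DOI `10.1234/example-report` are all hypothetical and chosen only for illustration.

```python
from urllib.parse import quote

# The public DOI resolver mentioned in the talk.
DOI_RESOLVER = "https://doi.org/"

# Mandatory elements as listed in the talk; these key names are assumptions.
MANDATORY_FIELDS = ("creators", "title", "publisher", "publication_year", "resource_type")


def doi_to_url(doi_name: str) -> str:
    """Turn a DOI name into a resolvable URL by prefixing the resolver."""
    # Percent-encode unusual characters but keep the prefix/suffix slash.
    return DOI_RESOLVER + quote(doi_name.strip(), safe="/")


def missing_mandatory(metadata: dict) -> list:
    """Return the mandatory fields that are absent or empty in a record."""
    return [name for name in MANDATORY_FIELDS if not metadata.get(name)]


if __name__ == "__main__":
    # Hypothetical record with two mandatory elements still missing.
    record = {"creators": ["A. Author"], "title": "An example report", "publisher": "UNSW"}
    print(doi_to_url("10.1234/example-report"))
    print(missing_mandatory(record))
```

A check like `missing_mandatory` is the kind of step that lets a service ask the requester for only the elements it does not already hold.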
When we were looking at designing a service, the impetus for this came from requests from academics. Most commonly they had grey literature, like a series of reports, that they wanted to assign DOIs to. We were also able to implement something based on the ANDS Cite My Data service. In April last year, 2016, that was extended to account for grey literature, so we were looking at the possibilities of implementing something here at UNSW. In preparation for an options paper, we looked at grey literature and DOI assignment in several repositories, whether institutional, disciplinary or national. We also looked at options for registration agencies and the resource types that we would cover.
A few things that we found that might be useful: one was a project conducted in the UK called Unlocking Thesis Data. It's a Jisc-funded project led by the universities of East London and Southampton as well as EThOS, the national thesis service at the British Library. They have a number of reports and case studies where they outline options for the workflow to assign DOIs to theses. Another idea that we eventually incorporated into our own service was from the University of Southampton, and they have a role called a trusted partner. That allows certain staff, academics or faculty administrators, to authorise their own DOIs or the DOIs of a research group. I'll come back to that idea later in the presentation.
In the latter half of last year we presented an options paper to the library and went ahead with a pilot, which involved a manual workflow for a certain resource type, reports, to start with. We had workflows for both library staff and trusted partners to mint DOIs. We then moved on to implement a web tool, which I'll show you later. At the link on your screen you can look at the DOIs minted by that service. I think now we have about 330 DOIs minted for grey literature. It was important at the outset to think about the conditions around DOI assignment.
And the first one is that the resource is deposited in a UNSW Library repository. Our institutional repository, called UNSWorks, holds a large amount of the grey literature created by UNSW staff, and resources in the repository are managed in accordance with the UNSWorks digital preservation policy. So for that repository we have governance, we have preservation procedures in place, and we're then able to assign the DOI. Then, if the resources move or the repository moves, we can make sure those DOIs continue to resolve. The second one is that it's an eligible resource type, so it needs to be within a certain set of grey literature that we've defined. There should be no existing DOI for the resource, as that defeats the purpose of a unique identifier, there should be no existing DOI request, and it needs to meet the mandatory metadata requirements set by the ANDS service, which links to DataCite.
In the user interface, as you'll see later, the requester is given this set of conditions, which they need to agree to before they submit. They need to agree that they're an author or creator of the resource, or have authorization from an author to request a DOI. The resource doesn't already have a DOI. They don't plan to mint a DOI using a different service. The resource is unpublished or published by UNSW. A library repository is the primary publication point for this resource, meaning that when people resolve the DOI they'll be taken to the repository page, the landing page. The resource is not subject to a permanent embargo, and the resource is not likely to change significantly. That's just flagging that anything that would be part of a citation shouldn't be changed, because a major change like that would require a new DOI.
These are the workflows that we're following. For all users, we allow them to request a DOI. They go into the tool and they select the repository where the resource is, most commonly UNSWorks. They search for their record and they select it.
The system checks if there's a DOI existing already or if there's an existing request. It also checks if the mandatory metadata is already held by the system. If not, they need to enter or confirm the metadata and then submit a request. The second part of this is for a DOI service administrator, which is currently library staff. They go back in and review any request that has been submitted. They check that it meets the conditions that we've already outlined. If it does, that's approved. It goes to the ANDS service, which mints the DOI, and the system emails the requester with the DOI. The administrator then updates the metadata, and that information is sent back into the repository and is displayed on the repository page.
Then there are the trusted partners. These are faculty administrators or researchers that we allow to mint DOIs directly. That needs to be approved by a relevant authority like the head of school or associate dean, and we give those people training and access to the tools that they need. They follow a very similar process, except that instead of requesting, they're able to mint the DOI directly. So they select their record, they mint, do the administration and the cycle is complete.
Back to the UNSWorks page, the institutional repository. If you then resolve the DOI, it takes you to that landing page, and the DOI itself is displayed in the record details as part of the metadata about the publication.
We are also aiming, where possible, to include ORCID iDs, so identifiers for researchers and contributors to research outputs, in the DOI metadata. The way we do that at UNSW is through our research output system. Users can link their ORCID profile within that system. That ORCID iD is then pushed to the repository if they deposit full text. Then they can go back into the DOI service, select that record and submit a request, and that DOI gets put back into the research output system. There is then an update.
So both the DOI and the ORCID iD go into the repository, and both of those can be exposed via external harvesters like Trove and aggregators, as well as through the ORCID profile. Because of the connection between ORCID and DataCite, the DOI can easily be claimed through DataCite and added to the user's ORCID profile.
I have a short video here which I'll take you through. This is showing an early release, so this is the version that was available at the end of last year. There have been some minor modifications since then, but it gives you a sense of what the service looks like and how those workflows actually look in practice. There's a login screen that uses the usual credentials. We'll start with the request DOI workflow. Here the repository can be selected, and there's a search box to pull in the information from the repository. The user selects the record and they're given a preview of the metadata. What we show here is the mandatory metadata for assigning a DOI, and if any of that is missing, they're given the opportunity to add it. They see those conditions for requesting and they submit the request. There's a confirmation message, and they also have a list of their requests and can see the status, whether that's pending or whether the DOI has been minted or declined.
The next step is for the DOI service administrator to log into the system. They have access to a tab called Review where they can see the pending requests. They can then review each request and the metadata. Based on that information, they can either decline or mint. If they choose to decline, there's an option to send a personalized message and to follow up with the requester. If they mint, then that request goes to the ANDS service and comes back with the DOI, and this is emailed back to the requester, so they're given the DOI immediately. The administrator then updates the metadata in the system.
You can see that the DOI is active immediately, and that's the end of that process. The last part to show you is the mint function for the trusted partners. This just means that people who have high volumes of publications, grey literature that they need to assign DOIs to or an ongoing series, are given the option of doing that directly and having responsibility for the whole process. The procedure is much the same. They can search for the record. They select the record, review the metadata and then complete the minting process immediately. Okay, that's a repetition of before, so I'll just skip through the rest of this.
Okay, so in conclusion, the UNSW DOI service was designed to meet existing and future use cases, so it's flexible and scalable with future cases in mind. A priority for us was ease of use, so we're reusing metadata where possible. Anything that we hold in the repository that we need for the DOI metadata, we reuse, and that metadata is reviewed. It integrates with existing workflows, for instance with the research output system and with the repository itself, and it connects with other PIDs and platforms like ORCID. There are conditions set around it, so we ensure that the identifier is governed correctly, that the resources remain persistent, and that the link can continue to be resolved and be a citable, enduring part of the scholarly record. That's handled by preservation policies, by the reviewing process, and by the ability to track our DOI requests.
Okay, thank you very much. There are a couple of links at the end of the slide there, to both the slides and video. If you're interested in the software itself, we've made the code available, and both of those, of course, have DOIs to access.
Thank you very much, Jan. Daniel, that was a brilliant presentation.
So I suppose what your presentation shows is that there's quite a bit of thinking involved in assigning DOIs to grey literature, in terms of what they should be assigned to and how it should be done. Can you just tell us how you got that thinking process started at UNSW?
Sure. Primarily, it was based around the infrastructure we already have in place and the policies that already govern our repository material. We knew there were conditions about what was in the repository and what we could govern, so that was a starting point. Then we wanted to cater for the greatest number of use cases and make the most impact, so we wanted to start with resource types that were requested by the community. Negotiating the actual conditions was partly based on the ANDS guidelines, so making sure they actually fit in with what ANDS requires, the agreement that we have with ANDS, and also what DataCite considers best practice. From there, it was basically a process of testing those conditions and whether they could be worked into the existing workflows and implemented efficiently.
Okay, thanks Daniel. There are a couple of questions here. A question from Gillian Elliott is: what happens to the handle associated with the thesis used in this example?
The handle will still resolve and can be used as it usually would. The way we've implemented the DOIs is by resolving to the handle. There are a number of different ways to do that, but for us, it's best if the resolving URL for the DOI is the handle itself. That ensures that if we ever migrate, the handles would be migrated as well, and therefore the DOIs.
Okay, thank you. A question from Julie Gardner is: are you able to determine how often these DOIs have been cited?
I believe there is some event tracking, or data around events, being implemented by agencies like Crossref and DataCite. One way to do it is that we also have Altmetric implemented at the institution, so mentions that include the DOI can be tracked.
So that's one advantage. And then through DataCite itself, I think that would be the best way to have a look at what kinds of events are happening with particular DOIs.
Okay, thanks very much. Well, that's the end of our questions. I'd like to thank Daniel very much for his presentation today. Thank you all. Bye.
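As a recap of the request, review and mint workflow described in the presentation, here is a minimal Python sketch of the state changes involved. It is an illustration under stated assumptions, not the UNSW tool's code: the class and function names are hypothetical, and `fake_mint` merely stands in for the call to the external minting service, returning a made-up DOI.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Optional


class Status(Enum):
    PENDING = "pending"
    MINTED = "minted"
    DECLINED = "declined"


@dataclass
class Request:
    record_id: str
    requester: str
    trusted_partner: bool = False
    status: Status = Status.PENDING
    doi: Optional[str] = None
    message: str = ""


def _mint(request: Request, mint: Callable[[str], str]) -> Request:
    # Hand the record to the minting service and store the DOI it returns.
    request.doi = mint(request.record_id)
    request.status = Status.MINTED
    return request


def submit(request: Request, mint: Callable[[str], str]) -> Request:
    """Trusted partners mint directly; other users wait for review."""
    if request.trusted_partner:
        return _mint(request, mint)
    return request


def review(request: Request, approve: bool,
           mint: Callable[[str], str], message: str = "") -> Request:
    """Administrator step: mint an approved request or decline with a message."""
    if request.status is not Status.PENDING:
        raise ValueError("only pending requests can be reviewed")
    if approve:
        return _mint(request, mint)
    request.status = Status.DECLINED
    request.message = message
    return request


def fake_mint(record_id: str) -> str:
    # Hypothetical stand-in for the external service; the prefix is made up.
    return f"10.1234/{record_id}"
```

Passing the minting call in as a function keeps the sketch independent of any particular service, which is a design choice made here for illustration only.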