Okay, welcome to the webinar. I'm Natasha Simons from the Australian National Data Service (ANDS). I'm based in Brisbane, and I'm here at QUT with QUT's very own Paula Callan, who's the Scholarly Communications Librarian here. First of all, communication during the webinar. Your microphones will be muted during the presentation to minimize noise. If you have a question, please type it into the question box and we'll have a look at questions at the end of both presentations. This webinar is being recorded, and we'll email you when the recording is available from the ANDS YouTube channel. If you'd like to tweet during the webinar, please feel free to use the hashtag #ANDSdata in your tweets. ANDS is a federally funded organization that's working to transform Australia's research data environment, to make Australian research data collections more valuable by managing, connecting, enabling discovery and supporting the reuse of data. If you haven't seen it before, please take a look at the ANDS flagship service, Research Data Australia. RDA is an online service designed to provide connections between research data, projects, researchers and institutions, and to promote the visibility of Australia's research data collections in search engines. Many of you will know about ANDS and Research Data Australia; however, some won't, and it's mentioned here because it will be referred to during today's webinar. This webinar is jointly sponsored by ANDS and the Council of Australian University Librarians, or CAUL, and it's the first of a series of jointly sponsored ANDS-CAUL webinars to explore issues of common interest. Today's topic is joining the dots: connecting publications, grants, data and other scholarly outputs.
A more connected and integrated scholarly record facilitates discovery of research, provides a more complete view of research activities and assists the transparency and reproducibility of research. In this webinar we have two speakers. Paula Callan will discuss the guide to tagging institutional repository records related to ARC and NHMRC grants, which outlines strategies to link research publications with grants, and then Monica Omodei from ANDS will discuss the linking of publications and grants with research data and other scholarly outputs. So without further ado I'll hand over to Paula.
For those of you who don't know me, I'm the Scholarly Communications Librarian at QUT, and I've been in this role since 2003, although it has had various titles along the way. I have oversight of QUT's open access publications repository, QUT ePrints, and as Scholarly Communications Librarian it's my role to help researchers with a variety of issues related to their publishing. Of course, included in that is complying with the ARC and NHMRC open access mandates. In this presentation I'm going to look very briefly at the nuts and bolts of the ARC and NHMRC policies on open access and then discuss the implications for researchers and institutions. Then I'll look at a couple of guides to tagging grant-related records, including the CAUL guide that I was involved in putting together. That was over a year ago now, so some things have probably changed slightly, and I was really pleased to see that Trove, the National Library people, have put together a guide to the technical side of tagging the grant-related publications in our repositories so that they can be harvested effectively by the National Library. And then I've been asked to talk briefly about some of the funder-mandated publications issues that came up at Open Repositories a couple of weeks ago in Helsinki. I was there, so I'll mention a couple of issues from that.
Starting with the ARC and NHMRC policies, and looking at the rationale for both: it basically aligns with the government's stated reason for investing in research in the first place, which is to support its role in improving the well-being of our society. To do that, they realize that they need to disseminate the research results to the wider community, so it's not just a conversation happening between researchers. It needs to get out there to the wider community and to researchers who are outside of the loop. In a nutshell, the policy is that the publication details, so that's the metadata, must be submitted to the institutional repository as soon as possible after the paper has been accepted for publication, and the manuscript must be submitted to the repository as soon as possible after the publication date. An open access version of the paper must be available within 12 months of the publication date, or as soon as possible after that date. If an open access copy will never be available, then this must be stated in the final report, with an explanation of why it was not possible. There are slight differences, of course, between the ARC and NHMRC policies. The ARC policy applies to grants that were awarded after the first of January 2013, but it applies to all publication types, not just journal articles. The NHMRC policy applies just to journal publications, but it applies to all journal publications published after the first of July 2012, regardless of the start date of the grant. Now, when we look at who's responsible, it says it's the chief investigator who is responsible for providing the metadata and the manuscript to the institutional repository. So it's not our responsibility to go and get it; they should be providing it.
But it is the administering institution that's responsible for compliance, so it's our responsibility to facilitate that, perhaps reminding them and helping them with it, but we have to make sure the researchers understand that they share this responsibility. Now, I was looking at the information about the timing of deposit on the NHMRC website, because the wording in the policy implies that the metadata should be deposited in the repository on acceptance. In fact, when we look at the guide for authors on the NHMRC website, there is a little bit of wiggle room in terms of the precise timing: there they say that the metadata for all journal articles must be submitted to your institutional repository immediately upon publication. So while the ideal would be that it's deposited on acceptance, I think we can accept that it may not be exactly at that date; it may be as near to that date as we can get it. Now, the implication for researchers is that they really need to plan for open access right from the beginning. It's too late when they've already published the article in a journal that's not going to support open access, so they need to think about it upfront. That actually means at the application stage now, because there is a section in the application called communication of results, where they are to outline plans for communicating the research results to other researchers and the broader community. This should be flagging for them that it is an issue they need to deal with, and so they need to consider their options at this stage so that they can put it in the application. They also need to acknowledge the grant on the publication, so when they're submitting the manuscript to a journal they need to give the name of the funder and the grant ID.
Some publishers are actually offering a taxonomy, a controlled vocabulary of funder names, but if they don't, we should be encouraging researchers to use a standard form of the funder's name. FundRef is a taxonomy that's readily available. It's been created by the Crossref people, so researchers can go in there, type in a funder name, and it will give them the standard form of that name to use when acknowledging the grant on the publication. Another obligation for the researchers is to provide the institutional repository with the publication details and the grant information, and also a copy of the accepted manuscript version unless the paper was published in an open access source. Now, some of the practical implications for the administering institutions are that we need to develop an alerting workflow so that we can identify the grant-related publications, because it's actually not that easy. It's easy to get a list of the grants, and we have a list of publications, but this is where we're trying to join those dots: trying to identify which publications are grant related, and to which grant, because many of our researchers have multiple grants and some of the publications apply to one and not another. And of course the institutional repository needs to have a metadata field to actually store the grant ID. We may also need to review our policies to make sure that they support all the activities related to compliance. By that I mean: does your institution have a policy that requires researchers to engage with this process? The institution could also consider supporting open access publishing in some form, whether that's from library funds, institutional funds or grant funds; maybe there are things we can do to facilitate that. And then we also need to be checking the final reports for compliance.
I think we also have a responsibility to communicate the policy requirements and the options for compliance, so that researchers know what they need to do and what their options are. The Australian Open Access Support Group has created some really useful graphics that we can use as part of that communication plan, to send out perhaps to researchers when we hear they've got a new grant. At the application stage, perhaps we can provide some assistance, such as generic text that they could customize for that section on communicating to the wider community, and possibly even identify a list of relevant publications in their area that would be compliant with the grant rules. We may also start getting questions from researchers about how to answer the questions in the research grant management system. A lot of the compliance is built into that management system when they're submitting the final report. They have to tick some checkboxes, and they probably need to understand what the question is and which answer is right for what they have done. I'll move on now to the guides to tagging grant-related records. The link to the CAUL guide is on the CAUL website, and the URL for this page is on the screen here. If you haven't seen it, I think this presentation will be available afterwards and you'll be able to get the link from there. Or just Google CAUL, and you should be able to navigate your way to this page. I'm not going to go through it in detail; I'll just mention some of the main points. The main points are that in the technical preparation we need to make sure that the name of the funder and the grant ID are actually in the metadata, and that this is output in the form of a PURL, so a persistent URL. The example is here on the screen. It can be done in a number of ways, and I'll mention that in a moment.
But this is basically what it should look like on the repository record. This is an ARC grant. With the NHMRC, often they don't have other characters at the beginning of the number; it's just a numerical ID. And importantly, the information must be stored in the dc:relation field. That is to enable it to be harvested by aggregators such as the National Library and other search services. At QUT we've created a couple of fields, and I've got those on the screen here: a field for the funding body and then a field for the grant ID. At the moment this information is entered manually, but it's also possible to do this as a lookup, using the lookup widget created by ANDS. They've got a research grant API where people can search for the grant, select it from the list, and it will go into the metadata field. If we'd known about that when we were doing this, we probably would have gone down that route, and we still may do in the future. The PURL actually links to the Research Data Australia activity record for the grant. ANDS have very helpfully created pages for all of the ARC and NHMRC grants, so when we create the PURL it will resolve to this page, and that gives people a lot more information about the grant. It also means we don't have to store that information in our repositories. Ideally, the chief investigator will provide this information when the publication details are deposited in the repository. If the work's been published in an open access journal, then they only need to supply the publication details, because we can just link to the open access copy. However, many repositories do choose to store an open access copy in the repository as well, just as a backup in case the journal disappears. Moving on to the new guide to tagging created by the National Library, on the Trove Help Centre web page. This is a technical guide.
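As an aside for readers of the transcript, the tagging described above (a funder and grant ID output as a PURL and stored in dc:relation) can be sketched roughly as follows. The slide showing the real pattern is not captured here, so the `PURL_BASE` value is an assumption, the function names are hypothetical, and the grant IDs in the usage note are invented; the CAUL and Trove guides remain the authority on the exact form.

```python
from xml.sax.saxutils import escape

# Assumed PURL base; the pattern shown on the slide is not in the transcript,
# so check the CAUL and Trove guides before relying on this value.
PURL_BASE = "http://purl.org/au-research/grants"

SUPPORTED_FUNDERS = {"arc", "nhmrc"}


def grant_purl(funder: str, grant_id: str) -> str:
    """Build a persistent URL for a grant from a funder code and a grant ID."""
    funder_code = funder.strip().lower()
    if funder_code not in SUPPORTED_FUNDERS:
        raise ValueError(f"unsupported funder: {funder!r}")
    return f"{PURL_BASE}/{funder_code}/{grant_id.strip()}"


def dc_relation_elements(grants):
    """Return one <dc:relation> element per (funder, grant_id) pair.

    dc:relation is repeatable, so a record with several grants simply
    gets several elements.
    """
    return [
        f"<dc:relation>{escape(grant_purl(funder, grant_id))}</dc:relation>"
        for funder, grant_id in grants
    ]
```

For example, `dc_relation_elements([("ARC", "DP0000001"), ("NHMRC", "0000002")])` yields two elements, one per grant, which is the situation a repository record with multiple grants would be in.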
So it talks about how to store the metadata in the dc:relation field. The link to the guide is on the screen now, and this is an excerpt from the guide. There is a section on ARC grants and another on NHMRC grants, and it talks about the form in which the grant needs to appear in the metadata. The reason dc:relation was chosen is that it's a repeatable field. You can have a number of dc:relation fields in your metadata record, and that can accommodate multiple grants and other things that you may be using that field for. The other useful resource that Trove has created is the Profiler. Again, it's available from the Trove Help Centre, and you can set filters. In this example here, one filter is NHMRC, so this is "show me all records that are tagged with the NHMRC tag", then limit it to your institution, and then display. It will show how many records from your institution Trove has been able to identify with the tag, and it graphs it across the years. That's a really useful visualization you can use when communicating with, perhaps, your office of research about what needs to be done. Moving on now to the papers presented at Open Repositories that were related to funder mandates. The first one I'll mention is a paper by Peter Millington from SHERPA, and his paper was called Making SHERPA FACT Local. But what I want to talk about is actually SHERPA FACT itself. FACT is an acronym for Funders and Authors Compliance Tool, and it's been developed to help UK researchers check whether or not a particular journal is compliant with their funder. You'll see that in the record all the main UK research councils are represented, so researchers can tick the appropriate one and then do the search. It first of all gives the details of the options when publishing if you have a grant from that particular funder, and it tells them whether or not the journal is compliant.
So in this case the funder is the Natural Environment Research Council, and it gives them a summary of the requirements of that funder. Now, what Peter was actually talking about was their new API, which is a tool to enable institutions to customize the SHERPA FACT resource so they can embed it on their web pages, customized with local policies as well, and also give it local branding. That's not particularly relevant in our situation, because it doesn't mean customizing it to the extent of including the ARC and NHMRC policies. However, CAUL is currently discussing with SHERPA the possibility of including ARC and NHMRC, and possibly other major Australian funders, in the FACT tool. So watch this space; it could be really, really useful. Another paper was by Europe PubMed Central, and they were talking about how they get articles funded by the Wellcome Trust into Europe PubMed Central. It's by a variety of means. Sometimes the publishers make all of their content available, so that's the open access journals. Then there are the hybrid journals that make individual articles open access, and then some authors self-archive directly to Europe PubMed Central. Sometimes the publishers are actually supplying the accepted manuscripts to PubMed Central, but the authors need to sign off on the deposit. They also spoke about Repository Junction, which is a broker that will push the publisher-supplied accepted manuscripts to the relevant institutional repositories. It's only at the planning stage at this point, but it does sound like a useful development. There was also a workshop called Dealing with Funder Mandates: Practical Support for Repository Practitioners, by Dominic Tate from the University of Edinburgh. To be honest, there wasn't really a lot in there for Australia, because it mainly focused on the UK Research Councils and talked about their requirements and what the UK institutions are doing.
But what was interesting is that Dominic mentioned the compliance expectations of the RCUK funders, and you can see that they're not expecting even 75% compliance until 2017 or 2018. So that was reassuring for us. Then there were a number of papers on ORCID, and a lot of the rationale that institutions were using for implementing ORCID included reporting requirements to funders. They were saying that ORCID will be the glue that joins all the services together. And finally, this is my version of joining the dots. I looked at which of the dots we are joining now. Our institutional repository records are discoverable via Google, so there's a dotted line to the repository record and also to our staff profile pages, which are themselves linked to the repository record. The repository record links to the accepted manuscript version, which hopefully will be open access, but also to the published version, which may only be available to subscribers. Then FundRef, which is again from the Crossref people, has a search interface where you can search by research funder, and it will list all the publications that the publishers have identified as having that funder's acknowledgement on the paper. So there is a link there. The PURL in the record will link to the Trove, sorry, will link to the ANDS record for the grant, and it is also being harvested by Trove. Our records in some cases are linked to our data repository, where there is a data repository record for the data set, and in some cases also to our Software Finder, which is a software repository for research-related software. These of course are harvested by Research Data Australia. So I can see that we have a plan for joining the dots. The biggest dots we need to join now are all of those publications related to grants that we don't know about, and identifying them. That's where I shall finish up now and hand back to Natasha.
Thank you, Paula. I'm going to hand over to the ANDS Canberra office.
For those of you who don't know me, I work for ANDS, in my current role only since the start of this year, but I had worked for ANDS for some time in the past, mostly around the infrastructure and architecture, and particularly in relation to supporting connections. So things like party identifiers and grant identifiers, etc. So this is all coming back to haunt me. In this presentation I want to look at some of the missing links we have in relation to the context of research outputs, and how we might address that, at least partially, through better identification of research funders and research grants. Just a couple of illustrative examples to get us thinking about it. Here's an open access journal article that's based on a longitudinal study that's been running for about 20 years. It was funded from multiple sources, including the NHMRC. However, to know this, you actually have to read the article. There's nothing, as you see here, in the metadata that says any of this. Inside the article, the content says this work was supported by the National Health and Medical Research Council, the Channel 7 Children's Research Foundation, the National Heart Foundation, and a Northern Territory Government Research Innovation Grant. This particular entry is from the Charles Darwin University institutional repository, but of course the article is also published by the publisher, in a BioMed Central journal, with similar metadata. In fact, the NHMRC grant that partially funded it does exist, does have a persistent identifier, and does resolve to an entry in Research Data Australia, which gives a lot more detail about this research grant. But the article doesn't link to it. The metadata could have included this identifier, as Paula has already identified.
And then one could link through to the research grant, and the discovery services could even pull into the display, for example, the title of the research grant or some other fields about it, pulled by an API into the description of the article. Now, ANDS at the moment only has research grants in its registry from the ARC and NHMRC, but we are now pushing on to try and gather grant information from a lot more funders, so that people can link to research grants from other funders. Additionally, for this particular article, if you look through it, it's obvious that the conclusions are drawn from a significant and no doubt costly data collection process over several years, which could be of value to other studies and even other disciplines. But there's no reference in the article metadata to the data set as a research output in its own right, and it appears the data has not been published. It could have been, but I couldn't discover it, published with a full description of the collection methods, data structure, access guidelines, licensing, etc. A Google search returns a lot of references to this longitudinal study, with contact information, but absolutely no statement about availability. So these are the missing links that we need to address. Another, slightly different example, from the University of Queensland institutional repository, does have a grant identifier in its metadata, but it's a local identifier and, of course, doesn't link through or resolve to anything. A Google search with the title of this article pulled up a PDF document from the National Breast Cancer Foundation which lists all their research grant awards. This one here, which I've highlighted, is obviously the grant referred to by the ID CG-12-07 in the article metadata.
So what we need is a globally unique identifier for funders, such as the National Breast Cancer Foundation, and also for the individual grants that those funders award, if they can be made available through some sort of central aggregation. Then the article metadata could have linked to a grant identifier which resolves to the details about this research award. That's what ANDS hopes to work towards, in collaboration with Australian funders, with Crossref, with publishers, and with international partners in the Research Data Alliance. And finally, another type of missing link: we see here a data collection published in Research Data Australia. It has good metadata, but no link to any other material which might provide even more information about the data, how it was collected, details about individual data fields, et cetera. The description mentions that it was funded by the NHMRC, but there's no link in the connections area to the actual research grant, which is in RDA. So even within RDA, a data set doesn't link to the research grant, because the data harvested from the institutions doesn't contain that link. They don't have that link in their institutional data repositories in all cases. But another Google search found an article which obviously does describe this data set. It would be really nice if this article had been linked to in the data set metadata, in a related information field. Then if people wanted to know more about the data, if they wanted to reuse it, they could have gone straight to this article and found out a lot more about the data. There are a lot of problems to be addressed here, and different ways of doing it. What I'm going to talk about is just funder identification and grant identification, because that will partially solve some of the problems. Journals invariably require authors to acknowledge funding when submitting an article for publication.
In some cases, a pick list of funders is provided in the submission form, usually from the FundRef database API, and authors are also asked to supply the grant ID. However, in most cases the metadata field for funding information is a free-text input box, both for the funder and for the grant ID. We see that some publishers or journals do provide a drop-down for the funder, so they use a funder taxonomy, but even there the grant ID is just a free-text input box. Usually authors use the correct ID, but there's no way for them to check that it is correct. If a global database of funders is supported by the publishing, research and library sectors, then article and data set metadata can refer to a standard global identifier for the funder, available via lookup widgets when the author is submitting their article. Searches for articles and data sets can then be filtered by funder, as they would all be using the same funder taxonomy; reporting to funders on research output would be easier; and more information about the funder could be a click away when looking at the article or data set. Moving to grant identification: a globally unique persistent identifier for a research grant, which resolves to details about the grant, is valuable context for the publication or data set being described. A grant ID which is just free text and not resolvable will not provide this information in a single click, or enable discovery services to access an API or web service so that details can be included in their own discovery displays. Another benefit is that if publications and data sets refer to the same research grant identifier, then connections can be inferred even when not explicitly stated, but the metadata repositories for both must use the same form of the grant identifier. This will also enable reporting of outputs against specific research grants.
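To make the last point concrete for readers of the transcript, here is a minimal sketch of inferring connections between publications and data sets through a shared grant identifier. The record shape (`id` and `grants` keys) and the function name are assumptions made for illustration, and the identifiers in the usage note are invented. Matching is deliberately exact, to show why both repositories have to use the same form of the identifier.

```python
from collections import defaultdict


def infer_connections(publications, datasets):
    """Group publications and data sets that cite the same grant identifier.

    Each record is assumed to be a dict with an "id" and a list of "grants".
    Identifiers are compared as exact strings (after trimming whitespace),
    so a free-text variant of a grant number will not match its PURL.
    Only grants cited by at least one publication and one data set are kept.
    """
    by_grant = defaultdict(lambda: {"publications": [], "datasets": []})
    for kind, records in (("publications", publications), ("datasets", datasets)):
        for record in records:
            for grant in record.get("grants", []):
                by_grant[grant.strip()][kind].append(record["id"])
    return {
        grant: dict(linked)
        for grant, linked in by_grant.items()
        if linked["publications"] and linked["datasets"]
    }
```

With one publication and one data set both tagged `http://purl.org/au-research/grants/arc/DP0000001`, the function links them; a second publication tagged with the free text `ARC DP0000001` is left unconnected, which is the failure mode described in the talk.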
Am I just talking blue sky here, or is this vision at all feasible? The good thing is that there's already a starting point, in that FundRef, a project of Crossref, provides a standard way to report funding sources for published scholarly research. It has a standard taxonomy of more than 6,000 funder names, with some limited information about each funder. Some journals provide a submission process for authors which has this controlled list of funders to select from, and then they're prompted to enter the grant ID as free text. This funding information is then made publicly available through Crossref's own search interfaces, through their API and through the CrossMark service. The Crossref API can be used both to control the input of funder names in a user interface for submission, and to display those funder names and grant numbers in the discovery services used by institutional repositories, discipline repositories, research management systems, etc. As Paula also, I think, had on a slide, if you use the FundRef search you can put in an Australian funder name, and not just the Australian Research Council; there are many other Australian funders in the taxonomy, 357 of them. A search on Australian Research Council will provide those hits, and you can see there's such a small number of articles that it's obvious only a very small percentage have actually identified the Australian Research Council as funder using the controlled list. Many, many others will have that in their acknowledgements, but they can't be drawn out through the search interface. Paula mentioned having an alert service, which is very useful for institutions and libraries, to be alerted when there is a publication that's been funded by a grant. This is actually possible, because Crossref does have an incremental harvest service for publications, which we could use to build an alert service.
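As a rough illustration of the alerting idea, the sketch below filters already-harvested publication records by funder on the institution's side. It does not call any Crossref service: the record shape (`doi` plus a `funders` list of dicts with an `id`) is hypothetical, as are the function names, and the funder identifiers in the usage note are placeholders rather than real registry entries.

```python
def funded_by(record, funder_id):
    """Return True if the record lists the given funder identifier."""
    return any(f.get("id") == funder_id for f in record.get("funders", []))


def new_alerts(harvested, funder_id, already_seen):
    """Return DOIs of newly harvested records funded by funder_id.

    `harvested` is an iterable of record dicts from an incremental harvest;
    `already_seen` is a set of DOIs alerted on previously and is updated
    in place so that repeated harvests do not raise duplicate alerts.
    """
    alerts = []
    for record in harvested:
        doi = record["doi"]
        if doi not in already_seen and funded_by(record, funder_id):
            alerts.append(doi)
            already_seen.add(doi)
    return alerts
```

A library could run something like `new_alerts(batch, "10.13039/000000000001", seen)` after each harvest and pass the resulting DOIs to whoever follows up with the chief investigator. Records whose funding appears only in free-text acknowledgements would still be missed.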
The issue at the moment is that there's currently no filter to subset that harvesting process by funder, but we've requested that ability, so watch that space, because it is feasible to set up such an alert service. The CrossMark identification service, which is another Crossref service, is meant to send a signal to researchers that publishers are committed to maintaining their content. When you follow a DOI for a published article and they've used the CrossMark information, this pops up, and you can see that if the metadata contains funding information, it's displayed in the record tab. So here you see echoed what's been submitted to Crossref within the article metadata when requesting a DOI. But for the grant numbers, as you see, we just have numbers, because that's all it is, free text. The current FundRef database isn't complete, but the coverage is quite extensive, and they have processes available to match funders against the database in order to add more funders. But we need a coordinated national approach, and the ANDS registry, which already has the infrastructure to record parties related to research as well as established machine services, is probably a good candidate home. It would need to integrate with the FundRef database and record the FundRef identifier. The identifier that FundRef assigns to each funder is actually a DOI, and if everyone uses this identifier for the funder, then we have an integrated global system. It's possible to download the complete FundRef database from the website, and I did so, and wrote a script to extract all the Australian funders into a spreadsheet, which I've made available. We're also liaising with our counterparts in the UK, Europe, the US, Canada, etc. to ensure we have consensus on this, because we don't want to go ahead and use the FundRef database if other people decide that they need to set up a different scheme. What can we do about grant information?
It's a little bit more difficult, because gathering the data is probably going to be a hard slog. FundRef doesn't support a grants database currently, so we've initiated discussions with Crossref and with selected publishers to investigate their future plans and their interest in also having lookup services for grant IDs in their submission interfaces. For example, they could already use the ANDS research grant API to supply a controlled list of grant IDs if the funder is the ARC or the NHMRC. This would lead to improved cleanliness of grant numbers, and would also mean that the proper identifier is included so that it can be resolved to a detailed information display. This detailed information would then be included in Crossref services, and hence in all the discovery services that we have for publications and data. On grants databases: we only fund a small fraction of research internationally, and there are some international initiatives, which I've listed, that are also aiming to do, or have already done, what ANDS has done in having a grants database. But we now need to aggregate data from various funders in Australia into our registry, and that's probably the hard work. And we need publishers to work with participating journals to enable the controlled submission of funder and grant information. They need to encourage the journals, they need to pass that information to Crossref, and they need to enhance their own discovery and display services to include that information. There's a list back there of the publishers participating in FundRef at the moment. There's also a need to influence the manuscript tracking system vendors, because they're the ones who have the user interfaces where researchers submit their manuscripts for review. It's there, in the submission guidelines and in the systems, that we need to provide an easy way for researchers to enter this information by lookup.
And the same is true for libraries and institutions: their institutional repository systems also need to provide easy user interfaces for entering this information. So this is just a quick example of how it might look, a UI that I mocked up, where instead of the free-text acknowledgement box you have something like "select a funder", and there's a drop-down. If you know the grant ID, type it in and then click look up to make sure that it's right. If you don't know the ID, type in a few keywords and click search. With search, you put in a query with some words, you get the results, you click the one that's the grant, and you link it to the publication. Similarly, you just type in the grant ID, click look up ID, and it echoes back what that grant refers to, so you can check it's the right one. And I've just left at the end some links that you might want to follow.
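For readers of the transcript, the two paths in the mocked-up form described above (look up a known grant ID, or search by keywords) can be sketched as follows. This is not the ANDS research grant API: the grant records are invented, and the function names and record fields are hypothetical stand-ins for whatever a real lookup service returns.

```python
from typing import Optional

# Hypothetical grant records; the IDs and titles are invented for illustration.
GRANTS = [
    {"funder": "ARC", "id": "DP0000001", "title": "Coastal erosion modelling"},
    {"funder": "ARC", "id": "DP0000002", "title": "Indigenous language archives"},
    {"funder": "NHMRC", "id": "0000003", "title": "Longitudinal child health study"},
]


def lookup_grant(grants, funder: str, grant_id: str) -> Optional[dict]:
    """The 'look up' button: return the grant with this ID, or None if unknown."""
    wanted = grant_id.strip().upper()
    for grant in grants:
        if grant["funder"] == funder and grant["id"].upper() == wanted:
            return grant
    return None


def search_grants(grants, funder: str, keywords: str) -> list:
    """The 'search' button: grants from the funder whose title has every keyword."""
    terms = [term.lower() for term in keywords.split()]
    return [
        grant
        for grant in grants
        if grant["funder"] == funder
        and all(term in grant["title"].lower() for term in terms)
    ]
```

In the form, a successful `lookup_grant` would echo the grant title back so the author can confirm it is the right one, while `search_grants` would populate the result list from which the author picks the grant to link to the publication.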