Welcome everyone. It's Karen Vissi speaking here. Thank you all very much for coming to this ORCID webinar. I'll be presenting on behalf of Natasha Simons, who wasn't able to make it at the last minute today. Today's webinar is sponsored by ANDS, the Australian National Data Service, and CAUL, the Council of Australian University Librarians, because this is something of joint interest to both organisations.
So a bit more about ORCID. I'm guessing that since you're here most of you know a little bit about ORCID, but I'll just fill in the gaps for a couple of people who may not be aware of the background. ORCID aims to address the problem of unique identification of researchers by giving individuals a globally unique identification number that lasts over time. Some of the world's largest publishers, funders and institutions have adopted ORCID, and community uptake has dramatically increased over the past year. On the 29th of July this year, ANDS and CAUL co-sponsored a national roundtable on ORCID to reflect on the progress of ORCID internationally, share international perspectives on ORCID, discuss institutional views on ORCID adoption and facilitate practical measures that can meet institutional needs. The presentations for that workshop are on the ANDS website, which you can see there on the third link.
Today we've got three wonderful presenters who will share their reflections on the roundtable and also present their institution's perspective on the ORCID opportunity. Our presenters today are Heather Todd from the University of Queensland, Nathaniel Lewis from the University of Sydney and Simon Porter from the University of Melbourne. Ryan Flarty from the University of Auckland is also ill today, must be a bad day, but his presentation is up on the ANDS website. So our first presenter for today is Heather.
Heather Todd is Director of Scholarly Publishing and Digitisation Service at the University of Queensland Library, at the St Lucia campus in Brisbane, which is where we're talking to you from today. I'm just going to hand over to Heather now so that Heather can start her presentation.
Okay, well thank you for joining me. It's quite strange for me to present this without actually seeing faces in the room. But anyway, we'll see how we go. So I'm going to talk about the ORCID experience at the University of Queensland. We've been very interested in ORCID for a while, and one of our staff members has been appointed as an ORCID ambassador, which basically allows her to keep up to date and receive quite a lot of information from ORCID. And we do promote ORCID whenever possible, for instance in training classes and discussions with researchers. And if anybody emails us and asks a question about what affiliation they should put on their paper, we always take the opportunity to tell them, you know, whatever you do, don't forget to register for an ORCID. And we've just signed up for an institutional subscription because we plan to roll out ORCIDs through the organisation. The library manages the institutional repository, which we call UQ eSpace, and we do have an identity management aspect of that, but it's rather static because we haven't made use of the ORCID APIs. Users can manage their different identities in a preferences page, which I'll show you in a screen or so. So what we've done is we've allowed academics to put in their online identities, and then with a single URL, they can have all of those identities linked. So I'll just show you a couple of pages. This screenshot here is the preferences page on our repository, and you have to be logged in to do this. And you can see that there's an opportunity to manually enter four author identities: People Australia, Scopus, ORCID, and Google Scholar.
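Manually entered identifiers are easy to mistype. The following is a minimal sketch, not something described in the presentation, of how an ORCID iD could be checked at the point of entry. It relies on the fact that the last character of an ORCID iD is a check digit computed with ISO 7064 MOD 11-2; the function names are illustrative assumptions.

```python
import re

# 16 characters in four hyphen-separated groups; the last may be a digit or X.
ORCID_PATTERN = re.compile(r"\d{4}-\d{4}-\d{4}-\d{3}[\dX]")


def orcid_check_digit(base_digits: str) -> str:
    """Compute the ISO 7064 MOD 11-2 check character for the first 15 digits."""
    total = 0
    for ch in base_digits:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


def is_valid_orcid(orcid: str) -> bool:
    """Return True if the string is well formed and its check digit matches."""
    if not ORCID_PATTERN.fullmatch(orcid):
        return False
    digits = orcid.replace("-", "")
    return orcid_check_digit(digits[:15]) == digits[15]
```

A check like this only catches typing errors. It says nothing about whether the iD belongs to the person entering it, which is what the authenticated ORCID APIs are for.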
You might say, where's ResearcherID? Well, we have a different implementation for that: we automatically add the ResearcherID to staff members' profiles. So once staff have actually entered their author identifiers, they can log in, and if they go to their My Publications, you can see in the top right-hand corner for this researcher that they've got all of their online identities listed. If they click on any of them, they go to their ResearcherID account or their ORCID account. So we've made that possible, and we've got just over 200 ORCIDs in the system at the moment.
One of the things we were asked to talk about was our business drivers, and as I said before, the library does host the institutional repository and we're heavily involved in the HERDC and ERA programs. So therefore, you know, it's essential we have good author identification. And just to give you an idea, UQ publishes about 10,000 publications a year. So it is important that we assign the right paper to the right academic. We gather our data through Web of Science, the Scopus API and ResearcherID, and we do know there's some under-reporting. When we implemented the Scopus API last year, we were quite surprised how many new records we got, because although we knew there was under-reporting, we didn't know how much. And we also think it's best practice. We've got a lot of support from the Office of the Deputy Vice-Chancellor of Research, and they're behind the project, but the library will lead the project. And the university, like many other universities, has various policies on author affiliation on all of the papers. With our workflows, all new academic staff are encouraged to have a ResearcherID or to link their existing account to UQ, and we encourage them to clean up their Scopus identities and provide information on how to do that.
Our ResearcherID program is currently managed by administrators only, but we'd like to move to an author-managed process in the future. For instance, we've just gone out and asked a lot of staff for some information for ERA, and we've had over 100 requests from people to link their ResearcherIDs. So the nuance around the implementation is that UQ researchers, probably like every other researcher, are really resistant to admin work. So we're going to have to try and make it as easy for them as possible. And we do appreciate that, while we know the library is very important, researchers are more likely to take note of the project if they're directed to by the DVC Research or the VC. So there are some things we've got to take into account, and as part of our communication strategy, we want to get the message out from the highest level possible within the university. And then because this involves IT work, we need to prioritise the development against the work to support ERA and HERDC. So our plans for the future are to have an application where UQ academics can manage their online identity really easily. And we just again want to promote the eSpace URL, where it's possible for UQ academics to have a single URL that links to their online identities, and even to add that to their email signature for easy access to their identities. So that's UQ's experience with ORCID. Thank you.
So over to you, Nathaniel, thank you very much.
Thank you. Hello all. My name is Nathaniel Lewis. I'm director of research reporting, analysis, data and systems at the University of Sydney. Our group manages the HERDC and ERA submissions for the University of Sydney. Among other things, we manage all the research information of our researchers, contracts and grants, and maintain the information systems that hold all that data.
Part of the work that we do is to look at other technological initiatives such as ORCID and see what sort of value they could add to, I guess, the management of research information. So today we've been asked to talk a little bit about how we've engaged with ORCID. What are some of the business drivers? What is the fit to services, what are some integration challenges, and are there any sectoral issues we might want to raise?
So for the University, we've identified ORCID as a good opportunity. It's going to involve work across the research portfolio, the library, our ICT and ANDS, of course. It's been presented to our senior executive group and the research committee. So all the deans and the ADRs at the University are well aware of ORCID, and now it's up to us to think about how we implement ORCID across Sydney. We joined ORCID as an institutional member in June this year, so we're now at the planning stage and considering the policy implications for ORCID that will lead to some of the implementations as well. Part of those implementation challenges is around integrating it with our internal systems. We use DSpace as our main repository for our ERA and HERDC submissions. We've got a software platform called ERMA, which is our main repository of research information. We also need to integrate with HR and ICT going forward if we're looking at identity management.
In terms of the alignment to our research strategy, we see ORCID as another tool to enhance data quality, and the accuracy and consistency of information. It allows us to do some triangulation between Scopus and Thomson and various funding bodies as well. And I think one of the biggest challenges we face is around mapping affiliations of researchers. ORCID presents us with an opportunity to enforce persistent identifiers. We've got some researchers with many identifiers; I think there's one specific example where they've got over 200 identifiers in Scopus.
So this gives us an opportunity to minimise those different researcher IDs and bring some automation to that process. Again, it's about managing duplicates across the different systems and mapping those affiliations to publications, grants and data. It can also help us with our open access compliance, so we can report against the ORCID implementation. For us it's about maximising the performance of the university in terms of its research performance, and helping with reporting, whether that's government compliance reporting or rankings as well. It's about, again, data quality, the accuracy of that data and how well the university is represented.
In terms of the fit to services, it's going to impact across a few things, and this is in relation to policy, procedures and workflows. We need to think about how it fits with any existing governance arrangements at the university, how it's going to be formulated and how we're going to implement it. We are looking at creating an ORCID identifier for all Sydney researchers, so we need to think about how we're going to populate and maintain those records going forward. We see ORCID as supplementing the current Scopus and Thomson publication sweeps that we do, to fill in the gaps for the under-reporting piece, and we also see ORCID as an opportunity to help manage those non-citation-based disciplines that aren't necessarily as well represented in the major publication and indexing houses.
In terms of the technical and policy considerations, we are looking to see how we can minimise researcher burden. With respect to that administration issue, we don't want to force another identifier on them. We want to make it as seamless as possible. It should be happening behind the scenes, so that's a primary issue for us to consider. There are technical constraints on the ORCID record population policy.
We'd like to be able to populate all records in ORCID on behalf of researchers, but there are some issues around ORCID and how that can be set up. That's more around the philosophy of ORCID, that the researcher owns the record, which is good. But if the researcher owns the record and doesn't want to administer it, how do we go about doing that? So that's something we've got to work through. We do know that the information we collect on behalf of researchers on their research outputs is quality assured. We do the checking and the verification, we make sure we've got a record, an actual publication and a file, and that's populated into a research information system. So we want to leverage that to make use of ORCID and make it an effective tool to use. There are some questions around the capacity and service levels of ORCID, but at the roundtable we had the CEO of ORCID there, and they reassured everyone they've got plenty of resources and can scale up their service if people do come online in a hurry. We've got, I think, at least 25,000 researcher records we need to manage and over 150,000 publications, so we need to make sure that if we are populating these ORCID records, it can cope.
There are also some concerns around privacy and the location of the service in the US. Advice from our own legal counsel is that it's not such an issue because it's publicly available information. That would be dependent on your local provisions. There will be the usual issues around integrating with the existing ICT and HR systems. How far we want to integrate ORCID into those systems is the next question. Do we make it something that everybody signs up for when they join the university? How do we go about linking the existing ORCID iDs to their university identifiers? And then how do we implement this at the local faculty level?
At the university there are 16 different faculties, and they are the ones that run their own show. So we need to provide the advice that enables them to perform an implementation of ORCID at a local level, and work with them to do so. In terms of the future of ORCID, we see it as providing a consistent approach for research outputs and records management. I'd like to see it as a good supporting tool for researcher mobility. It saves us having to re-enter all their research information again and again. If they've got an ORCID iD and it's well maintained, that will make transferring their research outcomes and outputs into and across universities easier. We would like to see it as a tool for identifying collaboration opportunities and strategic recruitment, and for better alignment to the Scopus and Thomson databases. And in terms of sustainability, I think publisher support for ORCID is key. Thank you. I'm now going to pass over to Simon Porter.
So I guess my presentation is somewhat similar to Heather's because it's focusing on how ORCID primarily facilitates publications collection. That's not only for HERDC: as Heather's indicated, increasingly it's about universities being able to effectively represent the online presence of their researchers, with comprehensive representations of everything that they do. So I want to put ORCID in the context of how we've gone about collecting publications over the last 14 years or so. From 2001 almost all the way through to 2008, it really was a manual exercise of keying in publications. Along the way, it's become more of an enterprise in harvesting publications from Scopus and Web of Science.
To the point now where ORCID is involved: it is now not just about harvesting publications, but how we refine the publications that we harvest so that we can limit the noise that we get. So that's pretty much the trajectory that I want to talk about.
From an Australian perspective, this is, I guess, fairly obvious, but it's worth pointing out that publication collection is a multi-million-dollar enterprise. This graph that you see on the screen is basically all the publications that we've collected for all the departments in the University of Melbourne. The different colours are the different types of publications. So that sort of purpley colour is journal articles. The light blueish colour that you can see across the top of the graph is conference papers. You can see about halfway down there's a chart that looks nothing like the other ones. That's the VCA in Performing Arts, so they're doing lots of different sorts of publications. The purpley colour at the bottom is books and book chapters. So you can see that over 14 years we've collected over 150,000 data points of publications. Each of those data points took at least 15 minutes to enter and involved review by multiple different people. The amount of time that it takes to do all those things quickly adds up. So the context for ORCID, in which reducing the amount of work it takes to collect these things matters, is really quite a serious enterprise.
So I've already talked about the evolution of how we collect the data. For the University of Melbourne, probably up until about 2009, the only way we collected that data was through manual entry. So either it was researchers keying in their publications or publications coordinators doing so on their behalf. We probably came quite late to this, but we got wise to the fact that we could harvest publications from a source like Web of Science or Scopus and then refine and process them. And we started doing that with just Web of Science in 2010.
We got to the end of that process and quickly realised that whilst it was effective doing it for Web of Science, we'd then have to turn around and do it for Scopus, and then for RePEc, and then for arXiv and PubMed and every other source that came along, which was really unsustainable. So in about 2012 we implemented Symplectic, which enabled us to harvest from multiple sources and basically helped us build combined records. So we have a publication record that can say, here's the publication, and here's the representation of that publication in Web of Science and Scopus and so on. So over time, the publications interaction has moved from data entry, or emailing your publications coordinator, to: we think we've found these publications, can you please confirm whether we've got it right?
So this is a screenshot from our UAT instance of Symplectic, and you can see that if Jim McCluskey were to log in, he'd see that he has 32 journal articles that he needs to claim, and he gets a screen somewhat like this. So he can quickly go down and tick: yes, this is mine; no, that's not. That works well, provided we don't offer up too many false positives to Jim or other researchers. We don't want to create a situation where a researcher has to wade through hundreds of publications to find the 10 that are actually theirs. The way Symplectic works is that you go in and say, okay, for this researcher, these are their search terms and these are their organisation affiliations, and it goes off and searches each of the interfaces to try and find publications that match those search terms. So obviously how well you tune those search terms determines how many false positives come up.
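To make the false-positive problem concrete, here is a minimal sketch of matching harvested records against a researcher's search terms. It is an illustration under stated assumptions, not Symplectic's data model or API: the classes `SearchProfile` and `HarvestedRecord`, the matching rule (a name variant and an affiliation must both match) and the sample data are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class SearchProfile:
    """Hypothetical per-researcher search settings."""
    name_variants: set[str]
    affiliations: set[str]


@dataclass
class HarvestedRecord:
    """A simplified publication record as an index might return it."""
    record_id: str
    author_names: list[str]
    affiliations: list[str]


def candidates(records, profile):
    """Return records where a name variant AND an affiliation both match."""
    names = {n.lower() for n in profile.name_variants}
    affils = {a.lower() for a in profile.affiliations}
    hits = []
    for rec in records:
        name_hit = any(a.lower() in names for a in rec.author_names)
        affil_hit = any(
            known in listed.lower()
            for listed in rec.affiliations
            for known in affils
        )
        if name_hit and affil_hit:
            hits.append(rec)
    return hits


def false_positive_rate(hits, confirmed_ids):
    """Share of offered records that the researcher would have to reject."""
    if not hits:
        return 0.0
    wrong = [h for h in hits if h.record_id not in confirmed_ids]
    return len(wrong) / len(hits)


if __name__ == "__main__":
    profile = SearchProfile({"Wang, L."}, {"university of melbourne"})
    records = [
        HarvestedRecord("r1", ["Wang, L."], ["University of Melbourne"]),
        HarvestedRecord("r2", ["Wang, L."], ["The University of Melbourne"]),
        HarvestedRecord("r3", ["Wang, L."], ["Peking University"]),
        HarvestedRecord("r4", ["Jones, A."], ["University of Melbourne"]),
    ]
    hits = candidates(records, profile)
    # Suppose only r1 really belongs to this researcher.
    print([h.record_id for h in hits], false_positive_rate(hits, {"r1"}))
```

In the sample data, the affiliation filter removes the namesake at another institution, but a namesake at the same institution still gets through. No amount of tuning name and affiliation terms separates those two people.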
Because we've been collecting publications for 14 years, and because we've had Symplectic running in the background for a number of years, we know which publications belong to our researchers for the previous 14 years, along with their equivalent Web of Science record or Scopus record. Because we know that, we can query those records to find out the actual search terms that they used on those records and the actual organisational affiliations that they used. So before we roll out Symplectic to our researchers, we can pre-populate all of those search terms, and that's helped us reduce the number of false positives: for probably about 80% of researchers, pursuing this strategy reduced the number of false positives that appeared. So along the x-axis you'll see people who used to have a lot of publications pending approval; there's a data point at 1,500 that's been reduced to zero. But on the other side, you'll see people who used to have fewer publications pending in Symplectic and now have lots. And basically this is because configuring search terms for researchers can only take you so far: some people have such common names that no matter how far you configure the search terms, there really just isn't a way of providing searches that bring up only their own results. And to give you an indication of who they are, if we look at the top 20 or so researchers who are getting lots and lots of false positives, we can see that predominantly they have Asian surnames of some sort, and really there is no approach or help for these people in terms of configuring search terms. Some researchers just need ORCIDs. So as part of phase two of our Symplectic rollout we'll be targeting these researchers for ORCID iDs first, because it's these researchers for whom having an ORCID iD can help the most.
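The pre-population step described above, deriving search terms from records already confirmed as a researcher's own, could be sketched roughly as follows. This is an assumption-laden illustration rather than the University of Melbourne's actual process: the record keys `name_as_published` and `affiliation_as_published` and the function `derive_search_terms` are hypothetical.

```python
from collections import Counter


def derive_search_terms(confirmed_records, max_terms=3):
    """Collect the most frequent name forms and affiliations.

    Each record is a dict with the hypothetical keys
    "name_as_published" and "affiliation_as_published".
    """
    names = Counter(r["name_as_published"].strip() for r in confirmed_records)
    affils = Counter(
        r["affiliation_as_published"].strip() for r in confirmed_records
    )
    return {
        "name_variants": [n for n, _ in names.most_common(max_terms)],
        "affiliations": [a for a, _ in affils.most_common(max_terms)],
    }


if __name__ == "__main__":
    confirmed = [
        {"name_as_published": "Nguyen, T.",
         "affiliation_as_published": "Univ Melbourne"},
        {"name_as_published": "Nguyen, T.",
         "affiliation_as_published": "Univ Melbourne"},
        {"name_as_published": "Nguyen, T. H.",
         "affiliation_as_published": "University of Melbourne"},
    ]
    print(derive_search_terms(confirmed))
```

Terms derived this way narrow the search for researchers whose published name forms are distinctive. For very common names they remain ambiguous, which is the gap a persistent identifier is meant to close.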
And our strategy for doing that will again be to use Symplectic, and to get a researcher to go in and configure their ORCID through this sort of interface.
So that's one use of ORCIDs for our researchers, but the real reason we started getting interested in ORCIDs was not for our researchers but actually for our graduate students. With our graduate students we've got a problem of trying to work out what happens to what they have done after they leave the university. After they've left the university we've got no recourse to say, hey, we think you've done this publication, could you please confirm it. So we need to know what they have published based on the research that they conducted before they left. Can we claim that in HERDC? We also really want to know where they have gone and what their academic career looks like three years after they left the university, for instance. So for these reasons we've now got a university policy which requires graduate students to have an ORCID, and we will be managing this process through Symplectic. Our idea is to encourage graduate students not just to have ORCIDs but to actually use that ORCID as an active part of their research career. So it's not just having one; it's actually owning it and ensuring that all of their outputs are going to be connected up to it. Once we've done that, we've got a mechanism to glue our knowledge about who our graduate students were to an evolving data set of what they're doing in the open world.
On our approach for graduate students: unlike some other ORCID implementations for graduate students, we won't be minting ORCIDs for our researchers. The reason for that is we feel that although there are methods where you can create ORCIDs on behalf of your graduate students or all of your researchers, the risk is that you end up with ORCIDs that have been created for graduate students which they don't own.
So you think you've created an ORCID for them, but basically they've just ignored the email that came through from ORCID once it was created. There's an ORCID out there for them, but the first time they go to use an ORCID they'll just create a new one because they've completely forgotten that that process has happened. So the risk of creating unknown ORCIDs is too high for us to consider minting them for our graduate students. We will be emailing our graduate students just like ORCID does, but we will be asking them to go into Symplectic and configure their ORCID in their Symplectic account. The process within Symplectic is quite straightforward for this. If users don't have an ORCID they can get one as part of the process of configuring their ORCID in Symplectic, and basically it's maybe one or two clicks more than doing it through a minting process. But the end result, if we can get students to do that, is that we now have a process where we can track those students who have ORCIDs in our system. We also know that those students have actually undergone some activity which indicates that they might have a better chance of owning that ORCID, and we've now got a solid practice to follow up by email with those who haven't engaged with the system. So we can really track how successfully that flows.
That's really the ORCID story, and I guess these are reflections on the trajectory of this. We've talked about moving from data entry to data glue. We've talked about how we harvest publications, and how we need to use ORCIDs to glue our knowledge of graduate students to evolving information in the world about them. But I guess the reflection is that it's not just publications that we want to glue back to our data sets. Increasingly it's grants, not least because we've now got NHMRC requirements to track the publications that belong to those grants and ensure that we've got open access publications for them.
It's also research data, and it's also potentially academic history for researchers, and we really feel that ORCIDs are key pieces of the puzzle to help us do this in the future.
We were also asked to reflect on the ORCID workshop, and I guess the biggest takeaway I had from the workshop is that there are multiple levels of ORCID subscription: you can subscribe as an institution at various different levels, and you can also subscribe as a nation, which offers discounts. I got the sense from the room that one of the things we should ask CAUL about is whether we can pursue a national subscription through the CAUL membership. So that's just my final thought, and that's my presentation.
So, looking forward to seeing you at our next ANDS webinar, and thank you very much.