So I guess my presentation is somewhat similar to Heather's, because it focuses on how ORCIDs primarily facilitate publications collection. That's collecting publications not only for HERDC but, as Heather has indicated, increasingly so that universities can effectively represent the online presence of their researchers, with comprehensive representations of everything that they do.
So I want to put ORCID in the context of how we've gone about collecting publications over the last 14 years or so. From 2001 almost all the way through to 2008 it really was a manual exercise of keying in publications. Along the way it's become more of an enterprise in harvesting publications from Scopus and Web of Science, to the point now, and this is where ORCID has come into it, where it's not just about harvesting publications but about how we refine the publications that we harvest so that we can limit the noise that we get. So that's pretty much the trajectory that I want to talk about.
From an Australian perspective this is, I guess, fairly obvious, but it's worth pointing out that publication collection is a multi-million dollar enterprise. This graph that you see on the screen is basically all the publications that we've collected, by department, for all the departments in the University of Melbourne. The different colors are the different types of publications: that sort of purpley color is journal articles, and the light bluish color that you can see across the top of the graph is conference papers. About halfway down there's a chart that looks nothing like the other ones. That's the VCA and performing arts, so they're doing lots of different sorts of publications. The purpley color at the bottom is books and book chapters. So you can see the scale over 14 years.
We've collected over 150,000 data points of publications. Each of those data points took at least 15 minutes to enter and involved review by multiple different people, and the amount of time it takes to do all those things quickly adds up. So reducing the amount of work it takes to collect these things is really quite a serious enterprise, and that's the context in which ORCID comes in.
So I've already started to talk about the evolution of how we collect the data. For the University of Melbourne, probably up until about 2009, the only way we collected that data was through manual entry: either researchers keying in their publications, or publications coordinators doing it on their behalf. We probably came quite late to this, but we got wise that we could harvest publications from a source like Web of Science or Scopus and then refine and process them. We started doing that with just Web of Science in 2010. We got to the end of that process and quickly realized that, whilst it was effective for Web of Science, we'd then have to turn around and do it for Scopus, then for RePEc, then for arXiv and PubMed and every other source that came along, which was really unsustainable. So in about 2012 we implemented Symplectic, which enabled us to harvest from multiple sources and basically helps us build combined records. So we have a publication record that says: here's the publication, and here's the representation of that publication in Web of Science, in Scopus and so on.
So over time, the interaction around publications has moved from data entry, or emailing your publications coordinator, to "We think we've found these publications; can you please confirm whether we've got it right?"
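As an illustration of the combined-record idea described above, here is a minimal Python sketch. It assumes records from different sources can be matched on a DOI; the talk does not say how Symplectic actually matches records, and the source labels, field names and sample values are all hypothetical.

```python
from collections import defaultdict

# Hypothetical harvested records; field names and values are illustrative only.
harvested = [
    {"source": "wos", "doi": "10.1000/example.1", "title": "A Study of X"},
    {"source": "scopus", "doi": "10.1000/EXAMPLE.1", "title": "A study of X"},
    {"source": "pubmed", "doi": "10.1000/example.2", "title": "Another Paper"},
]


def combine(records):
    """Group per-source records under one combined publication.

    Assumption: the DOI is used as the matching key. DOIs are
    case-insensitive, so the key is lower-cased before grouping.
    """
    combined = defaultdict(dict)
    for rec in records:
        key = rec["doi"].strip().lower()
        combined[key][rec["source"]] = rec
    return dict(combined)


publications = combine(harvested)
for doi, sources in publications.items():
    # One line per publication, listing the sources that describe it.
    print(doi, sorted(sources))
```

Each combined publication keeps every source's own representation, so a researcher confirms one publication rather than one record per source.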
So this is a screenshot from our UAT instance of Symplectic. You can see that if Jim McCluskey were to log in, he'd see that he has 32 journal articles that he needs to claim, and he gets a screen somewhat like this, so he can quickly go down and tick "Yes, this is mine" or "No, that's not". That works well provided we don't offer up too many false positives to Jim or other researchers. We don't want to create a situation where a researcher has to wade through hundreds of publications to find the 10 that are actually theirs.
The way Symplectic works is that you go in and say, for this researcher, these are their search terms and these are their organizational affiliations, and it goes off and searches each of the interfaces to try and find publications that match those search terms. So obviously how well you tune those search terms determines how many false positives come up. Because we've been collecting publications for 14 years, and because we've had Symplectic running in the background for a number of years, we know which publications belong to our researchers for the previous 10 to 14 years, and their equivalent Web of Science or Scopus records. Because we know that, we can query those records to find the actual search terms that they used on those records and the actual organizational affiliations that they used. So before we roll out Symplectic to our researchers, we can pre-populate all of those search terms in their records, and that's helped us to reduce the number of false positives. For probably about 80% of the researchers, pursuing this strategy worked.
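The pre-population step described above can be sketched roughly as follows. This is a hypothetical illustration, not Symplectic's API: the `author_name` and `affiliation` fields, the function name and the sample records are assumptions made for the example.

```python
def derive_search_settings(confirmed_records):
    """Collect the distinct name variants and affiliations that appear on
    source records already confirmed as belonging to one researcher."""
    names = set()
    affiliations = set()
    for rec in confirmed_records:
        names.add(rec["author_name"].strip())
        affiliations.add(rec["affiliation"].strip())
    return {
        "name_variants": sorted(names),
        "affiliations": sorted(affiliations),
    }


# Hypothetical confirmed records for a single researcher.
confirmed = [
    {"author_name": "Smith, J.", "affiliation": "University of Melbourne"},
    {"author_name": "Smith, J. A.", "affiliation": "University of Melbourne"},
    {"author_name": "Smith, J.", "affiliation": "Univ Melbourne"},
]

settings = derive_search_settings(confirmed)
print(settings)
```

Seeding the search configuration with only the name forms and affiliations a researcher has actually published under narrows the search, which is the mechanism by which fewer false positives would be expected.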
We were able to reduce the number of false positives that appeared for researchers, for some people. So along the x-axis you'll see people who used to have a lot of publications pending approval; there's a data point at 1,500 that's been reduced to zero. But on the other side you'll see people who used to have fewer publications pending in Symplectic and now have lots. Basically, this is because configuring search terms for researchers can only take you so far. Some people have such common names that, no matter how far you configure the search terms, there really just isn't a way of providing searches that bring up only their research results. To give you an indication of who they are: if we look at the top 20 or so researchers who are getting lots and lots of false positives, we can see that they predominantly have Asian surnames of some sort, and really there is no help for these people in terms of configuring search terms. Some researchers just need ORCIDs. So as part of phase two of our Symplectic rollout we'll be targeting these researchers for ORCID iDs first, because it's these researchers for whom having an ORCID iD can help the most. Our strategy for doing that will again be to use Symplectic, getting a researcher to go in and configure their ORCID through this sort of interface.
So that's one use of ORCIDs for our researchers. But the real reason we started getting interested in ORCIDs was not for our researchers but for our graduate students. With our graduate students we've got the problem of trying to work out what they have done after they leave the university. After they've left the university, we've got no recourse to say, "Hey, we think you've done this publication. Could you please confirm it?"
So we need to know what they have published based on the research that they conducted before they left, and whether we can claim that in HERDC. We also really want to know where they have gone, and what their academic career looks like three years after they've left the university, for instance.
For these reasons we've now got a university policy which requires graduate students to have an ORCID, and we will be managing this process through Symplectic. Our idea is to encourage graduate students not just to have an ORCID but to actually use that ORCID as an active part of their research career. So it's not just having one; it's actually owning it and ensuring that all of their outputs are going to be connected up to it. Once we've done that, we've got a mechanism to glue our knowledge about who our graduate students were to an evolving data set of what they're doing in the open world.
On our approach for graduate students: unlike some other ORCID implementations for graduate students, we won't be minting ORCIDs for them. The reason is that, although there are methods where you can create ORCIDs on behalf of your graduate students, or for all of your researchers, the risk is that you end up with ORCIDs that have been created for graduate students which they don't own. You think you've created an ORCID for them, but basically they've just ignored the email that came through from ORCID once it was created. There's an ORCID out there for them.
But the first time they go to use an ORCID, they'll just create a new one, because they've completely forgotten that that process has happened. So the risk of creating unknown ORCIDs is too high for us to consider minting them for our graduate students. We will be emailing our graduate students, just like ORCID does, but we will be asking them to go into Symplectic and configure their ORCID in their Symplectic account. The process within Symplectic is quite straightforward: if users don't have an ORCID, they can get one as part of the process of configuring their ORCID in Symplectic, and basically it's maybe one or two clicks more than doing it through a minting process. But the end result, if we can get students to do that, is that we now have a process where we can track those students that have ORCIDs in our system, and we also know that those students have actually undergone some activity which indicates that they might have a better chance of owning that ORCID. We've also now got a solid practice to track those who haven't engaged with the system, with follow-up emails, so we can really track our success in how that flows.
So that's really the ORCID story. As for reflections on the trajectory of this: we've talked about moving from data entry to data glue, and we've talked about how we harvest publications and how we need to use ORCIDs to glue our knowledge of graduate students to evolving information in the world about them. But I guess the reflection is that it's not just publications that we want to glue back to our data sets. Increasingly it's grants, not least because we've now got NHMRC requirements to track the publications that belong to those grants and ensure that we've got open access publications for them. It's also research data.
It's also potentially academic history for researchers, and we really feel that ORCIDs are a key piece of the puzzle to help us do this in the future.
We were also asked to reflect on the ORCID workshop. I guess the biggest takeaway I had from the ORCID workshop is that there are multiple levels of ORCID subscription: you can subscribe as an institution at various different levels, and you can also subscribe as a nation, which offers discounts. I got the sense from the room that one of the things we should ask CAUL about is whether we can pursue a national subscription through the CAUL membership. So that's just my final thought.