Welcome to the webinar on Implementing Pure at the University of Western Australia. My name is Keith Russell, I work for ANDS as a Partnerships Program Manager, and I'm your host for today. A big thank you to my colleagues Suzanne Sabine and Natasha Simons, who are co-hosting the webinar with me. This webinar is part of a larger series we've been organising here at ANDS on what we call Research Data Information Integration, together with CAUL, the Council of Australian University Librarians, and I'd like to acknowledge and thank CAUL for broadcasting and making this possible. The first five webinars in this series have been about systems in use at institutions for capturing information about research data: research data management planning tools, and systems for managing research data ethics, data storage, data publishing and data preservation. Just to give you a bit of context: research is changing globally, and one change is the greater focus on research data as a real first-class research output. We're collaborating with the research sector here in Australia to promote better management of research data and to encourage a change in research culture, so that research data will actually be recognised as that first-class research output, and will be better managed, better described and published. That in turn will allow for better reuse and availability of research data down the track. One of the elements in this process is capturing information about research data and making sure that it is discoverable: not just the data itself, but also the associated information about the data.
As we've been talking to institutions around Australia, we've noticed that a lot of them are revisiting the systems they have, or had, in place, and these systems are often used to collect information about research data. One of the institutions we were talking to was the University of Western Australia, and they mentioned that they were implementing Pure as a research information management system (RIMS), and not only implementing Pure as a RIMS but also looking at how they could capture information about research data in that system. We said, ah, that's really interesting, and we'd love to hear a bit more. So we're really grateful that Katina was willing to provide some time, prepare some slides and give a presentation about how the implementation of Pure has progressed, and especially, from the research data side of things, what it means to capture information about research data in Pure and make that available. Please note that Pure is just one of several research information management systems out there on the market, and into the future we'd also like to share experiences with other systems in this space. We'll be managing this webinar today, trying to bring the discussion together, and taking any questions you might have. I would now like to introduce our speaker for today. Katina Tufeces works at the Library of the University of Western Australia as the coordinator in the Research Publications and Data Services unit. In this role she was involved in the implementation of Pure. I'd like to welcome Katina and thank her again for making time available to share her experiences with us. Over to you, Katina.

Thank you, Keith, and thank you to ANDS for asking me to present on this topic. I'm obviously here to talk about UWA's research data repository, and the data migration project in particular.
We're currently nearing the end of this project to migrate from our current system, which is DSpace, to Pure; on this timeline, we're at the point marked by the red arrows. To give you some context on how we got to where we are now: back in 2009 to 2013, the UWA Library received ANDS funding for three major research data projects, and two of these projects produced our current data repository systems: Research Data Online, which is the DSpace system, and Research Data Hub, which is our harvester from the DSpace system into Research Data Australia. I'll soon display a slide to show how complex these systems were and how they interacted with each other. Research Data Online, the DSpace system, allowed our researchers to directly upload their datasets via their Pheme login. In total, we have 136 datasets in RDO at the moment. I'm not going to go into great detail about this current system, because we will be decommissioning it soon, but I'd like to take you through some snapshots so you can see how our current datasets are viewed and what our researchers are currently using. This first slide shows a list of communities in the left-hand menu, a typical DSpace kind of environment, with the collections within each community at the bottom of the screen. The next slide is an example of a dataset record as it currently sits in the RDO system; that's one of our major open data projects, which feeds into Research Data Australia. And that schematic diagram I was talking about: this shows all of our UWA sources and how the systems interact with each other, particularly for our research datasets. The web applications, DSpace in particular, were highly customised to meet the initial requirements of the ANDS projects.
So you can see that DSpace is used to upload the datasets and import descriptive metadata, and VIVO was used to link grants, researchers and publications, feeding straight through to Research Data Australia. Our intention at the start of the project was to eventually use VIVO as a researcher profiling system, but that never eventuated, which is very common in these sorts of projects; you can never tell what's to come. If all that looked confusing, well, it is, and we really wanted to simplify our systems within the institution. We also wanted to address longstanding issues with our current system, particularly with community creation. You saw those communities on the left-hand side; those couldn't be created by the library. We had to work with IT and their priorities to make code and script changes when researchers wanted to upload their data. It was a very lengthy, tedious process; researchers were left waiting, and that's never an ideal situation. As a result, the system was also considered very clunky, and with good reason. Also, the DSpace version we're using is so highly customised that we can never update it, and there are consistent upload tool errors, along with a really dated appearance and functionality. DOI creation is also a manual process; it's not done automatically when researchers upload their datasets. And there's no reporting mechanism in the system, so everything needs to be done manually there as well. DSpace, as I said, needs VIVO working hand in hand to harvest into ANDS. Unfortunately, we get consistent harvesting errors, and the staging database being used to link datasets, people, grants and publications has recurring issues of its own. So we have an aging system with all of its existing problems. It's served us well, but it's definitely time for a change. And we recently acquired a new system to manage our publications and theses.
This means that we're effectively managing two separate systems, two separate repositories, which is completely unnecessary. Pure is a current research information system, a CRIS, which holds our publications and theses, and these are linked to our grants and our researcher profiles. Pure enables us to build reports, carry out performance assessments, manage researcher profiles and enable research discovery. Digital theses records are manually created in Pure by our staff in the Research Publications and Data Services unit here at the library. The research output content sits under the thesis template; that's just a snapshot of the admin section, and it's also what researchers see when they want to upload a thesis. On the left-hand side, the theses appear in the menu. We integrated PlumX last year: we built an integration with Pure to create a PlumX dashboard and track metrics using data from the repository. We collate the metrics by various groups, researchers, school and faculty, and we embedded the PlumX widget into the UWA Research Repository. We're also currently investigating the free Altmetric API, to implement and display Altmetric donuts on the repository output records; the donuts will allow users to click through to the full Altmetric detail pages. We have also recently integrated Pure with ORCID, and researchers can now log into Pure and create and connect their ORCID iD. Once a researcher has created their ORCID iD, Pure is authorised to push outputs and limited affiliation data to their ORCID record. This functionality was made available when Pure was rolled out to researchers last month, in April. Datasets, unfortunately, can't be pushed from Pure to ORCID, which is something we've discovered. So we embarked on the DSpace to Pure migration project.
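On the free Altmetric API mentioned above: Altmetric publishes an unauthenticated details endpoint keyed by DOI, which returns JSON including an attention score (or HTTP 404 when a DOI has no attention data). A minimal sketch of building the request URL and reading the score; the sample response body is invented for illustration, and Altmetric's current terms of use should be checked before relying on the free tier:

```python
import json
import urllib.parse

# Public Altmetric "details" endpoint; GET <endpoint><doi> returns JSON.
ALTMETRIC_API = "https://api.altmetric.com/v1/doi/"

def altmetric_url(doi: str) -> str:
    """Build the details-API URL for a DOI, percent-encoding as needed."""
    return ALTMETRIC_API + urllib.parse.quote(doi, safe="/")

def attention_score(body: str) -> float:
    """Extract the Altmetric attention score from a JSON response body."""
    return float(json.loads(body)["score"])

# In production you would fetch altmetric_url(doi) with urllib.request or
# requests; here we parse a trimmed, invented sample response instead.
sample = '{"doi": "10.1000/example", "score": 42.5}'
print(altmetric_url("10.1000/example"))
print(attention_score(sample))
```

A repository page would typically call this per output record and only render a donut when the lookup succeeds, so records without attention data stay clean.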
The project aims to deliver a range of benefits: improving the architecture by consolidating our research repository systems; reducing the time IT spends managing and maintaining the old DSpace system; giving a better user experience to researchers who want to upload their datasets quickly; and making UWA researchers compliant with new institutional requirements and with publisher and funder research data management requirements. There is also greater potential for ongoing development of the service, because Pure is part of a greater global community of users who use the system for research data management, particularly in Europe and the UK, and we get enhanced system support through the vendor partnership: when we have issues, we can lodge JIRAs and have Pure support help us out. Pure also enables us to link together datasets and publications arising from the same research grant, which is exactly what we needed. Our project team and project board members are a mixed group of individuals from both the library and IT. UWA has recently gone through a renewal process, and what was known as the Research unit in the library is now known as Research Publications and Data Services. This small team created quite a few documents for the project, and these are just a few. Currently we're going through the implementation process and redesigning the Pure portal in development so that we can roll out quickly. As with every project, there are always showstoppers, and it was imperative that we ensured there was no data loss in moving the current datasets to Pure: no loss of DOIs, of any of the metadata that was input, or of the associated files. We really wanted to maintain the ability to harvest into Research Data Australia, obviously meeting the minimum RIF-CS requirements for that, and to enable researchers to again upload their datasets with minimal intervention from library or IT staff.
And one of our hopes was to automatically produce DOIs for researchers as they upload their datasets. So, looking at the migration project: we asked Elsevier, the owner of Pure, to automatically transfer the existing datasets from DSpace to Pure, but for various reasons we decided that we should manually transfer our 136 datasets instead. It would have taken a lot of time to map the transfer, which we didn't have, and there were so few datasets that the risk of human error was considered minimal. There were also bound to be errors with an automatic transfer where fields don't map properly, and we would have needed to manually test the transfer anyway. Pure also has a lot more fields available for datasets, such as licensing, and we would have needed to contact each researcher individually to determine the licensing for each dataset already in DSpace, so we wanted to minimise doubling up on our resources. We were also told that we wouldn't be able to automatically transfer the dataset files themselves: the metadata records would be fine, but the files would be lost, so we would have had to manually attach them to each record anyway. So we did the whole lot ourselves in a matter of a few days, and it was actually pretty simple. I also want to highlight one of the recent accomplishments in the project, and that's the development of a crosswalk from Pure to Research Data Australia. I want to mention Melanie Barlow, a technical analyst for ANDS in the ACT; together we developed a crosswalk which will be beneficial for any other institutions who want to harvest their datasets into Research Data Australia in the future. Pure doesn't include datasets in the OAI handler, and therefore we needed this method to harvest datasets into RDA. We've raised this as an issue with Elsevier, and as a potential development for future releases of Pure.
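For reference, Research Data Australia harvests repositories over the standard OAI-PMH protocol (with RIF-CS as the metadata format). A small script can build the standard ListRecords request and sanity-check a response, which is the kind of check a crosswalk needs after each system upgrade; the endpoint URL and metadata prefix below are illustrative assumptions, not UWA's actual configuration:

```python
import urllib.parse
import xml.etree.ElementTree as ET

# OAI-PMH response elements live in this namespace.
OAI_NS = "{http://www.openarchives.org/OAI/2.0/}"

def list_records_url(base_url: str, metadata_prefix: str) -> str:
    """Build a standard OAI-PMH ListRecords request URL."""
    query = urllib.parse.urlencode(
        {"verb": "ListRecords", "metadataPrefix": metadata_prefix})
    return f"{base_url}?{query}"

def count_records(oai_xml: str) -> int:
    """Count <record> elements in an OAI-PMH ListRecords response."""
    root = ET.fromstring(oai_xml)
    return len(root.findall(f".//{OAI_NS}record"))

# Hypothetical endpoint and prefix; RDA ingests RIF-CS metadata.
url = list_records_url("https://repository.example.edu.au/oai", "rif")
# Trimmed sample response standing in for a live fetch:
sample = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record><header><identifier>oai:example:ds1</identifier></header></record>
    <record><header><identifier>oai:example:ds2</identifier></header></record>
  </ListRecords>
</OAI-PMH>"""
print(url)
print(count_records(sample))  # a post-upgrade check could alert on zero
```

Run against the live feed after an upgrade, a record count of zero (or a parse error) would flag a broken harvest before RDA notices.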
But for the time being, we've developed this crosswalk as a workaround, and it has been set to harvest at regular intervals. None of this has been done outside of the demo environment yet, but we know that it works. Along with this come all the obligatory procedures and test scripts that we'll need to run each time we upgrade Pure to a newer release, just to ensure that the harvest isn't affected. I mentioned earlier that another of our must-haves for the project was to automatically generate DOIs for datasets. Currently, Pure is configured to mint DOIs, but only directly from DataCite itself, and that created a bit of a roadblock for us. From our point of view, it would have been great if we could configure Pure to mint the DataCite DOIs through the ANDS back end. We're currently in discussions with ANDS and Elsevier, and we do have a manual workaround for this. When a researcher is submitting their dataset, they have an option to send it for validation or to keep it aside so that they can work on it later. What we've included in that step for our staff is that, as we're looking through the dataset to validate it, we go to ANDS and manually create a DOI and put it into the submission for them. Of course, that's not an automatically generated DOI that they don't have to wait for, but it's the next best thing at the moment. As we speak, we're testing the datasets in the development portal site for Pure, and this is the datasets browse page; we've just got a couple of test datasets there. Once this is complete, we will move to production and begin the rollout to our researchers. If we click on the datasets in the left-hand menu (this is what the public will see), the full list of datasets appears, and you can sort them by title, created date or modified date.
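On the DOI side: Pure talks to DataCite directly, while Australian institutions minted DataCite DOIs through ANDS's service, and I won't guess at the ANDS endpoint here. Purely as an illustration of what a mint request contains, this is roughly the shape of a body for DataCite's own REST API (POST /dois, HTTP Basic repository credentials); every identifier and URL below is hypothetical, and 10.5072 is DataCite's designated test prefix:

```python
import json

def datacite_payload(doi: str, title: str, creators: list,
                     publisher: str, year: int, url: str) -> dict:
    """Build a JSON:API body for DataCite's REST API (POST /dois).

    "event": "publish" registers the DOI and makes it findable;
    omitting it creates a draft DOI instead.
    """
    return {
        "data": {
            "type": "dois",
            "attributes": {
                "doi": doi,
                "event": "publish",
                "titles": [{"title": title}],
                "creators": [{"name": c} for c in creators],
                "publisher": publisher,
                "publicationYear": year,
                "types": {"resourceTypeGeneral": "Dataset"},
                "url": url,  # landing page the DOI will resolve to
            },
        }
    }

# Hypothetical example record; a client would POST this JSON to
# https://api.datacite.org/dois with the repository's credentials.
body = datacite_payload(
    doi="10.5072/example-dataset-001",
    title="Example survey dataset",
    creators=["Researcher, Example"],
    publisher="The University of Western Australia",
    year=2018,
    url="https://research-repository.example.edu.au/datasets/example",
)
print(json.dumps(body, indent=2))
```

Whether minting goes through DataCite directly or via a national service like ANDS, the metadata required is essentially the same; only the endpoint and credentials differ, which is exactly where the configuration roadblock described above arises.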
This is an image of the dataset metadata record in the front-facing portal, and there are four tabs across the top. The overview tab shows the full metadata record, including the openly accessible files on the right with the DOIs. We have an internal and an external creator in this example, and the internal creator links to the researcher's profile. This particular dataset, like all datasets, can be linked to grants and to projects (that's what we've called them in Pure: projects). Related links, such as websites, are on this page, and if we scroll down the screen, you'll see links and related content, such as other datasets they're related to. Remember, this is the development site that we're testing all of this in. We also have prizes, activities, equipment, and press and media, which we haven't enabled in production for our research publications and theses at the moment, but it's definitely something we're looking into currently. There is also a statistics tab for each dataset. We're just having a look at what the Scopus citations actually mean for datasets there; we're not entirely sure we've got a grasp on that yet. But the PlumX integration allows us to gather dataset stats, and at the very bottom of that screen there's a graph which shows the number of downloads from the Pure portal itself. As this is a development site, this isn't visible if you go to our Pure portal at the moment. The reference style tab generates a DataCite citation for the dataset. So that was the front end; now for the admin view of our datasets, which is what Pure looks like when a researcher logs in with their Pheme account and sees a more personalised view of their profile and the datasets they're linked to.
In the left-hand menu there, you'll see that this is the admin view, and this is where we see all of the 136 datasets which we have transferred from DSpace, waiting for us to make public once the rollout occurs. If we open up an existing dataset from within the back end, this is what it looks like when a researcher wants to upload a dataset: they answer or populate all these fields as appropriate. Scrolling down, you've got the title, description, temporal coverage, geolocations, and all the people associated with the dataset and the various roles that they have. Further down the screen there's the data availability section, where you can drag and drop your dataset files directly and apply a licence to them; you can also apply an embargo, or make them private, restricted or public. You can also add physical data (that's right at the bottom of the screen) and associate any physical data with this dataset, along with any links and any legal and ethical constraints. All of these items are visible in the portal. And at the very bottom of the screen, when researchers want to save their work, they don't save it as validated; they save it as 'for validation'. Really soon, hopefully within a few weeks, we'll be implementing the datasets module in the portal. When this occurs, we're going to disable DSpace for dataset submission (at the moment, our researchers are still submitting to DSpace) and redirect all the current handles and DOIs from DSpace into the Pure portal. A researcher pilot group will then test the procedures and provide feedback. Following the items in our communication plan, we'll roll out the datasets module and promote the service to the faculties via emails, the website, liaison communications, and workshops.
We're investigating the equipment module, because we've had some interest from research groups here at UWA, and hopefully this will be enabled in the near future. This will be good because we'll be able to link publications, theses and datasets with the same grant and piece of equipment. Recent policy changes at UWA are expected to make an impact on the use of this service for datasets. The UWA Code of Conduct for the Responsible Practice of Research will now ensure that this service is well used (and it's another reason why we're up against the clock with this project), because research data related to publications must be made available for discussion by other researchers, and this must be managed through the UWA Research Repository. So I thank you for listening, and I invite any questions that you might have.

Thanks, Katina. That was really interesting and a great overview of what you've already got in place. The first question is: how do you ensure quality metadata when the researchers upload the data themselves?

Well, there is the validation step, and we have some procedures around that which we've developed for our staff here in the library, plus the standard checking of the dataset record itself. Paula's also asked another question there around the CC licences. We've been giving a lot of information sessions around CC licences here at UWA with our current service as well. Although our DSpace system didn't have the ability to assign CC licences, we have been advising researchers on how to apply them manually to their datasets, and we're going to continue with those researcher workshops. They're incorporated with other library services, such as EndNote workshops and various other workshops that we have in the library.

How easily were PlumX and Altmetrics integrated into Pure?
Well, we had one of our staff members intimately involved in the PlumX and Altmetrics integration, and it was relatively easy to do. If you want any more specific information, we can send that through to you; just flick me an email if you're interested.

And then Jessica asks: was Pure implemented as part of a wider university-wide project involving your research services and faculties? It was: the Office of Research Enterprise were heavily involved with us when we implemented Pure. And she says she's interested in how the grant data and profiles were implemented, which sounds like it was outside of the scope of the library's part of the project. I don't know if I fully understand the question, but it was always part of the scope of the project. Pure was bought by the university in order to create a better researcher profiling system and to streamline the grant data straight into it.

So, Katina, I was just wondering: you talked about DOI minting and the possibility now to mint DOIs using Pure. Do you have any views or recommendations for researchers on minting DOIs as they're submitting their datasets? Is it standard procedure now that whenever they submit data they'll automatically get a DOI, or is that something they have to turn on? Do they get some advice from the library around that?

Well, with Pure, we can enable it to automatically mint DOIs, but we just can't get that direct connection to DataCite. DataCite mints DOIs in Australia via ANDS, and Pure requires a username and password in order to automatically mint DOIs, which is set up differently by ANDS in Australia. So we can't automatically mint DOIs in Pure from UWA; instead, we've included it in our validation steps.
So when we're looking at a dataset in the library, something that's been submitted, one of those steps is to go to the back end of ANDS, mint a DOI manually for the researcher, and put it in there. If they already have a DOI from another location (let's say they've already put their data in Figshare and have a DOI), then, as the researcher submission guides tell them, they must insert it there, and we won't mint a DOI on their behalf; we'll use that same DOI.

Okay, thanks. Antoinette's asked a question about training materials and communication.

Yes, we've already developed the guides for the researchers, and the workshops are in the making at the moment. In terms of the most successful channels of communication in introducing the new system: well, we went straight to the Deputy Vice-Chancellor (Research), and she was heavily involved, and we went to each of the faculties and the deans, and they filtered everything down for us. And we communicated heavily with our liaison librarians and senior librarians, just so that they have a grasp on the new system as well.

Is there any provision for enabling Pure to receive dataset metadata from other systems, e.g. microscopy systems, without manual entry by researchers?

That's an interesting one, Mark, because recently we've been approached by the Centre for Microscopy, who are importing their datasets into a different system, and there has been some discussion around that. So that's a watch-this-space, and they sound pretty promising about that being able to happen.

I think that's all of the questions. OK, well, in that case, I'd like to thank you very much for a really interesting presentation, a great overview of the work you've done at UWA, and a really interesting example of how you can use this system to capture information about research data and use it to publish.
So, well, thank you all for attending and listening and asking your questions. Thank you very much for your time. Thank you.