Hello and welcome to "If ETD data falls in a generalist repository, does it make a FAIR sound?" My name is Andrew McKenna-Foster, I'm a product specialist at Figshare, and I'm happy to be presenting this poster at the USETDA 2022 conference.

We're looking at a zoomed-out image of the poster; I'm going to zoom in and go through each section. As an introduction: we conducted this study because we were interested in how graduate students share data and other research outputs related to ETDs. It's especially important that researchers in all disciplines understand how to share research outputs in a FAIR way: findable, accessible, interoperable, and reusable. Publisher mandates and policies and funder mandates and policies are expanding; we just saw that with the Office of Science and Technology Policy memo that came out recently. Graduate students who are just starting their careers can benefit from sharing research outputs associated with their thesis or dissertation: it's good practice, it's something that can be cited, and it gives them credibility as they begin their careers.

So we set out to assess metadata for objects that have been shared in relation to an electronic thesis or dissertation, and we were looking specifically at metadata records in the Figshare repository universe. Individual researchers can use figshare.com: they can have a free account and share outputs there. Publishers, government and funder agencies, and institutions like universities can license Figshare and have their own repositories. All of these repositories are searchable through a federated search interface and the API.

So we collected records for datasets, figures, and media where the metadata contained the words "thesis" or "dissertation". We collected 1,000 records searching for each of those output types, and we ended up with 2,606 records after cleaning them to make sure, as best we could tell, that they were part of an ETD.
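The collection step described above could be sketched with the public Figshare API. This is a hypothetical illustration, not the study's actual script: the endpoint (`POST /articles/search`), the field names, and especially the `item_type` codes (1 = figure, 2 = media, 3 = dataset) are assumptions drawn from the API documentation and may differ from what was used.

```python
# Hypothetical sketch of the record-collection step against the
# Figshare API (https://api.figshare.com/v2). Field names and
# item_type codes are assumptions, not confirmed from the poster.

def build_search_payload(item_type, term, page_size=100, page=1):
    """Build the JSON body for POST /articles/search."""
    return {
        "search_for": term,      # full-text search over the metadata
        "item_type": item_type,  # restrict to a single output type
        "page_size": page_size,
        "page": page,
    }

# Assumed codes: one search per output type, per search term.
ITEM_TYPES = {"figure": 1, "media": 2, "dataset": 3}

payloads = [
    build_search_payload(code, term)
    for code in ITEM_TYPES.values()
    for term in ("thesis", "dissertation")
]
```

Each payload would then be POSTed (with an API token) and paginated until enough records were gathered for each type.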
We assessed them for FAIRness, and we were looking at things that the platform doesn't do automatically. Figshare automatically provides a persistent identifier, it provides a license (or gives you the option to apply one), and it uses a standard metadata schema, so those boxes are already checked for FAIRness. What it doesn't do is know what the research is about; it relies on the user to enter that information.

So we were looking at findability: are the files described with rich metadata? Interoperability: are there references to related objects, especially the thesis or dissertation itself? And reusability: does the metadata make the file reusable by richly describing the research? We had proxy measures to assess these. We looked at the length of the title and description for metadata richness, we looked at the number of keywords and categories, and we looked at the number of reference links in the metadata for interoperability.

Our records came from figshare.com, and we assumed those were shared directly by students. We also had records shared through institutional repositories from universities, and we assumed those had some assistance from curators. So we were actually able to compare the FAIRness of each set of records.

Here's what the records looked like. Briefly: on the left is a dataset (don't worry about reading the tiny text) from the University of Sheffield. In the upper right is a media record for a video, and in the lower right is an image, what Figshare refers to as a figure. You can view all of these via their QR codes, and the metadata is shown down below each one. The red text that you see in two of the images is a link to a related record, perhaps the thesis. So here are some results.
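The proxy measures above are simple counts over each metadata record. As a minimal sketch, assuming field names that mirror Figshare's metadata (`title`, `description`, `tags`, `categories`, `references`); the example record is invented:

```python
# Proxy FAIR measures for one metadata record: title and description
# length in words, plus counts of keywords, categories, and references.
# Field names are assumed to mirror Figshare's metadata.

def proxy_measures(record):
    return {
        "title_words": len(record.get("title", "").split()),
        "description_words": len(record.get("description", "").split()),
        "keywords": len(record.get("tags", [])),
        "categories": len(record.get("categories", [])),
        "references": len(record.get("references", [])),
    }

# Invented example record for illustration only.
example = {
    "title": "Sediment core data supporting PhD thesis chapter 3",
    "description": "Raw and processed grain-size measurements from six cores.",
    "tags": ["sedimentology", "grain size", "thesis data"],
    "categories": ["Earth Sciences"],
    "references": ["https://doi.org/10.xxxx/example-thesis"],
}
```

Running `proxy_measures(example)` yields the per-record counts that can then be aggregated and compared across the two groups of records.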
These are box-and-whisker plots showing the counts of words in the title and description, the counts of categories and keywords, the counts of references, and the views per month of the records. The left boxes are for institutional repository records; the right boxes, in the darker color, are those shared directly on figshare.com, we assume by the students themselves. The asterisks indicate significant differences. You can see that records associated with institutional repositories have longer titles, longer descriptions, more keywords, and more references. There was no difference in the number of categories between the record types. Records in institutional repositories also had more views per month; whether that's related to the richer metadata, we don't know. It could simply be that those repositories are linked from university websites that already get a lot of traffic.

Most of these records were published in the last three years, for both figshare.com and institutional repositories, but there were many more records from earlier years on figshare.com. The FAIR principles were published in 2016, so those older records could be one reason the figshare.com numbers are lower. However, if you look at those measures over time, there is no real pattern. In the lower image, the four charts of title length, description length, keywords, and references show no obvious trends. Institutional records have seemingly always had more references and seem to have longer titles in recent years, but the figshare.com records aren't clearly going up or down. Perhaps if we redid this analysis in a couple of years, we'd hopefully see an upward trajectory for both.

So our conclusions are that ETD-related data and other outputs shared in institutional repositories are FAIRer than those shared directly by students in a generalist repository, and they seem to have higher views per month.
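The group comparison behind those plots amounts to splitting records by source and comparing a distribution statistic per group. As an illustrative sketch only, with invented sample values and a simple median in place of the poster's actual significance testing:

```python
# Illustrative group comparison: split records by source and compare
# the median of one proxy measure. Sample values are invented; the
# poster's analysis used significance tests, not shown here.
from statistics import median

records = [
    {"source": "institutional", "references": 3},
    {"source": "institutional", "references": 2},
    {"source": "figshare.com", "references": 0},
    {"source": "figshare.com", "references": 1},
]

def median_by_source(records, measure):
    groups = {}
    for r in records:
        groups.setdefault(r["source"], []).append(r[measure])
    return {src: median(vals) for src, vals in groups.items()}
```

In practice a non-parametric test (the kind typically reported with asterisks on box plots) would be applied to each measure rather than eyeballing medians.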
So that's a benefit to students. There doesn't seem to be an increase in metadata richness or interoperability over time for student-shared records. There are a lot of trainings and workshops out there, and increasingly student mentors have data-sharing skills; perhaps it's still too early to see the effects of those programs and workshops in published metadata.

Figshare has been able to use this information to produce some materials, and plan others, that might help students in the future. We look at descriptive, FAIR metadata as having a descriptive title, background and methods in the description, five or more keywords, and of course reference links. We do have a new user interface in development that may help here: it may be easier for students to use, and it may affect these measures, so we'll account for that if we redo this analysis in the future. We have launched a help page to help students and curators fill out metadata related to a thesis or dissertation, and we are planning webinars for both students and curators that incorporate some of this information.

All of this is available at this Figshare record. Please be in touch if you have any questions or want to talk about ETD data. Thank you for viewing this presentation.
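The four criteria above (descriptive title, background and methods in the description, five or more keywords, reference links) can be framed as a simple checklist. A hypothetical sketch follows; the numeric thresholds for "descriptive" and "rich" are illustrative assumptions, not Figshare's actual rules:

```python
# Hypothetical FAIR-metadata checklist matching the four criteria in
# the text. Thresholds (5+ title words, 50+ description words) are
# illustrative assumptions only.

def fair_checklist(record):
    return {
        "descriptive_title": len(record.get("title", "").split()) >= 5,
        "rich_description": len(record.get("description", "").split()) >= 50,
        "enough_keywords": len(record.get("tags", [])) >= 5,
        "has_references": len(record.get("references", [])) >= 1,
    }

# Invented example: good title, description, and keywords,
# but no reference link back to the thesis.
record = {
    "title": "Survey responses on campus food insecurity, 2019-2021",
    "description": " ".join(["word"] * 60),  # stand-in for a real description
    "tags": ["food insecurity", "survey", "students", "campus", "2021"],
    "references": [],
}
```

A checklist like this could flag, before publication, exactly the gap the study found most often: a missing link back to the thesis or dissertation.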