 So we'll go ahead and get started. So welcome everybody today, thanks for joining us. This is saying more with OSF metadata to support data sharing policy compliance. So hi everybody, my name is Gretchen Gigan. I'm a product owner here at the Center for Open Science and I'm gonna be talking with you today about metadata. Just as a heads up, we are monitoring the chat and the question screen. My colleague Mark is here on the line with me to monitor those things. I'd say if you have a specific question, please put it in the Q&A, which you can find in your Zoom tools. Other kinds of just commentary can go in the chat, but if you put things in Q&A, then we can keep track of whether or not the questions get answered. So today we're gonna talk about metadata. This is inspired by the release of some new metadata related features here at OSF. And this is really just the beginning. We'll be adding more to these features in the coming months, but today we wanna talk about our new features and we'll talk about metadata in general to get started, what it is, why it's important. And then we'll do a tour of the new metadata features in OSF and we will talk about what's coming next after that. So let's start with a little review of metadata. So very simple definition that's often given about metadata is that it is data about data. And that's very pithy, but it isn't always easy to understand. So I thought we could take a look at some examples. So a classic example of metadata starts with some data text in this case, a book about metadata as it turns out. So the text itself is the data. And then the metadata is, oops, excuse me, the metadata is the metadata about the book, the information about the book. So in this case, it's the record in the catalog or other kind of system. So that metadata record includes a listing of details about the book, the title, the year of publication, ISBN, language, things like that. 
So book metadata is actually pretty easy to understand and it's often really easy to find. We interact with it pretty readily. But other metadata can be kind of hidden. So for example, every email is data in and of itself, but it also has a header with information about who sent the message and when. And then there's also usually metadata that's hidden from our view that tracks identifier numbers and IP addresses. So sometimes we don't always necessarily see metadata. And of course, metadata isn't only for text. This is a picture of my cat Clementine who was also sleeping on this couch right back here. So in this case, the picture is data. And the metadata that's encoded in the JPEG file is shown in the screen on the side here. This is from a tool called Adobe Lightroom that shows you the metadata embedded in images. So there's some obvious stuff there like file name, the date and time created and technical characteristics of the image like the size and the file format. But my camera also captures things that I didn't realize it was capturing like the geo coordinates of where the picture was taken. And then if I want to using a tool like Lightroom, I can go in and manually add even more information like a title, a caption or a copyright statement. So finally then we can see that in OSF objects, we also have both data in the files and components or other elements, but we also have metadata titles, contributors, activity logs, information like that. What we'll talk about today are some additional metadata elements that have been added to OSF objects and how you can use them to increase the impact of your research. So let's take a step back and talk about how metadata fits into the landscape of research. This image takes the imagery of an iceberg and applies it to research. 
So our work has lots of elements under the surface, lots of actual data and materials, but the metadata and the data management plan are the two aspects that bring that to the surface and make it findable. And it leads us to all of that good data that's underneath. So it's true that some databases and search engines allow you to search the actual content that's under the surface, but metadata still plays a crucial role in a couple of ways. So in the most obvious case, metadata provides searchable text for non-textual materials like we showed with the image, but metadata also gives you the opportunity to synthesize and summarize and can allow you to give a kind of simpler overview or introduce the entirety of material in ways that it doesn't explicitly say for itself. And it gives you a place to say even other information that may not be in there at all. So who funded your research or what the types of materials are that you have, is this a conference proceeding or a journal publication? Is it raw data or anonymized data? Metadata gives us a place to put all of that information. And metadata is then a fundamental part of the whole research process, not only creating metadata, but that metadata then allows you to tie your work together as you move through the process. So if we start out by using things like persistent identifiers and standards for creating our metadata, that makes our information more easily identifiable, which then makes our data more open and easy to interpret. It can then also help link your data to the other materials that are relevant like protocols and scripts. It helps to fully document things like your preregistration and helps you to tie your preprint or your final output or outcome to all of the pieces of research that led to it. So before we move on to the demo of new features, I wanna talk a little bit more about why metadata is the specific tool we wanna advocate for. So the first thing we wanna talk about are the FAIR principles. 
So FAIR is a guideline that helps us make sure our metadata is as complete as it can be, but it also articulates the benefit of having metadata, what metadata can do for us. So FAIR is an acronym for findable, accessible, interoperable and reusable. So specific aspects of metadata contribute to these factors. So earlier I gave the example of metadata for an image. In that case, the text may be the only thing that makes that data findable. And even textual data is far more easily found when metadata is descriptive and uses known identifiers or terminology. Another important part of sharing research is being able to define how data can be used. So metadata is the place where we can describe how data can be accessed and used, in things like licensing, for example. Having metadata also makes it possible for OSF or any other system to share data with each other. When we have good metadata about OSF objects, for example, it makes it possible for us to share that data in an interoperable way, helping your research show up in other places like DataCite Commons or Google Scholar. And ultimately, all of these practices lead to data that's reusable, by explaining how resources are structured, collected or generated, or how to read or use them. Metadata turns materials from single use to multi use. So that's all really nice, but largely theoretical. It's a best practice, but that's really changing in ways that lead to another really tangible benefit of metadata. The past few years have seen a number of mandates and policies from major funding and governmental agencies requiring that these best practices be followed in order for projects to be recognized and funded. So a new data sharing policy from the NIH just went into effect on January 25th. And that calls for sharing a set of common data elements, including the funder and the resource type information that our new metadata changes in OSF will support. 
In addition to specific guidelines like those, other agencies and institutions are supporting the creation of more mature infrastructure and data sharing. So a recent policy from the White House Office of Science and Technology Policy and the UNESCO Recommendation on Open Science both envision a research ecosystem that will need enhanced metadata to function. So metadata is becoming not just a best practice, but a requirement. So metadata in the OSF will now allow you to meet those mandates and to follow best practices to make your data FAIR. Metadata in OSF follows the DataCite Metadata Schema, which is increasingly a de facto standard for research repositories. It allows you to input terms from controlled lists, ensuring that the metadata you create is interoperable with other systems and requirements. So those include standardized lists of resource types from the DataCite schema, funders from the Crossref Funder Registry, licenses from Creative Commons and other licensing groups, a taxonomy of disciplines that is commonly used in institutional repositories, standardized formats for dates, author identifiers from ORCID, and institutional affiliations from ROR, which is the Research Organization Registry. In addition, materials themselves can be assigned persistent digital object identifiers, DOIs, and you can be sure that those will always be available. So the unique feature of OSF is that our infrastructure allows you to create relationships across the research lifecycle, from preregistration through the creation of data and other project materials through the publishing of outputs like preprints. The new features will mean that behind the scenes we'll be creating metadata that follows a standards-based model for all of those materials, and that will create a web of interrelated data from the highest level, at the kind of definition of the project, down to individual files. So next I'm gonna do a little demo of what this actually looks like. 
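As a rough illustration of what "terms from controlled lists" means in practice, here is a minimal Python sketch. It is not OSF's implementation or data model: the record layout, the field names, and the `validate_record` helper are all assumptions made up for this example, and the resource type list is only a small subset of the DataCite `resourceTypeGeneral` values. The ORCID iD shown is the sample iD that ORCID uses in its own documentation.

```python
import re

# Small, illustrative subset of DataCite resourceTypeGeneral values.
# The real controlled list is longer.
RESOURCE_TYPES = {
    "Collection", "Dataset", "DataPaper", "ConferenceProceeding",
    "Dissertation", "Image", "Text",
}

# ORCID iDs are 16 characters in four hyphen-separated groups;
# the last character may be the check character X.
ORCID_PATTERN = re.compile(r"^\d{4}-\d{4}-\d{4}-\d{3}[\dX]$")


def validate_record(record: dict) -> list:
    """Return a list of problems found in a (hypothetical) metadata record."""
    problems = []
    if record.get("resource_type") not in RESOURCE_TYPES:
        problems.append("resource_type is not in the controlled list")
    # Simplified check: a two- or three-letter lowercase language code.
    if not re.fullmatch(r"[a-z]{2,3}", record.get("language", "")):
        problems.append("language is not a 2-3 letter code")
    for person in record.get("contributors", []):
        if not ORCID_PATTERN.match(person.get("orcid", "")):
            problems.append("bad ORCID iD for " + person.get("name", "?"))
    return problems


# Hypothetical record; the contributor name is a placeholder.
example = {
    "title": "Unit 4: Metadata",
    "resource_type": "Collection",
    "language": "en",
    "contributors": [{"name": "Example Author", "orcid": "0000-0002-1825-0097"}],
}

print(validate_record(example))
```

The point of the sketch is only that a value drawn from a shared list or a well-formed identifier can be checked and matched by another system, which free text cannot.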
So I'm gonna switch over here. This is a project that I have on OSF. It is the materials for a course that I teach on digital libraries at PennWest Clarion University. So this is the overall page for the class, and I'm gonna go into the lecture materials. So I have the lecture materials organized with a component for each unit in the course. So I'm going to, fittingly, go to the unit on metadata. So you can see right away, I have some metadata right here on the homepage. So I've got the title, myself as the contributor, Center for Open Science as my affiliation, date created. I have not created a DOI for this content, but I could. There's also another category here that is internal to OSF that I could use to categorize what this component is, a description, and a license. If we take a look at what the content is in the course, it's all kind of laid out in the Wiki. So I've got objectives, I've got a link to the lecture recording and slides, a list of readings, and a blog post that's some supplemental information that I include in the courseware. So if I wanted to add some additional metadata for this course, I can click on the new metadata tab at the top and come to a page that has most of the metadata that's already been created, but can now be edited, as well as some of the new fields. So one of the new areas is resource information. And this is where we describe the kind of material we're sharing in a human and machine readable format. So this allows a search engine or an index to index these terms and know exactly what type of materials and the language of materials that are being added here. So for this, I just click that edit button and I have a list here. As you can see, the list is pretty wide ranging. This comes directly from the DataCite schema. So there's some really specific items in here: data set, data paper, conference proceeding, dissertation. If the material doesn't conform to one of these really specific areas, there are also some generic terms. 
So image and text. And I'm gonna choose, for this, collection, since it's a number of materials. And then I can also choose a language. And you can see I can type it in to quickly find the language. These languages are taken from an ISO list, from the International Organization for Standardization. So it's all standardized vocabulary. And I can go ahead and save it. And now the index knows that this is a collection of materials and that it's English. Another new area is the funding and support information. So if this was a funded project, I can actually search here directly for the name of the funder. Whoops, I forgot to put national in there. And this is doing a live search of the Crossref Funder Registry. And it's probably a bit slow because I'm on Zoom today. Let me just start over. There we go. So I would find my organization from this list. For the purposes of the demonstration, I'll just pick the first one. And I also have fields here to add in specific free text information about the award, the award URI, the award number, et cetera. And if my research is funded by more than one organization, I can actually add another funder. And you can add as many of these funders as is appropriate, and it would help if I could spell. But you get the idea, the same thing again, and I could just keep adding funders until I add all of my information. Now, since this isn't actually funded research, I'm gonna not save that. I also have an affiliation here, because I work for Center for Open Science and we're affiliated, obviously, in the OSF. Since I don't teach this class for Center for Open Science though, I'm gonna actually remove that affiliation from this resource. So I have the choice of having it affiliated or not. Things like date created and modified I can't edit because those are being recorded as I make changes in the system. And finally, for some additional information, I can add keyword tags. So I'll add metadata to this. 
And if you remember in the resource types, there wasn't a specific category for a curriculum, but I'm gonna add curriculum now so that that is part of my metadata. So we have most of the metadata here on the same tab. The only things you can't edit here are things like the title, which you can edit on the homepage. Pardon me, one second, here we go. Okay, so I also want to show that in addition to things like registrations and projects and project components, we can actually drill down and edit metadata at the file level. So if I go and look at the files in this project, I currently don't actually have any files in the OSF storage for this component, but I do have some Google Drive files. So I can add metadata to both OSF materials and materials from my add-ons. So the metadata is actually stored in OSF, so it doesn't do anything to change your file in its add-on storage location, but it does add the metadata to the OSF representation for that document. So if I go ahead and open up, this is the blog post that I write for the unit on metadata for the class and it's in particular, it's about artificial information. So if I want to, I can now add a title, blog post for unit four, and for the description, I think I'll just add some terms here, but if I wanted to spend more time on that, I could create something that's more like a small paragraph. And again, I have my resource types here. And again, for this, I'm just gonna choose a generic one, text, and I will add English to it and save. And now we have some more specific metadata about this particular item. So you can see that I'm only adding the metadata for the things that are unique to this specific file. The other information that comes from the rest of the component, that this is from unit four on metadata, that it is from this digital libraries course and that it has this attribution or license, is all inherited into the file. So you don't have to re-input all of this information. 
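The inheritance just described, where a file only records what is unique to it and picks up everything else from its component, can be pictured with a small Python sketch. This is an illustration of the idea only, not how OSF stores or resolves metadata; the `effective_metadata` function, the field names, and the placeholder values are all assumptions made for the example.

```python
def effective_metadata(component: dict, file_fields: dict) -> dict:
    """Combine component-level metadata with file-level fields.

    File-level values win; empty file-level values are ignored so that
    the component's value shows through instead.
    """
    merged = dict(component)  # start from what the component provides
    for key, value in file_fields.items():
        if value not in (None, ""):
            merged[key] = value
    return merged


# Placeholder values standing in for the component shown in the demo.
component = {
    "title": "Unit 4: Metadata",
    "resource_type": "Collection",
    "language": "en",
    "license": "example-license",
}

# Only the fields that are unique to the file are filled in.
blog_post = {
    "title": "Blog post for unit four",
    "resource_type": "Text",
    "license": "",
}

print(effective_metadata(component, blog_post))
```

In this sketch the file ends up with its own title and resource type, while the language and license come from the component without being typed in again.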
You have the option of just adding the information that's truly unique to this item. So this has been kind of a quick demo of some of these new features. If you want to go in and start adding this information, which I highly encourage, we do have, in our OSF support, which is available everywhere in OSF from the header, some new guides with instructions and screenshots on how to add the new metadata and use the new metadata features. They are under account and security where the rest of our data management information is, and there's a new section on metadata. And we have four different guides right now: adding metadata to your project, to your registration, to a file, and then a page that describes in some more depth what each of those resource types are. If we take a look at one of them, you can see we've got step-by-step instructions, lots of screenshots with pictures and arrows showing you exactly how to do that. So I'm going to jump back over to the slides and I want to talk about a few more things before we have time for some questions or some more demo if anybody would like to see some other features. So that again is the help guides, which are available. So once you've gone in and created this metadata, I want to give you a preview of the kinds of immediate impacts you can see from this. So this is an example of an OSF project that was updated only three weeks ago, when these new metadata fields were introduced. So looking closely at the record, we can see that the contributor there, if we were to click through, has been associated with an ORCID iD. The various fields that use the DataCite vocabulary have been filled in, and the funder has been added, and that was added using that Crossref Funder Registry. So now if we go and search DataCite Commons for this particular project, what we find is that already that data has been populated in that record over at DataCite. 
So we can see the same thing, the DataCite fields, and if we were to click through, we would see the ORCID identification and the Crossref identification for that material. So within three weeks, that's already populated elsewhere outside of OSF and is findable in more places. Additionally, it also shows up on this contributor's ORCID record. So in the ORCID system, this is listed as the newest work there. So what that means is that the FAIR metadata that we're creating in OSF is making that data interoperable with other systems, and that's increasing the spread of our information, and it's also helping it to meet the requirements, not just of OSF, but of all of these other systems as well. So I wanna wrap up by talking a little bit about what we're gonna work on next related to metadata. So a lot of this project up to now involved creating backend infrastructure to enable all of these features to be put in place, and smoothing out and normalizing all of the metadata that's used across all of the different OSF objects. The next project we're gonna be working on, and indeed have already started, is improvements to our search interface to make it easier to find specific materials based on these new metadata fields. So that data is already there and is already searchable. You can input any keyword terms for different funders and find hits with those terms. If you're familiar with the Lucene search syntax, you can actually do advanced searches that will search only in those new metadata fields. But we're soon gonna release a new search page with actual filters and lists that will allow you to do this type of advanced searching without having to understand that Lucene syntax. We're also gonna start working on expansions to the metadata that's available to meet the needs of different communities. So right now the metadata in OSF is really discipline agnostic. It's fairly generic. 
It doesn't contain specific fields or information and terminology that's useful in specific communities or contexts. So we're gonna be starting a project later this year looking at some tools created by a group called CEDAR, which is the Center for Expanded Data Annotation and Retrieval at Stanford. And they allow for the creation of community-developed metadata templates that could be applied to OSF materials. So CEDAR tools allow communities to dictate exactly what metadata fields and terminology they wanna apply to their research objects. So adding these templates to the OSF will increase the richness of metadata tied to your work and will expand the findability and reusability of your work. So just to show some examples, CEDAR makes it easy to create templates like this, which was created for biological samples. You can see there's really specific terms in here like strain or organism name. On the other hand, this template, also created with CEDAR, gathers metadata about animal observation studies and has things like the organism name of the observed animal and the number of animals studied. So we've already started some work on gathering some requirements around this and we should be able to implement these tools by the end of the year. But in the meantime, I'll end with a request that you go out and start adding this data to your OSF projects, registrations and files, both the existing ones and brand new ones as you create them. Hopefully I've shown you that it's pretty easy to do, and that it is very beneficial to you as well. And you can start seeing the results of that work as soon as possible. So with that, I'll say thanks. Here are some various links and addresses for support or questions or more information. We definitely have plenty of time if there are any questions. And it looks like there are. Mark, should I just walk through them or do you want to? I can go ahead and ask them out loud. Okay, yes. 
So I'm just going in chronological order. Is the DOI capability within OSF only for components, or can individual files be given a DOI within OSF? At the moment, DOIs are only available for components, projects, registrations and preprints. DOIs for files are a possibility in the future, but we're not there yet. If we did, we would have to require certain pieces of metadata to allow you to mint DOIs for particular files. Thank you. Our next question is, can you explain what a registration is in OSF terms? I wonder, Mark, if I should ask you to explain this. Registrations, let's go to the OSF registries. So registrations are a format of data where you can document your research study to declare what your research questions are, what your methods are, what your materials are, what your hypothesis is, kind of tell the story of your study. And when you create the registration and you finalize it, it is immutable. You can come back in and you can make some minor changes that will be logged as a change that was made on X, Y, or Z date. But the registration allows you to go ahead and just publicly state, in a trustworthy way, so a way that you can't just come back and change as you do your research and find that it isn't meeting your expectations, that this is what you plan to do. So OSF registries are tied to OSF projects. So if you start with a registry, it'll immediately create a project space for you where you can actually do your collaborative work. But anything associated with the registry will be recorded in time as a public sort of declaration of what it is that you're doing. Mark, is that a good enough description? Is there anything you can add? You're the expert on registries. I think you knocked it out of the park without going into the details, thank you. Great. All right, our next question is about institutional affiliation and metadata. 
So if a user adds their institutional affiliation to the metadata for their project, but they were not logged in through their institution, would their project show up in the institutional project list? So the way that institutions work in OSF is a little bit complex. So your institution has to be a member of OSF, first of all, for that institutional affiliation to show up. So we have on OSF the list of institutions that are members. So if you're a member of one of these institutions and you sign into OSF at least one time using the institutional login, then from that point forward, all of your projects will by default be affiliated with your institution, and then you can choose to take that affiliation off or not. Once you log in through your institutional affiliation, if you already had, say, an existing OSF account through Gmail or something like that, you can merge those together, so that whether you log in through your Gmail or whether you log in through the institution, you'll still be all in the same account and that will all be affiliated with the institution. But the easiest way to make sure that it is, is to sign in using the institutional affiliation, which just goes out to your institution's single sign-on authentication service, if you have that. If you don't, it's by using the institutional email. And then once that account has multiple emails associated with it, you have the choice of how you log in. But once you're affiliated with the institution, that affiliation stays there by default, but you can remove it from anything that you'd wanna remove it from. So thank you, and thank you Lucy for that question. Is it possible to search OSF by metadata field? Yes, so we don't have a guide for this up yet. I have the text of it written as a specific, where am I, sorry, I can't see everything on my screen. I can't see, sorry, okay, there we go. So this guide should be coming out pretty soon. 
But if I wanted to look for, let's say, funder name, National Science, excuse me. Okay, hold on. I gotta put some quotation marks around that. There we go. So we have four items so far that have identified the funder as the National Science Foundation. So there is some specific syntax you can put in for a couple of those fields: funder name, resource type general, affiliated institutions. I've written a guide for this that has been shared with some of our members to have them kind of test it out. Sorry, I lost my speaker. Okay, and we'll be adding that to the guides pretty soon. So in the interim, if you want to go in and learn how to do the advanced searching syntax, then you can search by these fields already. But if not, you can still search by National Science Foundation. And I get a lot more, because with just the quotations it can be mentioned anywhere in the text. So this is probably picking up things that have that in that specific funder name field, but also have it in things like descriptions and in the description of a preprint and things like that. So the search updates, we've already started working on those. They should be out within a few months, and that will allow you to do this searching much, much more easily. Thank you. I am bouncing around now at this point because some of these things have themes. I'm gonna return back to your institutional affiliation question, which is, do you have any, well actually, I'm gonna change it up. I'm sorry, do you have any demo tips for metadata to link different parts of the project and their metadata? I'm not sure I totally understand the question. I would say one thing to keep in mind, like the thing that is nice about this is that if we're looking at a project, like I showed earlier, the metadata that I have here at the project level, if I go in and I look at the files, that information is immediately associated with all of these files to start with. 
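For readers who have not used Lucene-style syntax, the difference between the two searches in the answer above, a quoted phrase limited to one field versus a quoted phrase matched anywhere, can be sketched in Python. The general `field:"phrase"` form is standard Lucene query syntax, but the field name `funder_name` is a made-up placeholder: the actual field names that OSF search accepts are not given in this talk, so check the OSF guide once it is published.

```python
def phrase(text: str) -> str:
    """Quote a phrase for a Lucene-style query, escaping backslashes and quotes."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return '"' + escaped + '"'


def field_query(field: str, text: str) -> str:
    """Restrict a quoted phrase to one field, using the form field:"phrase"."""
    return field + ":" + phrase(text)


# "funder_name" is a hypothetical field name used only for illustration.
fielded = field_query("funder_name", "National Science Foundation")
anywhere = phrase("National Science Foundation")

print(fielded)   # funder_name:"National Science Foundation"
print(anywhere)  # "National Science Foundation"
```

The fielded form only matches records where the phrase appears in that one field, which is why it returns fewer hits than the bare quoted phrase, which can match titles, descriptions, and other text.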
So the nice thing about OSF and its structure is that if things are organized within projects or registries, if projects and registries are related to each other, it automatically builds for you that web of connections. And when records like these are exported out to systems like DataCite, or through the API or other kinds of sharing mechanisms, the links between those objects remain. So you'll get a record for the project and it will say, these are the child components, these are the files that are associated with it, these are the registrations that are associated with it. So when you start to add metadata to OSF materials and those OSF materials are linked to each other, then all of this stuff really starts to gel and become a kind of interconnected web of material about your research. Thank you. How can we implement controlled vocabulary terms? So for example, LC subject headings, but in particular, PIDs of these vocabularies within OSF? That's a great question. So right now we're not supporting vocabularies like Library of Congress. You can certainly add Library of Congress terms in the keywords, in the tags of projects. And you could add the identifiers in there as well, and they would be searchable data along with everything else. I think that's a great example of something that might be really useful in the future for a CEDAR-based template. So if we were to create a template, then for projects that do wanna use LC subject headings or another controlled vocabulary, I'm familiar with the Art & Architecture Thesaurus and cultural heritage ones, which may not be as relevant. But we could create the ability to create templates that would then use those terms and would sync with them. Right now, since this is metadata for all of OSF, we're not really requiring any specific subject scheme. So really keywords or tags is the way to do that at this point. Thank you. 
Speaking of CEDAR, when will the CEDAR tool be implemented? Well, that's still very much in development. We're in the early stages, but our goal is to really do this by the end of the year. That's as much as I know right now. We've started to do some interviews and gather some requirements to figure out how exactly it can be folded into OSF in the most effective way. And our final question is from Evan. He is a past PennWest Clarion digital libraries student. Oh. And he was asking, do you have any advice for, sorry, go ahead. Sorry, I know you Evan, hello. Do you have any advice for presenting and advocating for the use of these new metadata fields for information professionals working at scientific research institutions? That's a great question. Thank you for asking. Well, I think that probably the most effective, the most persuasive arguments are gonna be the things about the mandates that are gonna start requiring this information, that if you can go in and put in the actual Crossref Funder Registry number for NSF or NIH and your award title, then when NSF or NIH is checking to see that you did what you're supposed to do, they can find it right away because you've used that term, and we will be creating the way for them to search by that. I think another really persuasive, hopefully, way is showing, like I showed with the example, where metadata is created in OSF and then that metadata gets picked up into DataCite and gets picked up into ORCID, and that creating the data in one place populates it out into the web, into all of these other places. So I know that a lot of times to researchers, it can seem like an extra step. We do wanna do some improvements to the UI to make it a little bit more seamless of an experience, but I think showing them the proof of the good that it does is probably the best way to advocate for taking the time to create this information. All right, that was our last question. Great. 
Well, thank you, everybody. We appreciate your attending today. I'll show this again. If you have any questions, you can contact myself or if it's more general, our support at OSF website. There's other places where you can find out more about OSF and Center for Open Science. We encourage you to create an OSF account and we encourage you to go in and start adding metadata to all of your materials that you have in OSF. So I guess that's it. Thank you all for attending and we'll talk to you later. Bye.