Project coordination for several grant-funded projects; one of them is to recruit medical school students into the master's programs at the School of Information and Library Science. Among many other things, she holds a master's degree in library science from Florida State University. So now Peggy Schaeffer is almost ready to give us her talk and tell us more about the right of data to publish.
Do I need to hold this? Okay. I'm not sure. You can try it. Thanks. Thank you. No, I'll do this. This is fine. Thank you very much.
Thanks so much for letting me talk about Dryad. It's one of my favorite things to do. I want to tell you today a little bit about Dryad; about the joint data archiving policy, affectionately known as JDAP; how Dryad works with journals and publishers; how librarians can use Dryad; and some ideas for ways that librarians might support research data management in publication.
So first let me start. What is Dryad? Well, Dryad is a repository that holds data associated with scientific publications. These are mostly journal articles, but we also have data associated with thesis work and a book chapter and a book. This is, we recognize, a small piece of the research data life cycle that you've heard about. It initially had a scope in evolutionary biology and ecology, but we have now added many other fields. You'll find medical data, public health data, paleontology. Indeed, our scope now includes any scholarly publication. Yes, that does include the humanities. So humanities data does have a place to go. We also have recently added something from computational linguistics. So it's a very diverse data repository. Let me just read the content policy. It's only about a sentence. "Dryad is an international repository of data underlying peer-reviewed scientific and medical literature, particularly data for which no specialized repository exists." So we are aiming toward the long tail of research data.
And what makes Dryad unique, furthermore, is that all the material in Dryad is associated with a scholarly publication. You cannot submit your data to Dryad unless you have an accepted or an in-review publication to associate it with. So we preserve in one place all the data supporting the conclusions of the article, all the data necessary to replicate the findings in the article. We also have, as I said, a very diverse body of data. For example, we have data on the amount of force used by a shark to bite different kinds of prey. And perhaps my favorite data set is about how people perceive simulated monsters. So if you're able to get on the Dryad website and type "monsters" into the search box, you'll see what I mean. We welcome data in any format. It makes it easier, of course, to take all formats if you want to preserve all the data underlying a single article. We would, of course, prefer non-proprietary formats. And yes, we do take PDFs, although I didn't include it in the word cloud. We would rather authors did not submit data as a PDF, and we do encourage them to use reusable formats. So we've got movies. We've got movies of battling fiddler crabs. And we've got animated 3D dinosaurs and sound files like bird calls. And yes, we have software if it doesn't have a better home somewhere else. All the data in Dryad is released under the CC0 waiver. I'm not going to talk about CC0, but that's a way in which you can make the data maximally reusable for humans and machines. There are no barriers and no restrictions on its reuse.
So let me tell you a little about the joint data archiving policy. This was a policy that emerged along with Dryad in about 2009 from discussions among a few journal editors, primarily in ecology and evolutionary biology, who were acutely aware of the data that was being lost as articles were being published without the underlying data being saved.
So they wrote this policy in about 2009 and coordinated its adoption in 2010 and 2011. And many journals have signed on to this now and continue to do so. Excuse me. We want to support replicable science: the data supporting all the results in the paper, but not all of your data. We do, as you see, allow for embargoes and exceptions for sensitive data. This policy can serve as a model for other disciplines, and other journals are welcome to adopt or adapt JDAP.
So let me just show you quickly our new home page. If you haven't looked at Dryad in the last few months, take another look. We have redesigned the web page. These stats are from late April. There are now 223 journals with data in Dryad and over 12,000 authors who've deposited to Dryad. We have 32 integrated journal partners. These are journals who work with us behind the scenes as they are processing the manuscripts under review. They tell us about their manuscript, and we prepare and pre-populate a record in Dryad to make the author's submission of data easier. The author doesn't then have to retype the article and author names.
So here's what a data package looks like in Dryad. This is the landing page for an article from the Journal of Ecology and Evolution. We display the cover for our partner journals, and clicking on that cover allows the user to go directly to the article on the journal website. This lists all the files associated with an article. In this case, there are two. This is an article from August 2012, and at the time, if you can see that, one file had been downloaded 61 times and another file 33 times. You can also see that the top file has a readme file. The author has provided some information, perhaps some context about how the data were collected or some instructions for its reuse, in a separate file. This is another data file in Dryad. This one's from 2009 and it has been downloaded over 4,000 times.
We think some of this reuse is from use of the data in the classroom. We think that people are assigning students to work with this data. Journals benefit when data is reused. And here are some recent stats from about a month ago showing, for the journals with the most data packages in Dryad (these are all among our integrated partner journals), how many downloads and how much use those data files have gotten.
So what makes Dryad unique? Here are a few things. It's the close association of data deposition with the process and business of scholarly publishing, and using article publishing as a model for how researchers can benefit from data sharing. Also, the article metadata becomes the foundation of the metadata for the associated data. Data files in Dryad are curated. I'm sure you'll be glad to know they are curated by real-life librarians and information science students at the University of North Carolina at Chapel Hill. So we are an open data, open source, and open organization. Dryad is also a nonprofit organization. We are a new nonprofit organization comprised of members. Members can be journals, publishers, libraries, data producers, funders, et cetera. And there's more information on the website.
So here are some ideas about how librarians can use Dryad. I offer you Dryad to use as you wish. You can, of course, ask scientists about their data. I think this is a natural first step. And don't wait for them to ask. Ask them: where is it? What are your plans for preserving it and making it available? And make sure they know: your data is important, and it deserves to be preserved. I'm not sure everyone always knows that. And you can suggest that they archive their data in a public repository like Dryad. You can use Dryad to demo how many times data in their field, say from some other article in a journal that's familiar to them, has been downloaded.
Very often our authors are surprised when they deposit data in Dryad and come back a few weeks later and they think, who are those 65 people who downloaded my data set? I didn't think anyone was interested. So I think it's an interesting experience for people to be able to just look at those numbers.
And here are some other scenarios that I think could be useful. We think younger scientists can gain additional visibility for their work by making their data files available and citable, and by adding, for example, a section to their CVs for their data publications. Another scenario we're familiar with is when a research team, perhaps one that's been in existence for a while, decides to archive all of its data in Dryad. We had one case of this when over the weekend we suddenly discovered one author had archived 17 data files associated with 17 articles. And it turned out it was the work of a multinational team of people who've been working for many years on tunicates. These are small, sponge-like animals. And they decided, well, let's just put all the data in Dryad. And then when anyone asks us for it, we can just point them to Dryad, and we don't need to be bothered with those requests any longer.
And here's another idea. You can deposit your data in Dryad. Yes, we have information science data in Dryad. Some of you might be familiar with Heather Piwowar and the work she's done on data citations and the impact of them on article citations. And Heather Piwowar's data is in Dryad. Data in Dryad can be deposited for free at the moment. In September, we will start charging. We'll be implementing some modest data deposit fees in order to maintain ourselves as a sustainable organization. And there's more information about this on the website.
We also recommend that you talk to authors about the advantages of data sharing. These may not always be obvious. So let me say this is what we say to authors.
We say visibility: making your data available online and linking it back to the publication provides a new pathway for others to learn about your work. And there's a correlation between the availability of a data set linked to an article and that article's citation rate. Making it citable: when you deposit in Dryad, as in other repositories, you receive a DOI. You might need to explain to researchers why a DOI is better than a URL for pointing people to their data. It can be reused. Others can cite you and it will gain you academic credit. We mention workload reduction. If you put it in Dryad, you can simply direct requests for the data to Dryad. We will preserve it. Your data files will be permanently and safely archived in perpetuity. We'll keep your files intact. We migrate them to new formats as old formats become obsolete, so you won't have to worry about whether Excel 2003 files will open in Excel 2023. And impact; of course, everyone wants impact. You will garner citations through the reuse of your data, and you can monitor that through Dryad's statistics. You could also gain opportunities for new collaborations.
And here are some other ideas on ways to support scientific data management. One is to consult and share best practice guidelines. And here are two that we link to in Dryad. There may be others that I'm not aware of, or maybe you feel that there needs to be a set of data best practice guidelines for the data in your field. I would encourage you then to create one and let us know about it. Another idea that I feel is important is to use and promote good data citations with DOIs. As we all are aware, I think, data citation conventions are evolving. But I think authors, journals and publishers all need to see good models of data citation. These can be in articles, in CVs, in grant proposals. So let's all help to make data citation familiar. Many journals don't have policies yet about how to cite data. So we're in early days yet.
And if an author is depositing data in Dryad or another repository and wants to refer to it in their article, it's often up to the author to make sure that that DOI is included in the journal article. We also encourage people to consider using DOIs in social media. With more and more CV-type information online, in LinkedIn or on Twitter, and as DOIs become more familiar, I think that practice will improve.
Here's Dryad's citation philosophy. We urge people to cite both the article and the data, but to cite the data at the data package level. The data package is all the files pertinent to a certain article. So here's another example, one I showed you earlier. This is also noteworthy. This is the one that was downloaded over 4,000 times. But perhaps some of you can see that the list of authors is different for the article than for the data set. And this is at the authors' request, to reflect the differing contributions of people to the creation of the data. So perhaps, and this is another incentive, an independent data citation can provide a reward for members of collaborations who want to get primary credit for their contributions to the data.
So this is how we recommend that authors and journals cite data in Dryad, but are they doing so? There are two cases here, of course. There's the case where the data you cite is that underlying the present article, and there's the case where you cite data that you have reused. We've looked recently. Well, here's an example that's not so good. I don't like to point fingers, but this is buried in the references and acknowledgments with no DOI or URL. I very much doubt that anyone reading this could find the data associated with this article. This is an old example, to be fair, from a few years ago. We recently studied citations to data in the data-sharing article, for 338 articles from our partner journals that had been published in the last year.
So the good news is, at least 75% of them did include a data citation with a DOI, and they were all over the article: some in a special section, some in the article header, some in the methods or acknowledgments, and some in the references. So there's no consistency, furthermore, between journals in how they cite the data. Standards are still evolving and journals aren't sure what to do with these data citations, so I say watch this space.
Some more ideas for ways that libraries and librarians can be involved here would be scientific data positions at academic libraries. We're seeing this some in the States, where people are getting titles such as research data coordinator in academic libraries. There are scholarly communications officers, who are often librarians, and all kinds of collaborative roles where librarians' existing collaborations can be a starting point for data preservation and sharing work. And I feel these are natural roles for us as librarians. We're already involved with researchers at the right time and in the right place. We have the expertise and we have the opportunity. I think we should use it. And new opportunities are arising all the time in emerging roles, and I'll be interested to hear from anyone, now or later, what other ideas you might have about emerging roles for librarians here. And someone said last week at the data publication meetings in Oxford, "Libraries are old, but data sharing is not." I thought I agreed with that at first, but now I'm not so sure, actually. So that's what I had to say about Dryad. I welcome your inquiries: get in touch by any of these methods, follow us on Twitter, and I look forward to hearing from you. Thanks.
Thank you. Are there any questions?
My name is Lukas Koster, University of Amsterdam Library. I have a couple of questions. I'm not sure if that's allowed. So this is very interesting. I've seen the links between the datasets and articles, and vice versa as well. But I have a question.
I work for a library, but I'm not a librarian. Also, I'm not really a researcher. I can imagine that an article is based on a dataset or a set of datasets, but also that other articles can be based on the same dataset. Do you have a provision for that in Dryad, without having to duplicate the dataset deposit?
I think that with the way our process works, an author who was reusing a dataset a second time, say they published once from it and then they come back and they've got some more findings, would probably re-deposit that dataset. It might be a little different, and we do enable versioning. We freeze the dataset. So say you deposited it this year, and then next year you worked more and you would deposit the data again. It might have new data, new fields, new columns, whatever. That would be a different dataset to us.
But it could be exactly the same dataset that has been used as a source for another member of the project team to write another article.
Right. And in our metadata we would be able to refer to that, and this is also true if the dataset lives in an institutional repository, as is sometimes the case. We would make sure that we would be able to identify this as a unique item. So there wouldn't be confusion: is this the same as that?
I'm always surprised that there is this direct link, in this whole research and data management world, between an article and the data. Because in my mind, there is a research project which has a number of people and organizations involved, and there are a number of inputs and outputs. Are you thinking of adding links to project information, if it's available, for instance, in local research information systems, which then could give you the link to everything associated with that? The data sets, every other output.
I'm not sure that we would incorporate that, but that sounds like the perfect thing to put in a readme file.
So if somebody in the readme file says this data was collected as part of a long-running collaboration, you may wish to (I've seen this before) consult our website or contact Joe Schmoe if you need more information.
Okay, that's fine if you're looking at a human.
Right.
But I work with library systems, making available articles, metadata about articles, links to the full text, and I'm currently involved in a kind of pilot to see if I can link articles, using identifiers if there are any, to project research information, data sets, et cetera. So in our business we actually need the machine-readable metadata also from the articles.
Right. So here's an idea. On our website, under the contact us page, we have an ideas forum where we welcome these suggestions. You can put your idea there and other people can vote for it, or you might find some other ideas that you'd like to support. So I look forward to hearing from you.
Thanks. Time for more questions. Are there any more?
Thanks, thanks, Peggy. I have a question about this: you said you're going to start charging. How did you develop this policy, what will this mean for libraries, and where will the funds come from, project funding?
Right. Exactly. Well, that is the big question, I have to say. We've worked since 2009 to create a sustainable business plan for Dryad, because no one thinks that a funding organization like the National Science Foundation is going to fund an operation like this in perpetuity. To be stable, we knew from the outset we needed to solve this problem of sustainability in a business plan. So back in 2010, we authorized two different consulting firms. We took the results of their work. We have a board who worked through that. We came up with a tentative business plan. Very early on, it ruled out some things. For example, we're not going to charge for data use. Access will always be free. We need to charge at the point of ingest.
That's when the costs are incurred, and so on. And we made calculations on the scale: how much will we need to ingest and curate to be sustainable at various different fees? So this has been in the works for several years, and there's actually much more detail I could give you about this. But maybe what's most important is to know how it is going to work. So it's going to work starting in September. There'll be deposit fees. They can be paid by any organization. A university, for example, could buy vouchers. They could buy a thousand vouchers. They could buy 50 vouchers. And they could give those vouchers out to researchers, which would enable them to deposit data for free. Funding organizations can buy vouchers or can support the use of grant funds to pay the deposit fees. There are waivers for researchers who come from less developed countries. And there's also a subscription plan for journals and publishers. If none of those plans are pertinent at the point of deposit, then the author will pay an $80 fee. That's $80 per data package at the point of depositing the data.
Thank you. Okay. Moving on. We're coming to the last talk of this morning.
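The deposit-fee rules described in the last answer can be sketched as a small decision function. This is only an illustration of the logic as stated in the talk, not Dryad's actual billing system: the `Deposit` class, its field names, and the `author_fee_usd` function are hypothetical, and the only figure taken from the talk is the $80 fee per data package.

```python
from dataclasses import dataclass

# Fee per data package at the point of deposit, as stated in the talk.
AUTHOR_FEE_USD = 80


@dataclass
class Deposit:
    """Hypothetical description of who, if anyone, covers a deposit."""
    has_voucher: bool = False          # voucher bought by a university or funder
    journal_subscribes: bool = False   # journal or publisher subscription plan
    waiver_eligible: bool = False      # researcher from a less developed country


def author_fee_usd(deposit: Deposit) -> int:
    """Return what the author pays for one data package.

    Assumption: any one of the plans mentioned in the talk fully covers
    the deposit; if none applies, the author pays the standard fee.
    """
    covered = (
        deposit.has_voucher
        or deposit.journal_subscribes
        or deposit.waiver_eligible
    )
    return 0 if covered else AUTHOR_FEE_USD


if __name__ == "__main__":
    print(author_fee_usd(Deposit()))                  # no plan applies
    print(author_fee_usd(Deposit(has_voucher=True)))  # covered by a voucher
```

Paying the fee from grant funds, which the talk also mentions, is not modelled separately here: in that case the $80 is still charged, only the source of the money differs.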