...Goldenberg-Hart from the Coalition for Networked Information, CNI, and I'm really delighted that you chose to spend some time out of your day with us here today. This webinar is part of CNI's Spring 2020 Virtual Membership Meeting, and I'm really delighted to welcome our speakers today for this really interesting talk. We're going to hear about a partnership designed to support a more seamless workflow, one that will hopefully make the process easier for our researchers in terms of things like data curation, publishing, data hosting, etc. The title of today's webinar is Innovative Models in Data Publishing: An Update from CDL, Dryad, and Zenodo. The presentation will be given by Daniella Lowenberg of the University of California, and along with us here today is Alex Ioannidis, who will be here to field any questions relating to Zenodo. He comes to us from CERN, and I understand it's very late for him, so thank you so much, Alex, for being up with us during the wee hours.

Before I hand it over to Daniella, I just want to draw your attention to a couple of details about this webinar environment. First off, if you look at the bottom of the screen, there's a little Q&A box. If you click on that, a box will pop up, and you can type your questions or comments in that box at any time, though we will field questions at the end of the webinar after the presentation has finished, and I'll come back and moderate those questions. I also want to call your attention to the chat box, again a little button at the bottom of your screen. We'll be chatting out some links and other information throughout the presentation, and if you want to use that chat box, you should feel free to do so as well. I'll be monitoring it throughout the Q&A to make sure that we address any questions or comments that come in through the chat box. So without further ado, thank you again for coming. Thanks again to our speakers, and Daniella, over to you.

Great.
Thanks so much. Hi, everyone. For those who I don't know (I think I know some of you, I can see participants), my name is Daniella Lowenberg and I am the Product Manager for Dryad, but I'm based at California Digital Library. I focus on data publishing and data metrics in a couple of different realms, and I preface that because I'm going to jump between a couple of my roles here, focusing on UC, then focusing on Dryad, and back and forth. And so, yeah, we have Alex here online today who's going to help answer questions, and let's kick it off.

So, taking a step back and looking at UC and data publishing over the last decade or so, it's important to look at what is happening at UC in terms of research. We all know that the University of California is a massive institution, and here are some numbers to show how big it is and how much research data we should expect to see. And so, starting around 2012, we had a data publishing platform that was aimed at being the infrastructure to support the amount of research data that we expect to come out of UC. That was a publishing platform called Dash. We had four FTE working on this, but we were also taking care of server maintenance and storage costs, and on top of that, we were going back to all 10 UC campuses and saying that the preservation cost was going to be a recharge for each campus, based on which campuses were depositing and what that looked like. And so, we were putting a lot of resources into this, and we kept trying to think of ways to see more adoption. But all the while, we were seeing a whole bunch of deposits from UC researchers going to general repositories, to disciplinary repositories, and elsewhere. Every time I would ask a researcher, hey, have you heard of Dash? Do you want to deposit here? It's something we have at UC.
They would say, oh, I actually go to Dryad, Zenodo, or Figshare, other places. And so, a couple of years ago after CNI, I wrote a blog post outlining why I thought there was a problem with this. Really, I think the problem was that we were so focused internally on issues like the infrastructure, the cost, and the recharge that we weren't focused on what researchers' needs are.

There are four points to why this really inhibited us. One is brand recognition. Researchers deposit their data where their funders and their publishers are telling them to go, and an institutionally focused repository is not something that's on those recommended repository lists. The second, again, comes back to that recharge. We didn't really have a strong library community advocating for the platform, because the feeling was that we want deposits, but it's going to be tough to cover that cost for preservation. The third point is that as libraries and institutions, we really care about curation. But for a UC-specific platform, we didn't really have curation at hand that we could apply to these data sets, so it was really just a free self-deposit system. And the fourth, and maybe the biggest, is that we didn't have connections to the larger ecosystem. We know that researchers publish when it's very seamless for them. And so, we went to publishers, or to Jupyter notebooks, and said, what if we build this integration where we could publish and reuse the data? And the response was always: UC is big, but UC is not that big. We can't do it for every institution.

And so, around 2018, we thought, how do we go forward from here? Well, first, we could continue with the same approach, but we really have no reason to believe that deposits are going to go up.
We could look at other open technology projects that a lot of us are involved in, other platforms, but there would still be a cost and still no real change in deposits if we're still just investing in a library community. The third would be to look at a vended solution, which usually comes in a bundle with other things that are being pitched to our campuses. But those closed solutions with a high cost actually conflict with our values about open science and those priorities. And so, we thought, how can we think strategically and better align with researchers?

That's what brought us to partner with Dryad in 2018. We know that Dryad is researcher-led, started by and supported by researchers over the last 10 years. And so, we thought this may be our strongest alignment and how we can support both UC and researchers globally, because that's how researchers are thinking. In September of last year, we relaunched Dryad onto a platform that we've co-developed between Dryad and CDL. I don't raise this because it's a technology partnership. It's not. It's a community partnership. The only reason to bring up the technology is that there were a lot of features that we needed to be able to put in, working in an agile environment, so that other communities, institutions, publishers, and funders could better work with Dryad. We're really excited about that, and it's actually what has allowed us to take on the Zenodo work that we're going to talk about in a bit.

And so, in the first four months of 2020, we have already seen more data sets published from UC researchers than we had in Dash over the six years that Dash was really invested in. We've seen more than 600 deposits in four months, which means that our alignment here is actually serving researchers better without increasing our resourcing. And that ROI is really that we don't have any new FTE that are required for it.
We're not covering those storage costs. Of course, I am product manager for Dryad at UC, but we're not putting more effort into this. And what we're seeing in return is higher deposits and higher alignment with the research community we're able to support. I've looked at tons of examples. When we looked at the researchers depositing from UC campuses, the co-authors are almost always from another institution, either in the US or globally. And so, it shows that we have to be thinking much bigger than our own institution to support researchers.

Dryad was always supported by publishers and, early on, by societies. But what's really allowed us to do this, and what we're really proud of, is that in the last year, since we launched this new membership model, we've seen 27 new institutions join the Dryad community. And we see not only the ones highlighted here, like big state institutions, Ivy Leagues, and large private schools, but also small liberal arts colleges and others that really just want an easy solution for being able to connect in. And that's really what we're looking at. We're not trying to think about this as a one-stop solution. We don't believe there ever will be one. It's more that we're able to make these connections, and they're rooted in a couple of principles that I want to go through quickly.

The first is that this community is about supporting researchers. Right now we know that we need to evolve with a lot of needs that are arising in the current pandemic. And so, it's great to have this platform, Dryad, at the ready, already connected to the publishers, the preprints, and the funders who are supporting the rest of the outputs of this COVID research. It really lowers that barrier and just allows the institutions to connect at that point. The second is that we're rooted in aligning with researchers.
Every day we get tons of tweets from researchers who are super proud of their work in Dryad and of how people are reusing their work in Dryad. And so, it lowers that barrier of how much we have to advocate and teach for people to go here, because it's already something that's rooted within the community, within researchers. But the important part is that at this point we can maintain and leverage library values. We're working with Columbia, for example, on how Columbia can expose anything that's happening in Dryad in their Academic Commons. This way they can capture things from their researchers without having to put in the extra effort of building and maintaining a totally different system for data.

And I think one of the biggest parts, which, like I said, was a challenge earlier, is just being able to amplify the voice of institutional values in the larger community. We have funder members like Chan Zuckerberg, and we can connect them to institutions. And because of the scale of Dryad, we're often invited to the NIH, to NSF, and to other funders to speak. That is a point where Dryad can actually be the channel for voicing a lot of the things that we're talking about here at CNI within our own institutions, and a lot of those principles where, as one-offs, we're not always invited to those conversations.

And so, thinking about all of this at scale, how do we better support researchers and look beyond data? We know that software is a huge piece of this. In 2019, we were really excited to partner with Zenodo. For those of you who aren't familiar, Zenodo is one of the largest general repositories. They're based at CERN, and we have Alex on the call, who is the service manager for Zenodo. They have invested heavily in integrations with GitHub, in supporting software citation and leading the way in that realm, and also in just being an open source space for taking all types of outputs.
And they're hugely popular in Europe, especially within the communities there. I bring this up again in terms of being a channel, because at Dryad's scale, we're able to make these like-minded partnerships and then grow that community that we're building up here. Before the pandemic happened, we were lucky enough earlier this year, with Sloan Foundation funding, to bring our development teams together and really brainstorm together, Zenodo and Dryad. How can we bring value to non-traditional research outputs, leveraging curation at Dryad and leveraging software citation at Zenodo, without increasing the burden for researchers? We already know it's really difficult for them to choose: where should I go? Do I just choose alphabetically from the top of a repository list?

And so, what we've come up with, and what we're building towards right now, is a triage where we can make sure that data is getting curated and software is getting its proper citation. The most important part is that because we're driven by our institutional and researcher communities, we can make sure that we're putting our best practices forward. It's really an education opportunity for researchers, where we're guiding them along the way.

One way to look at this is from the journal publishing perspective. We know that research data does not need to be related to an article, but that is the most common case that we see. When you go to publish your article, you're asked for your code and for supplementary information. You're also asked about a data availability statement. And so, because Dryad has relationships with so many publishers, and because, we are very excited to say, we'll be releasing an integration with Editorial Manager, which houses about 10,000 journals, we can use this as a triage point to make sure that code and supplementary information are actually going to Zenodo, and that data can get curated at Dryad.
And together, we can bundle up those identifiers, relate those works together, and make sure that we're optimizing discovery for both the metadata and the files, while teaching researchers along the way about why the right license for software is important, why it's important that data is curated, et cetera. What we're really thinking about is that if we want to achieve best practice and see mass adoption of open data publishing and open software publishing, infused with institutional values, we have to be thinking at the global scale, because that's how research works. And part of that is making things as seamless as possible.

I put together this illustration of it, but I don't want to get stuck on the idea of Dryad and Zenodo, because there are so many players that we can imagine coming in here and being a part of this ecosystem. I put this together more to illustrate how we can leverage each other and interoperate to make sure that we are very much connected to researchers, but very much rooted in institutions. That means making sure that institutional repositories and domain repositories are not seen as competitive with general solutions, but rather as complementary. It means making sure that things coming into Dryad that should go to a disciplinary home are going there, and that we're not housing those, for metadata reasons and others. And it means making sure that institutions that are connected in are able to curate the deposits for their institution if they would like to, or that we're sending copies to their IR, and just building this more connected ecosystem that's rooted in how we can actually reach the researcher.

And so, I've just run through a whole bunch of information. I wanted to do that quickly so we could have time for you to ask Alex and me any questions you may have. And I also realize that we can't see each other's names, etc.
If you have questions and want to follow up after, my email address is my first name, last name, at ucop.edu, and my Twitter information is here. So, any questions that you may have.

That was great, Daniella. Thank you. Thanks so much for sharing all of that information about the history behind Dryad and CDL and now this new partnership with Zenodo. Really interesting. So I see we already have a question, and I just want to remind everyone to please type your questions into the Q&A box, as Sheila Rehman has done. Sheila's question is: can you talk about how ORCID plays into the Dryad ecosystem?

Yes, I would be happy to. Dryad was the first repository to require an ORCID. As of our relaunch in September, every single lead author is required to have an ORCID to log in. We use single sign-on for institutional members as a second step past ORCID. But we think it's really important that we have identifiers for everything and that they are standards-based, ROR for institutions, for example. And we know that Zenodo is working to include ORCIDs for all authors as well. So when we pass that information back and forth, we can use those verified ORCIDs as well.

Great. Terrific question, Sheila. Thanks so much. Another indication of how these pieces all fit together and complement each other. So once again, if you have any other questions, please type them into the Q&A or feel free to type them into the chat box. Sheila brought up a really interesting point about identifiers. And Daniella, you were also talking about other pieces that could fit into this ecosystem, other repositories. I just wonder, what other things can you think of that might make potentially good partnerships down the road? If you're at liberty to talk about it. Or maybe speculate, I don't know.

Yeah, that was a good question. And I'll open it to Alex as well if he has thoughts after. We're trying to really focus on reuse.
So I know I just spent a lot of time talking about submission, integrating with publishers, and making sure it's a seamless workflow to deposit. But there's a really big reason why we're doing this. Why do we even care about parsing out software from data? It's because we want the right metadata and the right licensing, and we want to make sure things are reusable. So a lot of the partnerships that we're looking at right now are about how we can work with the reuse communities, especially those in R and Python, and with people working on electronic lab notebooks. How can we make more of these touchpoints where the focus is not just on submission, but on use? Alex, do you have anything you want to add to that?

Another thing is that I believe software really hasn't been represented, hasn't been a first-class citizen, in the research world. It's not only the software that usually gets published along with, let's say, an article, but also, for example, its dependencies. There's a big graph of research that usually doesn't receive credit. And we're also working very hard to feed into this graph and then to allow people to cite things using the UIs in the proper way.

Interesting. Thank you. Thanks so much. I'd also like to invite our participants to join the conversation live. If you have a comment to make or a question to ask, raise your hand and I can unmute you, and you can share your thoughts, your experiences, or any of your questions with us live, if that's something you would like to do. I have to say I was really struck by the statistic that you shared about the uptick in usage after you launched the system. This, of course, is prior to the Zenodo integration, but was there a lot of outreach that had to be done in order to get there, or was it already so well known that it was an automatic uptake?
I think it was actually very similar numbers to what it was before we launched. It's just that Dryad was known within a lot of the research communities. What's been more interesting, I think, is that a lot of people associate Dryad with earth sciences and ecology. You know, it has a tree as the logo, and that's what we were rooted in. But we're seeing a lot more biomed. That could be in light of the pandemic, or it could be in light of other things. For instance, UCSF has put more effort into doing outreach, and we're seeing a bunch of UCSF deposits, and Dryad really wasn't a community for them before. I don't know if there are any UC librarians on the call who could also weigh in.

Anybody? Interesting. Okay. That's great. Thanks. I just wanted to take this opportunity while we're still waiting for any other questions that might be out there. We have plenty of time for questions, so please feel free to type those in. I also wanted to share with all of the attendees on the call today the URL for a report that CNI put out just last night: What Happens to the Continuity and Future of the Research Enterprise? This is a report on a series of executive roundtables that we held during this spring 2020 meeting, having to do with the crisis and our concerns about whether, and what kind of, attention was being paid to the research enterprise in higher ed. So if you haven't had a chance to check out that report, please click on that link, have a look, and let us know what you think. We hope it will be helpful.

Daniella, Alex, I really want to thank you for joining us and sharing with us a little bit about this new chapter in your story. We look forward to hearing more updates about the project. Thanks so much and take care. Bye.