I'm Daniella Lowenberg. I'm at the California Digital Library, and I'm also working as a product manager for Dryad.

My name is John Chodacki. I also work at California Digital Library, and we are in one of the programs there at CDL that we call UC3, the University of California Curation Center, which is focused on data and preservation topics.

I know in the book it says that we were talking about CDL and Dryad today and the partnership that we've entered. We wanted to bring it back and talk about the story of why CDL even entered into this partnership, and where we've been in the last year in regards to data publishing and our values there. So we want to start off by talking a little bit about a self-assessment that we've been doing in the last year.

So last year at CNI I was here talking about data publishing and adoption, and where the adoption was, because we're all spending time talking about this but we weren't really seeing adoption at the institutional level. Is that something that you guys are involved in? How many of you are involved in data services specifically at the institution? A lot. Yeah. How many people were here last year at CNI? Did any of you see the talk that we did last year?
Yeah. So this was specifically looking at what the researcher needs are, and whether we were aligned or misaligned with them in what we're talking about here in the CNI community and in the larger research stakeholder community. And we learned a couple of things. One is that the tools we're talking about a lot aren't really researcher-centric, so we're not really using the researcher as the main point when we're building out these tools. Sometimes when we're thinking about this community, we're thinking about what are best practices for us, but that can be misaligned with what a researcher's values actually are. Another point is that when we're focusing on tools, and talking about these technology communities and which ones we align with, sometimes that distracts us from actually talking to researchers, figuring out what their needs are, and then driving adoption and figuring out what makes sense for them.

But a big one that came out of this is that we ourselves had a data publishing platform at UC called Dash, and that was for all of the University of California. When I would go talk to researchers there and say, "You should deposit to Dash, we have all these cool features, you should do it," they would always say, "Oh, well, I'm just going to go to figshare or Dryad, is that okay?" And my response was always yes, because I just want you to publish your data. What we recognized is that researchers are not driven by institutional policies, especially because a lot of institutions don't have data policies. Researchers think at the domain level: what are my collaborators doing? Not "I'm going to publish my data in this UC place"; my collaborator in Germany isn't going to be involved in that. So the biggest takeaway we found is that we have to meet researchers where they're at, within their workflows. And the biggest one we know right now is that researchers are driven by article publishing, so we tried to go to all of the big journals, saying
"integrate with Dash; UC is a massive institution," so that we could have it within the workflows. And that's not sustainable, right? Publishers aren't going to think specifically at the institutional level either.

So we wanted to look specifically at our own success metrics. We were thinking: okay, we've been working on Dash for a while, we spend a lot of resources talking about data publishing at the UC campuses, so let's look at a couple of success metrics.

One, number of deposits. Over maybe the five years that we had Dash going, we had about 500 deposits, which in the scheme of things is nothing compared to what's coming out of UC.

Is that a problem that you guys... does anyone have more than 500 at their institution? How many people who run data repositories have over 500 deposits? This is, you know, the little secret of all the resources we're putting into these projects: the number, and I mean we know there's quality over quantity, of course, but the scale of what we're getting when it comes to adoption is very, very small at all of our institutions.

So at UC we were saying that even with 500 deposits, that's just not enough. It's still just a drop in the bucket. So then we said, okay, not deposits; maybe it starts with awareness, and then we'll see deposits down the line. We go around to the campuses, we talk about this. Awareness is a problem, and it's not for lack of trying, because so many libraries were involved in supporting us. I'm looking at Elizabeth from Santa Cruz, whose staff had been trying for many years for adoption of this as well. It just comes back again to the fact that researchers were not looking at the institution. They already had their domain, or were already involved in figshare, or in another way.

Next, integration into workflows. Like I said, we went to publishers and said, let's do this. We went to Jupyter notebooks.
We said, let's get into that realm. And they said, that's a really great idea, and UC is a massive institution, but we're not going to do that at an institutional level. So at that point, if we can't make it easier for researchers, then we were saying we did not succeed.

And lastly, what kind of wraps it together is that we couldn't really come up with a good story for the value proposition at that point. Researchers were not adopting it; researchers didn't see that there was a clear value over going where their collaborators would go. We could put in new features, but we weren't getting any integrations that were going to significantly lower the barrier to entry. So all of this wasn't for lack of trying, but we just didn't feel like we were really successful, and we had to be honest about that.

All the while that we're doing this, commercialization has rapidly entered the research data space. Here are a couple of examples that you may be aware of: Springer Nature, figshare, Mendeley Data. Specifically, these costs are just not affordable. We know that Springer Nature now says three hundred forty dollars for a dataset just to go through a couple of manual checks of the dataset. And this is still maybe lower quality than what all of us are offering at our institutions, and still not even in line with institutional values. But we also know that researchers are going there. Researchers go where the publishers are telling them to go. And we know that all of these big services coming out are pretty unaffordable with library budgets.

Aside from all of that, we as a community, at the same time, right now, have a colleague talking about community infrastructure for articles and the commercialization in the article space. All of that is happening, and we're all heavily invested in that area. But if we don't focus on data, we're going to face the same problem for data that we're talking about for articles, and it's going to happen really soon.
So all of this is an issue that we had to take into consideration. We also wanted to think about the framing of this as a community issue, and about what is needed from our community in the libraries, within the researcher space, within the research data space, and really think about best practices. So how many people here have read the Principles for Open Scholarly Infrastructure article? Okay, yeah. It's a pretty influential article that came out by Jennifer Lin, Geoff Bilder, and Cameron Neylon, laying out some of the principles of good scholarly infrastructure projects, to showcase what would be needed for us as a community to be able to trust and have a good interaction with infrastructure projects, and know that they were keeping that community in mind.

Their premise is this three-legged stool idea of governance, sustainability, and insurance. Governance: making sure that community projects are thought of and architected by the community and are led by the community. They have very specific ideas around how we should be enforcing those types of governance models within the infrastructure projects that we support. Sustainability: we need to know where the money is coming from, and strive not just to find projects that are not-for-profit, or blanket terms like that, but to actually dig a little deeper and ask of the project or work that we are relying on within these scholarly infrastructure projects: how are they
going to be there long-term? Explain it to me, show me the numbers. And also, don't be scared of being able to make a little extra money for innovation investment. Don't force people to starve their projects. Allow for there to be sustainability and innovation within scholarly infrastructure projects. And then the idea around insurance: insurance that the project will be there, that the assets that they're working on with us are going to be there. These are all principles that are thought of to span all of the infrastructure projects within scholcomm. They are things that we in the library world already espouse and are already committed to, but it's a good evaluation of what kind of projects we as a community should be investing in, and what their responsibilities are to us.

Kind of separate from that, but in parallel to it, a group of people including myself got together and thought: okay, we're talking a lot about infrastructure as technology, but we're not talking a lot about the people. The challenge that we have in a lot of infrastructure projects or community projects that we get involved in is that we're not aligned at a people level. They're great projects, but they're just not our people. And so we started to say, well, what does that mean? What is this kind of generic term, who are our people? And so we started to sit down and say: what are the best principles for
individuals in this space? What kind of values would we want them to have, so that we can understand that we have shared values and a community can be based on those shared values? We came up with a set of six, and there's a book online, a supporter's guide, that you can read. But briefly: personally, we should be thinking about being as transparent as possible. We should make sure that we are using open infrastructures, and that as much as possible we practice what we preach. We have to think about how to be transparent not just when we're working outside of our spaces, but also within our workplaces. We have to be transparent with our colleagues and more transparent with each other about what we're doing. We have to bring other people in. It's fine to have public-private or cross-organizational partnerships, and we have to recognize and celebrate those differences. It's not just about everybody working with similarly focused people; we want to build a bigger tent, and we need to respect multiple solutions. But we have to make sure that when we're working on our projects, we're sticking to our scope and thinking about what it is we're doing. In our case, it's: how are we driving adoption of data publishing? We can't boil the ocean; we have to be able to stick to that scope. And then leveraging communal wisdom, while also encouraging healthy skepticism. Really, we're stronger when we collaborate, and these types of values are what we would say make good scholarly communications projects and open infrastructure projects possible.

So, our project here, the portfolio of projects that started with Dash and other consulting services:
It's really about effectively supporting the research community. That doesn't always mean that we need to build something, or create a new tool, or adopt another tool. It's really about looking at what the barriers to entry are. What are the barriers for researchers to publish their data and make it openly available? And really starting to focus more on that root problem, instead of focusing so much on whether or not someone's using the right kind of markup, or someone's got the right kind of database structures, within our library IT departments, right?

So we wanted to look outwards and ask: what are projects that are similarly aligned, have a similar ethos, are led by people within our community, and are focused on being as good as possible when it comes to best practices for open scholarship? What are places that are already growing adoption, and where are places that are meeting our researchers where they're at? The logical choice, and the place that we all know, is Dryad. We did a full evaluation of what we knew we needed to be successful within our data publishing projects at UC and at CDL, and we knew that Dryad was doing a better job than we were on UC campuses.

So how many people know Dryad? Right. Researchers from everybody in this room publish their data within Dryad. Every university's researchers are downloading datasets from Dryad. Dryad is a community-run organization. It's been run for ten years by researchers and with researchers. It's integrated into journal workflows, 450 journal workflows. It is currently... I'm not pushing forward on my slides, I apologize. Over 90,000 researchers have used Dryad. We're talking about adoption that's not in the hundreds or the tens. We're talking about thousands, right?
We're talking about being able to meet researchers where they're at, so that they actually have an incentive to use it and to publish data. I don't know if people saw this, but in May we made an announcement that we would join forces with Dryad to really help leverage their place within our community and the value that they're already giving us, in this room and in the wider community. They are already helping us with curating datasets. They're already helping us with making sure there are quality datasets published. They're already helping us with the awareness campaign. And they are already helping us with the adoption problems that we have in the library world and in the institutional world. We said at CDL, we're going to work with Dryad to try to showcase this to the community and make sure that the whole community understands and sees this value. And so since May we have been working very closely with Dryad.

And the researcher adoption of this was a major point for us, because Dryad wasn't necessarily going onto the campuses and working with the researchers, and yet still maybe 12% of what's in Dryad was from UC, and we got 500 within Dash. It's truly because of the way that they're set up, the way they're owned by the research community, that we ourselves just can't get embedded like someone that's already bought in like this. So that was a big driver: okay, the values align, but also it's just already something that our researchers want.

Right, so we have been working since May to launch a new Dryad that is moved onto a platform that can be more API-based, so we can accelerate even more integrations with more publishers. We've been working across the teams to look at ways of repositioning the community value statements that are made around Dryad, to showcase the kind of work that they're already doing with researchers across the world, and allowing for more integrations with publishers,
but also more integrations with the kinds of things that we value within the institutional space: more reporting for institutions, and more transparency for you, for your libraries, and for institutions into what's going on within Dryad.

Part of that is a membership model. So, back to that idea of governance: Dryad is a member-led membership organization right now, but there aren't very many institutions who actually pay into Dryad as members, right? We're leveraging them. We are using them. They are helping with the adoption problem that we have within our communities, but we are not paying into that model. If we follow the best practices of how community projects are to work, we have to start paying in and being a part of the conversation, and start thinking about how we get our values and our ideas into this conversation, and help Dryad position itself as the community project that it is.

Yeah. So that's kind of the background on where we are and why. Thinking about that, I'm working on this project now, relaunching this new Dryad service, and we have two guiding values. These go back to what we were originally talking about last year and what we've been thinking about all the time. One, we need to be researcher-centered. We know that there are a lot of people who have opinions and priorities for data publishing, and our primary goal here is that the user is the researcher, and the researcher is who we need to be thinking about when we're building. So everything in Dryad is a user-tested, user-prioritized feature. We know there are all these things out there, but it has to be that researchers would definitely use this and that it would help them with publishing their data.

The second point is that we're adoption-focused. There are a lot of goals and a lot of needs right now, and for Dryad specifically we are focused on curated and compliant research data publishing. So not just
dropping your data that may be associated with an article or otherwise, but specifically that these are high-quality datasets that are coming out and being published, and that researchers are supporting.

So what does that all even look like? We know best not to do a demo, so I'm only showing some of the things from the new system, but we can of course talk after if you have specific questions. As John mentioned, we're moving onto a new platform. This is Dash, the UC platform that we had come up with, so most of these features are already ready. When we're talking about this new platform, we're not saying two years from now. We're saying we're ready to go, we want to launch, and we want community support on this.

This platform has the CoreTrustSeal. It's standards-based. We are discoverable through schema.org and Google Dataset Search. We're using ORCID for login as well as institutional sign-ons, using funder registries, having keywords, and making everything discoverable in that way. And then there are the big things at the bottom here that I'm going to go through in more detail: having features that allow for easier deposit of datasets, both through the publisher and through your work environment; having institutional features that had not been there before; and having actual ways to report back on what's going on with the usage of that dataset. Another project that we're involved in is called Make Data Count, which you may have heard of. That's a project about standardizing data usage metrics and citations, and the first implementation can be seen in this platform.

But the non-technical side of this is that Dryad values curation, and that's really what makes Dryad stand out. We know that as institutions,
we really care about curation as well. So what we're focusing on here is that expert curators are actually going through every submission that's going into Dryad. In the past it's been curators who work at Dryad, but we know a big part of this is that as institutions we super-value curation, and we have data curators and data librarians who want to be involved. So we are also building for that ability, for us to actually be a community. That means the data librarians at the institutions are involved in these endeavors, and we're not leaving them out.

So there are four big stories here about what we're focusing on. First, seamless deposits. One thing Dryad has had is a good relationship with publishers, and we want to advance that. When we're talking about publisher integrations, we mean API integrations where, when a researcher is submitting to a journal, they can automatically submit their data, and we can send the data citation back to the journal and actually have that connection back and forth. This is something that ScholarOne is building, and major journals, Wiley, Taylor & Francis, maybe most, are on ScholarOne. We're talking to Aries and HighWire and OJS. And we're going at the platform level, not the publisher level, so this can be a switch that publishers can choose, because we know that the more publishers do this, the more deposits we're going to see coming in, and the easier it is for researchers.
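As an editorial aside, the round trip described here (a journal platform sends the dataset along with the manuscript, and the repository sends a data citation back) can be sketched roughly as follows. This is only an illustration under stated assumptions, not Dryad's or ScholarOne's actual API: the names JournalSubmission, DepositReceipt, format_citation and deposit_from_journal, the "Example Repository" publisher, and the 10.5555 DOI prefix are all hypothetical placeholders.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class JournalSubmission:
    """Hypothetical payload a journal platform might send at manuscript submission."""
    manuscript_id: str
    journal: str
    authors: List[str]
    dataset_title: str


@dataclass
class DepositReceipt:
    """Hypothetical response the repository returns to the journal platform."""
    manuscript_id: str
    doi: str
    citation: str


def format_citation(authors, year, title, publisher, doi):
    # Creator (year), title, publisher, identifier: a common data citation pattern.
    names = "; ".join(authors)
    return f"{names} ({year}), {title}, {publisher}, Dataset, https://doi.org/{doi}"


def deposit_from_journal(sub, year, doi_suffix, publisher="Example Repository"):
    # Assumption: the repository mints the DOI; 10.5555 is a placeholder prefix.
    doi = f"10.5555/{doi_suffix}"
    citation = format_citation(sub.authors, year, sub.dataset_title, publisher, doi)
    # The receipt carries the manuscript id so the journal can link article and data.
    return DepositReceipt(sub.manuscript_id, doi, citation)


submission = JournalSubmission(
    manuscript_id="MS-2019-0042",
    journal="Journal of Examples",
    authors=["Rivera, A.", "Chen, B."],
    dataset_title="Sample field measurements",
)
receipt = deposit_from_journal(submission, year=2019, doi_suffix="example.0001")
print(receipt.citation)
```

The point of the sketch is the shape of the exchange: the researcher submits once, and the journal gets back an identifier and a citation it can place in the article.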
Do people know names like that, HighWire, ScholarOne? So I would just stop and say: think about your data repositories or your data strategies. Would you ever even think that you could have a journal integration with ScholarOne? It's just not something that you would ever be able to do at the level of just your institution. Also, who could? Think about who else in this space actually could do that. Most of the names that you're thinking of are all companies. They're all organizations that are trying to charge us lots of money for lots of stuff. So where are we all going to go, and how are we all going to work on trying to drive adoption at this level of activity? Unless it's with a community-run organization like Dryad, it's going to be with these commercial enterprises. So we have to start thinking, like Daniella said earlier: where are we going to be five years from now if we don't start thinking about how we can support community-led initiatives like this?

So that's the focus on ease of deposit for researchers, but another big part is transparency and reporting for institutions. Researchers at probably every single institution represented here have been submitting to Dryad, but you don't have a look in, a way to connect with Dryad on that. So here I'm going to show the new system. This is the transparency and reporting that you can have into the system now. You would see anything that's coming in from your institution. You can search it, you can filter, and then of course you can download a CSV of it. So you could actually see what's coming out of your campus, and who's started a submission but not clicked submit, or who's started one that is private for peer review. And you could actually have your data
librarians working with them through their submission process. This is made possible through what Daniella mentioned earlier: either ORCID login or single sign-on login from your campus, right?

And then another image here, of visual provenance. So you're not only seeing what's in that reporting view, but what's actually happening with the dataset. What's going on with researchers at your institution? Where are they submitting? Because it'll also say what journal they're at, so you get that fuller picture of what's coming out. And we also wouldn't be putting 'grog' in the comments, or we might, but hopefully there'd be people's names, not 'testing account', as the name. But yes, you get the picture.

A big piece of this, of course, is usability and discoverability. This is where curation comes in. This is a Dryad dataset that's been ported over into the new system. Of course, it would have ORCID attached to the authors. But a big part of this is that we want to have a really prominent data citation. If we're trying to promote best practices, let's focus on the actual citation to this dataset. And then, because I mentioned Make Data Count being involved, let's see who else is citing that dataset. Let's actually get a fuller picture of what the use of this dataset is. Going back: of course this isn't live, but views and downloads are being standardized as well. So when you're pulling that report, you can actually see, okay, this dataset has this impact, it has this usage, and think about it that way.

So, just real quick, another thing we as a community do a lot is spend a lot of money on business intelligence information that should just be free, right? One of the things this does, and what we're talking about here, is being able to work together to build a corpus of information about what's happening on your campus that doesn't cost hundreds of thousands of dollars, right? So
knowing what's going on with the data that's being published from your campus, knowing who is citing it, knowing the reach of that research output: that is something that, if you think about it, we're all spending way too much money and way too much time on, and we don't work together on that problem. And that's again another side benefit of working in a community-led organization or with a community infrastructure project like Dryad.

And also coming back to this: if we don't act on this now and support the community... I don't know how many of you also received emails in the last week about paying about 5K for a similar report from Springer Nature. Again, it's not that we don't support people who are supporting commercial services. It's that if we care about these community principles, we've got to think about this right now.

And then of course, when I say curators, curators are actually looking at the keywords. They're actually asking: is it optimized for search? Are we referencing the article it's related to? Are we making this a usable dataset? So they're looking at the files, making sure that they actually open and are interoperable, and that there's enough metadata. Of course, if you're a metadata librarian, you know that you would not be pleased with it just being an abstract. But there should be enough metadata to understand the dataset and make it usable.

And then the last piece of this is compliance. We know that we as a community care about FAIR data, and that is again what makes Dryad stand out: actually curating data to comply with what we as a community consider FAIR. There are funder requirements, so having the CoreTrustSeal on Dryad, FAIR data, and making sure that we have proper preservation of the data that's coming in. There are publisher policies, so that you can actually keep data private during peer review while institutions would still have that view in. And there are proper data citations.
So what are the best practices, and how could Dryad support them? And then I put "your input" here, because for institutional values we want to be talking to you as a community and asking: what is it that you value and would want in this system? We're adoption-focused, but if we're all spending time thinking about what is best in this space, we need the institutions to be involved in this place where the researchers are already going.

Yeah. So if we have a policy or a concept on our campuses that says we want to have a copy of every data file that comes in, then we should do that. If we say we want to make sure that there's a certain set of information that's put into datasets that are published, we should do that. Again, this is not something that would be done if we were doing it on our own, or if we were using a commercial service, but it's something that can be done if we're thinking about it as: what are the things that we as institutions value? What do we want? Let's build the community project that can actually do that.

So, wrapping up: what sets Dryad apart is that we're focused on the user, the goal is curated and compliant research data publishing, and the features are actually driving adoption and upholding institutional values and bringing institutions in. And so, our big focus in doing this,
I hope it's clear what the story is. But also, for us, it's less about Dryad being the only solution for research data and more that our researchers are already going there. So to us this is really saying that we as a community should support them, even if it's not the one solution at your campus.

Yeah. One thing that we didn't bring up, but a phrase that we've been using a lot during this, is "right-sized." One of the challenges that we have in the data publishing space is that we are being sold, or being told, that we need to have very bloated, inappropriately scoped solutions for problems. And we're being told that we should be charged inappropriately sized price tags for services. We're being asked to pay $5,000 to get a report for stuff that came from our university. We're being asked to pay $350 for someone to curate a dataset when we have data curators on campuses. We're being asked to spend hundreds of thousands of dollars for data repository support, or we ourselves are assuming that we must pay hundreds of thousands of dollars to support a data repository on campus.

So one of the things that we're talking about here is: what is the value? What is the monetary value of what we're putting into these things, and what is the value that we're getting out of Dryad right now? And thinking about what is the right-sized approach: what do we as a community think we should be paying into Dryad? What's an appropriate price tag for this? Is it a few thousand dollars a year? Is it donating time and energy to do data curation on the platform? What is that value? What is it that we as a community would put back into the project?
And so, you know, we are already an institutional community, and we already support data curation, data publishing, and data preservation. It's really this call to action of banding together. We at CDL have started to band together with Dryad, and I think what we're talking about here is a call to action to the community to say: we should start talking about how we can become members of Dryad. We should start signing up. We should start thinking about how we can be involved in the governance and in the future of the organization.

Yeah, and relating back to what I had originally written about and presented last year: it's not about technology. We just showed some technology, but that's not really the focus here. It's not about which community area you're involved in or which technology you stand by. It's that we need to meet researchers where they're at, and we need a centralized approach to this. We need to work together and figure this out, because we already know researchers are going to go places even if we have no control over it. So if we all agree here in the room that adoption is key for us, we've got to work together on this central approach. That's our call to action, and that's our presentation.

Really, what we wanted to do is spend time on a question-and-answer session, to be able to respond and talk, because this is a conversation. It's a call to action for all of us. So if there are questions or ideas or comments about what we're presenting up here, it would be great to spend the rest of the time on that. In the back first.

So thanks, everyone, for coming, and come talk to us if you want to talk further about connecting. Yeah, thank you.