Welcome everybody. My name is Joel Herndon and I'm joined today by my colleagues Jake Carlson and Jonathan Petters. The three of us together are representing the Association of Research Libraries-led Realities of Academic Data Sharing, or RADS, project, as we'll call it during the rest of the presentation. Sponsored by the National Science Foundation in 2021, RADS brought together six research institutions with a shared interest in building evidence-based models for research data sharing. This research takes place at a time in which the policy landscape for research data management and sharing has grown increasingly complex. Over the last decade, federal funders have become increasingly prescriptive about whether federally funded research data is shared, the timing of that sharing, as well as the details of how it is shared. The RADS project wanted to see how these changes are impacting both research administration and services on our campuses as well as the research that our researchers are conducting. We had three primary research questions that we wanted to investigate as part of the RADS project. First, how do our institutions and the staff working in research support at those institutions support research data management and sharing? Second, how are researchers preparing, staffing, and implementing research data sharing as part of these new funding requirements, and what does it look like when they do share their data, as far as where it appears and the metadata quality associated with it? And finally, what is the institutional cost to implement these policies, both for the entire institution and for individual researchers? Before we could answer these questions, we had three methodological considerations to take into account. First of all, we are very different institutions as far as how we're staffed in the offices that we have to support research. 
And so we had to create a common framework that we could use to compare our different structures across these six universities, and indeed other universities across the U.S. We made the decision to group our different service groups on campus into four large categories that captured a pretty wide range of research support at our institutions. We talked to libraries — both libraries and archives in the data sharing space. We talked to IT units — everything from central IT down to very specific departmental and professional school IT groups in different locations on campus. For research administration, we had a very broad definition: we looked at everything from the central office of research, or the office of the vice president or provost for research, down to very specific research administration units looking at intellectual property, as well as the campus institutional review boards and the leadership of those boards. We also talked to research centers and institutes on our respective campuses. These groups often don't have a campus-wide focus, but they are involved in research data sharing, and we were very keen to understand how they were supporting particular disciplines and particular subject areas at our different universities. Our second consideration, which we looked at very closely and consulted with quite a few groups on, was trying to craft a common set of data sharing activities so that we could have a common vocabulary, both for the grant and for talking with our research administrators and researchers on all our campuses. I'm not going to go into a lot of detail here in the methodology — we'll talk a little bit in the results about some of those activities — but I will give an announcement that there will be a publication from ARL in January where you can look at all these activities and use them, if they are helpful on your campus, for mapping out how this data support is happening. 
Finally, we did a great deal of work building a model for estimating data sharing costs. At most universities there's a fairly good understanding of what research expenses are for grant-funded research, but data sharing is fairly new, and so it required a lot of subtlety and nuance in trying to tease out these additional costs, both for researchers and for research administration. We have an additional publication coming from ARL in January that gives the details of that model and some of these expenses. So for our survey, who did we sample? As mentioned earlier, we had two large groups: the research administration structures at our universities, and researchers. For both groups we conducted an online survey. Individuals had about two months to complete that survey, and we followed up with individuals who were willing to join us for a qualitative interview online. For our administrators, the six institutions identified roughly 140 people and we had about a 50% response rate covering all four of those areas that I mentioned earlier. We found that very helpful — both the survey results and talking with them in detail to get the context of their answers — for really giving us an idea of how universities across the U.S. are approaching these questions. For researchers we took a slightly different approach. We actually looked at all of the funded research at our universities since the Holdren memo requiring data sharing, across three different funding agencies: the NIH, the NSF, and the Department of Energy. We picked five subject areas that were present at all of our universities. With about 3,500 PIs involved in that study, our roughly 8.4% response rate translated into hundreds of responses and quite a few in-person interviews. So once again we had a very rich set of information across multiple disciplines about what data sharing looks like in North America today. 
With that I will let John talk a little bit about the results. Alright, thank you Joel for getting us started there. I'll share a very small amount of our results — much more will be available in the report — to give you a little feel for the kind of information we found. Okay, so Joel already mentioned these data management and sharing activities. We had 27 of them, which we categorized into five different phases of research, running through the research life cycle, which has a start and an end — if it does have a start and an end. Going from planning, design, and start-up of projects; data collection, storage, and management, where researchers are trying to answer the research questions they've set out to answer; making the data broadly available as results come in and as the project nears its conclusion; data retention, including preservation, archiving, and long-term access to that data; and project close-out and compliance, where the researchers and their institutions need to make sure that the project has met all its requirements from the sponsor and the sponsor is happy. Again, the first version of this is available online already, and a later version, where we've done some revisions with more community input for these activities, will be available soon. We'll talk a little bit more about the phases of research in the results here, keeping in mind there are more activities as we go. So for researchers, we asked them to report to us in our survey the activities that they did: whether they did them themselves, within their research team or research lab; whether they did them with institutional assistance somewhere on campus; whether they did them with external assistance somewhere off campus, maybe a peer at another institution; or whether they just did not do them at all. Under "did not do," there could be two main categories. 
Either they didn't do it because it wasn't applicable to the research they're doing, or it's something that maybe they should have been doing but didn't. We did not distinguish between those two. So that's the kind of information we got from researchers. For the institutions — the administrators and the support units — as Joel already mentioned, these are the four categories of support units we surveyed, and we simply asked them: do you support this activity or not, yes or no? That's the office of research, research institutes, research libraries, and information technology offices. So, to share a little bit of the results for our researchers: what we're looking at here are the top five activities that researchers said they were doing themselves, and the percentage of respondents who said they were doing them themselves. Just the top five — we had 27 activities altogether. For all of these activities, we have 90% or more of researchers saying they mostly took care of this themselves: making decisions about what data to share or host, preparing the data for sharing, creating quality control mechanisms or procedures, developing documentation of the data, and monitoring the integrity of preserved data. I'd like to point out two things about these top five activities. One is that some of these top five activities are clearly ones where there is some support around the institution. Preparing data for sharing is something research libraries have services for, at least at our institutions. But some of these activities may also be ones where the researcher really should be the one in the driver's seat. Take developing documentation of the data: the researchers created their data, they collected their data, they understand it the best. 
So there may be other people who could help them, but they should be the ones who are primarily documenting their data. So those are a couple of things we see there. Turning to the top five activities that researchers are doing with institutional assistance: we see developing materials transfer and data use agreements; ensuring data security where appropriate, around HIPAA or export control, things like that; determining intellectual property and copyright considerations; evaluating data security needs; and preparing IRB protocols and informed consent. A couple of things we might say about these results: generally, these top five activities are things that have some sort of legal concern around the research, and so researchers are seeking out help in these cases. We also see that the percentages are quite a bit lower than the ones for "did it myself," right? We're ranging from 64% down to 25%. But we also need to be careful, of course, about interpreting our results. Looking down here at evaluating data security needs, only 26% of researchers are evaluating data security needs with institutional assistance. That might make us immediately worried: what are the other 74% of them doing? But maybe some of these people are doing research that doesn't really have any data security concerns — in a lot of environmental science, for example, the data is rather nonsensitive. So be careful about over-interpreting what we're looking at. Okay, let's move on to a little bit about our support services. First I'm going to show you a graphic that's too small and that you can't read, and it may annoy you a little bit, and that's okay. The main point is to demonstrate that we have very detailed information here, though I couldn't possibly go through all of it in the time allotted. But it's all available — we have wonderful Tableau visualizations one of our colleagues generated. At a high level, here's what a lot of libraries said that they're doing. 
Libraries are providing support for public access to research data across the following phases. Primarily, our libraries are providing support around planning, design, and start-up of projects — that might be mostly around the data management planning services we have established. They support making data broadly available, data retention, and long-term access; these are around the institutional repository services we have set up, and around helping researchers find other repositories to deposit in. Libraries are also developing support within data collection, storage, and management, around the operational aspects of the research — some of us are slowly standing up services and support for researchers there. At Virginia Tech we have an informatics consultant who can help researchers with computational workflows and survey design, things like that. Less support is provided in project close-out and compliance, though; that tends to be the purview of the office of research, which tends to be more responsible for close-out and ensuring the requirements have been met. We can look at similar information for information technology offices. Among the categories that provide support across campus — that is, not the research institutes and centers — IT offices typically provided the broadest level of support across all these phases: all four phases that we talked about already, except again for project close-out and compliance, which again is typically the office of research's purview. But IT offices provided support in a variety of activities across several of these phases. I believe that's where I want to leave it here. We have takeaways, and Jake can talk more about our conclusions and the things that we've seen from our work. All right. Thank you, John. So as John said — excuse me — overall, researchers are doing the majority of data sharing activities on their own, and that's not necessarily surprising or a bad thing, as John explained. 
Researchers have always had to manage their data and share it with the colleagues they're doing the work with. The shift really comes from having to prepare it for others to find, access, interoperate with, and reuse — that is, meeting the FAIR principles. We did hear in our interviews researchers expressing some confusion, some frustration, around understanding the requirements, mostly around how they translate to and affect them in their labs. How do they take these fairly nebulously defined requirements that research funding agencies are putting out and translate them into operational activities that they themselves can do, and feel comfortable and confident that they are in fact meeting those requirements? I think something else to consider is that administration and service units were also still adjusting to these requirements from funding agencies. Many of the services and support provided by the institution are focused on the needs of the institution over the needs of the researchers themselves — looking at it from the institutional perspective of how to ensure compliance is happening, how to minimize risk, and other things that take a more institutional perspective. It's not that they're not providing services to researchers, but it's not their primary mission or reason to exist. Library services, and IT services to some extent, are more generally focused on the needs of the researcher, but researchers are still performing many activities — or indicating they're performing many activities — themselves, rather than taking advantage of what the library has to offer. And there may be a couple of different reasons for this. They may not know about our services, so outreach, I think, is an ongoing challenge for libraries and other service-providing units. 
They may not believe that libraries have the necessary depth of expertise to really get into their particular situations in managing and sharing their data. There may be questions around our capacity to address these fairly complex and sophisticated issues, particularly if we're working with more than one researcher at a time. And we may have empowered researchers to be self-sufficient in some cases. Look at data management plans, for example: if a researcher works with a librarian to put together a data management plan for one project, they're probably using that data management plan as a template and carrying it forward for subsequent projects. We do see a couple of opportunities in our data for developing more and better connections between service units and researchers. For IT departments, John talked about data security services. 49% of researchers said that they do not have security concerns, and that may be due to their lack of human-focused or sensitive data. But even for the 51% who did say that they have those concerns, only 40% of those spoke with IT, as John mentioned. So there's clearly, I think, room to develop a stronger relationship between researchers and IT units. 90% of researchers in our survey created quality control mechanisms for their data on their own, and there, again, that really is, I think, the purview of the researcher to decide to do or not to do. But in situations where they are using technology-intensive platforms or really need sophisticated workflows, having that connection with IT may help advance their cause and produce a better data set to share as a result. For central research offices, we heard 87% of researchers saying that they ensure the funding agency requirements are met themselves, without talking to other folks. And that obviously is an area of concern as enforcement becomes more stringent and as requirements become more specific. 
I think we really see an imperative for central research offices and researchers to connect, to figure out how we can in fact demonstrate compliance in a way that we're all comfortable and confident with. For research institutes and specialized centers, as John mentioned, there is a variety of different services that they offer. Some of them are pretty holistic — they offer services across the data lifecycle. Research institutes may not be offering their services across the institution, but might they serve as a model for the institution in considering what kinds of services to offer and how to connect with researchers to offer them? We also thought that research centers could serve as labs to pilot particular kinds of data services; if those met with success, could they be scaled up to the institution as well? Libraries, as we all know, offer a variety of services to researchers, but we see helping researchers connect their individual datasets to the larger community of practice as a particularly rich area for us to invest in. That means thinking about how we assist them in making decisions on how to put their data out there in ways that will resonate with the community, particularly when their datasets don't fit neatly into what a repository might offer — because they're too large, or the formats are not in alignment with what the repository can support. How do they navigate that particular field and still satisfy the requirements effectively? It means thinking about selecting and applying licenses for reuse — what others can do with the research data once they have it — which is an area of concern for researchers and a place where libraries might be able to direct them. And it means adopting PIDs, which, as we've discussed at CNI and other places, are really the glue to hold all of this together and to have research data count as a scholarly object of the first order. We also saw opportunities for cross-campus collaboration. 
So there are a number of activities that really span units, or that might require expertise beyond what one unit could provide by itself. For example, in developing recommendations, policies, and practices for deaccessioning data, only 50% of the IT units that we spoke to in the survey offered that particular service. But even for those who do, we think involving libraries, research offices, and even research centers might produce stronger policies and make for better practices. Only 50% of researchers identified and budgeted for the costs of data management and sharing, and given the wide variety of potential costs and of expertise across the university, here too, working together might be more effective in making this a more normative part of research practice. And then finally, training and education. As I've mentioned, outreach is particularly challenging. We all tend to offer different training sessions, but we tend to do it in isolation from each other. If we were able to connect and perhaps offer more holistic training, might that help researchers better understand the larger suite of services they have available at the university and connect with them more effectively? We have a number of emerging models of institutional cooperation on data. I'm not going to go into each of them due to time, but I want to give a special shout-out to Cornell. Wendy Kozlowski unfortunately couldn't be with us today. She is a part of the RADS project, and I think her work and Cornell's work really are a model for us to think about in making these kinds of connections across the university to provide more holistic services. I want to close with a potential case study. I just joined the University at Buffalo four months ago, and I'm already seeing opportunities to apply what we've learned from the RADS data in my new position at Buffalo. 
So prior to my getting to Buffalo — earlier in the year — Buffalo formed an institution-wide working group on data sharing, looking at things like our infrastructure and storage support, graduate student education and what students need to know to manage and share data effectively, and developing a data repository and associated services. This culminated in a proposal to the provost, which we're still waiting to hear back on. But once we do hear back, I think taking what we've learned from RADS across six different universities and applying that locally might help us to ask better questions and consider where specifically we might want to make investments to advance what we're doing at Buffalo. In addition, we're also working with the sponsored program services reorganization. They're trying to break up what they've been doing to make it less centralized and more connected to the academic departments, to learn more about what the researchers are doing and what their needs are for data management and sharing, as well as other aspects of getting a grant and working it through. They offer training as well, on things like ORCID and SciENcv, and we are working with them to do some co-training where it's appropriate to do so. But there too, as they think about what they're trying to accomplish and the models they want to use, having information from a much larger survey like the RADS survey can help inform local practice. And I'm really pleased to say that the University at Buffalo will be a part of RADS 2, which was recently funded by the IMLS to refine our instruments, run them again, and develop a toolkit that others can use and make use of. So thank you, IMLS, for that funding. I want to close by recognizing that this work was done by more than just John, Joel, and myself. We had a lot of people involved. 
I particularly want to give a shout-out to Shauna Taylor, our project manager, who kept us productive on a lot of complex issues. Cynthia Hudson Vitale is our PI; if you have questions on this project, she's the one to direct them to. And as John mentioned, we do have a website containing information about the RADS project, as well as a number of reports that are coming out pretty soon, many of them in January. And so with that, I want to open this up for questions. Thank you. In the follow-up interviews to your survey — which I assume were after the survey — were you able to determine from the faculty researchers, when they said "I did this myself," how much graduate students were involved in this work? And that also relates to the question of training and workshops: might the library and IT impact have been under-reported, if the training was offered to graduate students who then had this role rather than the actual researcher? Is this on? Yes, okay, good. Yeah, I think that's a really insightful question. My assumption is that when researchers said they did this themselves, they didn't mean just themselves as an individual human being, but that their lab, their graduate students, their partners were the ones to do it. That said, I think we spent a lot of time in the interviews trying to clarify what we meant by certain activities and what we meant by certain costs. That is one of the things we want to do with RADS 2: take what we've learned from those interactions and those interviews to make the instruments more clear. And I can certainly see, you know, training around graduate students to, again, sort of make this more of a normative part of research and to better, I think, plant the seeds for culture change around what kinds of services are provided and how they're used. John or Joel, do you have anything to add to that? I think I... Is this okay? 
I think I'd like to add one more methodological note — well, I mentioned one aspect of it earlier, but I want to be clear: when we talked to researchers, this is a perspective over the last ten years. We sampled funded researchers over that time period, and the information we're receiving is about things people might have been doing eight years ago or six years ago. So when you hear these researcher results reported out, this is what they were doing at the time; when you hear the information about the research support units on campus, that perspective is from within the last year, so that is fairly current information. So it's a little hard to tell whether the structures have changed. We do know that we've done quite a bit in the libraries, which we're representing — quite a bit of training for grad students — and I do think that's made quite an impact. We've heard in many of the interviews that grad students play a core role in the data management for different labs on campus, and I do think that's an aspect in RADS 2 we'll be looking at even more closely: how much of an impact that has. It was often listed as a cost element — funding students as part of the study — so it's something we're very keen on understanding better. It is really hard to see from the stage, so I don't know if there's anybody with a question. I don't see anybody. I think we have a question for the audience. 
I think many institutions are already asking these questions about how you're supporting research data sharing and management on campus, but I can say for the six institutions that are part of the study, the grant was really a catalyst for talking with individuals that we might not otherwise have talked to. Is this a pattern others are seeing, where you're having opportunities to talk, for example, to finance officials on your campus about how we are funding these types of activities in the libraries, IT, and central research, or about how we can work together across these different units, whether it's the IT unit in the libraries or the IT unit in the office of research? Have these changes in funder policy triggered new discussions or created new opportunities for partnerships? Yeah. Yes. Sorry. It's scouting over here if you can't see me. Sorry, you weren't using the mic, so I didn't know you were talking. But I have the mic now, so I'm going to talk. Yes. A lot of people at the University of Nebraska-Lincoln started getting worried last summer about the advent of the NIH changes. I had just been there a year, but enough conversations were already happening between the libraries and the office of research that it really set the stage, and because of that, these conversations came to the fore quickly and meant that everybody in the office of research, in sponsored projects, and so on, and in the libraries were talking about this — also with IT and high-performance computing. So we did a lot in preparation for January 25th — it was great — and then everyone seemed to calm down. But yes, the requirements helped bring these conversations forward faster, I think. Thanks. 
Yeah, so sorry — Steve Wiede from Yale University. We're sort of in the beginnings of really thinking about our research data in a holistic way, but on some of the partnerships that we have started to build: we had a strong partnership with the central IT department, for example, as they had built the proof of concept, and then there was quite a long gradient of support from the proof of concept over to our production system at the library, which took it over. The other partnership we had was with the data-intensive social sciences center, and the reason for this was we lacked a lot of expertise in the exact technology stack that was being used for data management, and we needed partners outside the library who did have that knowledge to act as subject matter experts. Normally the subject matter experts for library stuff are in the library, because it's library software, so this was kind of new for us, and actually having a named partner, stakeholder, product owner outside of the library was a really unique experience. Yeah, awesome. That's great. Time for one more? I think so. So, quick question. Listening to the RADS study, I'm wondering how you see — or if you see, or if I am completely misinformed and mistaken in seeing — a complementary nature between this study, which is focused on large research universities, and the study that Ithaka is doing with that broader cohort of R1s, R2s, and others, which we heard about briefly in the lightning round last week. We're part of the Ithaka study, and so I'm wondering: do you see these complementing one another? Are there any crossovers — people who are part of RADS and also part of the Ithaka study — or are these really two separate things and I'm simply not smart enough to see that? I don't know that anybody who is part of the RADS group is also a part of the Ithaka study, but there are areas of overlap and complementary intent. The RADS study, I think, is a snapshot; it's trying to capture as much as we can about people's understanding of what was done, retrospectively. 
I think, as Joel said, we're trying to shift more proactively with RADS 2, asking whether there are things that we can help set up going forward, and to the extent that I know the Ithaka study, it seems like that's sort of a common area of interest and mission going forward. So yes, I think the more that we have these kinds of studies and we're seeing different types of information come up, the better we can amalgamate and get a sense of what's really going on here. So it makes perfect sense to me that we have multiple areas of interest in this space. It looks like we're out of time, so if there are no more questions, thank you all for coming.