Okay, next up on our agenda is a presentation from Kerry Walinetz, and Eric, I believe you're going to handle the introduction.

I am delighted to introduce my good friend and colleague Kerry Walinetz, who's here to make a presentation I think will be of great interest to this council. Let me give a bit of an introduction. Some of you may know Kerry, but for those of you who don't know her background, Kerry is actually the acting chief of staff, as well as being the associate director for science policy and director of the Office of Science Policy, or OSP, at the National Institutes of Health. As leader of the Office of Science Policy, she advises the NIH director on science policy matters of significance to the agency, to the research community, and to the public, on a wide range of issues including human subjects protection, biosecurity, emerging biotechnologies, data sharing (of relevance to this council, certainly), regenerative medicine, the organization and management of NIH, and the innovative policies that relate to NIH-funded research. We're very fortunate to have Kerry in this position. Actually, I was a member of the search committee; I might have been a co-chair of the search committee that brought her here. Prior to that she was at the Association of American Universities, and she also had experience at the Federation of American Societies for Experimental Biology and a role at United for Medical Research. I will also tell you I work quite closely with Kerry, in particular because she and I co-chair one of the working groups that's part of the governance system at NIH. It's called the Data Science Policy Council. It basically deals with policy issues related to data science, something of growing interest and complexity.
And so as co-chairs together for a number of years, we have moved that group through many, many different topics related to data science policy, one of which was shepherding through a new NIH policy on data management and sharing. And that is what she's here to talk to you about. I know there was interest in this from some council members, and Kerry, I know, has been coming around to councils and other groups giving an update about this upcoming policy. And so with that, I think I will turn this over to Kerry.

Great. Thank you so much for the warm welcome, Eric. And as you say, this is an especially meaningful presentation because you've been such a tremendous partner in helping all of this to happen. I really appreciate your wise counsel on this. So I'm here to talk a little bit about our final policy for data management and sharing, and hopefully we'll have lots of time for questions as well. It's really wonderful to have this opportunity to speak to you all because, as you'll learn, we're in a period in which we have released the final policy, but we've left a very long glide path to implementation. So this is the time when we're really getting all of our ducks in a row to allow for implementation of this policy. And we're really listening hard to the community about what they need in terms of guidance or training or other resources to make sure we're all prepared to move forward. Next slide, please. So I know I'm preaching to the choir here when I talk about the importance of data stewardship and data sharing, since of course the genomics community has long been on the leading edge of these topics. From the agency point of view, we really see very good data management and sharing practices as important to our rigor and reproducibility efforts. We see it as enabling validation of research results. It's important for allowing access to high-value data sets and accelerating science at the end of the day.
This is all really about facilitating the science that we fund and increasing potential opportunities for collaboration. There's also an important transparency measure here, and this is something we hear quite frequently from Congress, patient groups, and the public: there's a hunger for access to publicly funded research results and data. It does help foster transparency and accountability. We also take our role as stewards of taxpayer dollars pretty seriously. It also can potentially maximize research participants' contributions. One of the things we hear a lot from participants is that if they're going to take the time to volunteer for research studies, they want to make sure that the output of the studies is maximally utilizable, that it's not hidden in a black box somewhere, and that hopefully, ultimately, it's published. And it also gives us some support to facilitate appropriate protections of research participants' data. Next, please. So the quick summary, the nutshell version of the policy, is that it's really a policy that requires submission of a data management and sharing plan for all NIH-funded research. So this is not a data sharing policy that says thou shalt share thy data this particular way, because this encompasses all NIH-funded research. So it's got a large scope. We wanted to make sure that there was some input from investigators on the details of how, where, and when they are going to manage and share their data, as appropriate. The sort of carrot-and-stick model built in here, of course, is that once a plan is submitted, it will ultimately be part of the terms and conditions of award, and we'll expect compliance with the plan that's been approved by the funding institute. Whether or not you are doing what you promised may in fact affect your future ability to receive funding from NIH, so there is a bit of a hook here to make sure that people follow through on their management and sharing plans.
As I mentioned, there is a long lead time here. This policy will not be effective until applications that come in in January of 2023, and it essentially replaces the 2003 data sharing policy, which was much more limited in scope, to a narrow set of large awards. We've already released some supplemental information and guidance to assist implementation and compliance with the policy. And as I said, we're in the process of doing exactly this sort of outreach so that we can get feedback from the community on what further guidance or tools or resources might be necessary to help. And as I already mentioned, the ultimate aim here is to foster data stewardship. Next, please. So this plan has been a long time in coming, both internally at NIH and externally. This has been an iterative process. Through a number of years we've repeatedly sought public comments and stakeholder comments. We've done a lot of engagement over the years: we've had RFIs, and we released a draft policy. We have done specific tribal consultations, acknowledging that the American Indian and Alaska Native populations have specific interests in data sharing, and you can actually read the results of that tribal consultation on the OSP website. We've heard from other government agencies and federal advisory bodies like the Secretary's Advisory Committee on Human Research Protections. So we received a tremendous amount of input throughout the course of developing this policy, and that has been extremely useful for refining, clarifying, and hopefully coming out with a final policy that is responsive to a lot of that feedback. I have to say, I joke, but it's true: since the policy has come out, I think I've heard from about half the people that it didn't go far enough and from the other half that it went too far, which makes me think that we might have gotten it exactly right in terms of threading that needle. Next, please. So the devil, of course, is in the details, although this isn't a very detailed policy.
There are some important things that are worth noting about the scope. This really applies to all NIH-supported research generating scientific data, so we're not really talking training awards here, for example. And there's more detail in the policy about what that means, but just to clarify, this doesn't mean we want you to send your lab notebooks or every scrap of paper. This is really about what underlies the ability to replicate the findings of the research itself. And although there is not a requirement for sharing, there is an expectation that sharing is going to be the default practice unless there is some very compelling reason why you could not share, recognizing there may be, for example, ethical or legal restrictions or technical issues that may present barriers to sharing. By and large, we want to make sure, particularly when human participants are concerned, that the plan's data sharing is responsibly implemented. So we continue to expect you to be cognizant of privacy protections, rights, confidentiality, and all the rules that are in place to protect human participants in research. And we also include some information about timeliness. So we expect data sharing no later than the publication of that data (and of course, these days, a lot of data sharing takes place through publication, because journals are not just publications but vehicles for data sharing) or by the end of the award if we're talking about unpublished data. So that's an important key point here: this is not just data related to publication, but really all of the data resulting from the award itself, even if it's unpublished. Next slide, please. This is just a quick overview of what this looks like relative to the application process. We expect the data management and sharing plan to be submitted at the time of application, in the budget justification section, and we're in the process of updating all of our application forms during this phased-in implementation period.
This was a deliberate choice in response to a lot of feedback we got on earlier iterations of the policy. One of the things that we heard quite strongly is that it is helpful to submit at the time of application, because it forces both investigators and institutions to think through prospectively, in parallel with developing the research, what the plans for data management and sharing are going to be, which helps you budget. We recognize that data sharing and management, if done well, is not without cost, and we want to make sure that investigators are taking those costs into account and thinking about this up front so that it's not something that's just added on at the end, sort of retroactively. Assessment of the plans is really going to take place at the programmatic level with NIH program staff, and we're in the process of developing guidance for program staff to make sure this is done consistently across the agency. Peer reviewers will of course see this and can comment on it, but it's not a scored part of the proposal during review. And importantly, plans can be updated. We recognize that science evolves throughout the course of the research process. That's what we expect, and we give our investigators a lot of latitude to course correct depending on where the science goes, and we want to make sure that investigators have the ability to update plans accordingly. And so we're working through with our Office of Extramural Research what exactly that looks like. As I mentioned, ultimately we're going to expect compliance with the plan. We expect investigators to follow through with what they promised to do. And so plans will be incorporated into the terms and conditions and monitored, again with the ability to update as appropriate, and whether or not you've been in compliance with your plan may be taken into account for future funding. Next.
So I mentioned we released some supplemental information already, and we're likely going to release more. This is general guidance on some ancillary issues related to the policy. What we heard the most in public comments, where people wanted additional guidance, was related to allowable costs, and so this is our initial guidance. We may very well release more detail based on some of the feedback we're getting through outreach. So we wanted to make sure that there was information out there about what could be built into budgets for reasonable costs, whether that's curating data or developing supporting documentation. We also released some additional information about repository selection. So the policy does not have built into it an insistence that you must use any particular repository or that you must use a currently established repository. We recognize that the entirety of the NIH research portfolio is diverse, and there may not be established repositories for every field which we fund. We strongly encourage the use of established repositories, and we also provide guidance to investigators to help them identify data repositories along the lines of best practices. The Office of Science and Technology Policy at the White House has been really focused on this as well, to try to sort of raise the tide of high-quality repositories. The final data management and sharing policy represents a floor on which institutes and centers might perhaps build in additional specificity. And so individual institutes or programs, or on an FOA basis, might designate a specific data repository. But again, a specific repository is not built into the overall policy. So, as I mentioned, what's next? We're doing this now, talking to all of you.
First of all, to make sure that everyone is aware of the policy and the timeline for implementation. This seems like a really great opportunity to again hear from the community about whether there's a lack of clarity or an additional need for guidance or resources or training materials. We're working with the National Academies of Sciences on some additional information as well, particularly related to forecasting costs for data management and sharing, which again is something we heard a lot about in our policy iterations. We're working to develop tools and approaches for incentivizing good data sharing practices, recognizing that different communities are all over the map in terms of how familiar they are with some of these practices. Again, it's nice to talk to the genomics community because you're way out on the leading edge, and we're learning a lot from that. And we're clarifying the interactions with other NIH-wide policies, like the genomic data sharing policy, and with IC- and program-specific policies. So we've got a long glide path here, but 2023 will be here before we know it. So this is your chance to get in on the ground floor and provide us with feedback on what would be helpful. Next, please. I did want to mention, because I know it's of particular interest to this audience, the interaction with the genomic data sharing policy. We're working quite closely with the Office of Extramural Research, as well as stakeholders with an interest in genomic data, to make sure that we are not hitting the community with overlapping, burdensome requirements here, so we are trying to harmonize the mechanisms and the approaches. We're doing this in parallel with streamlining the implementation of the genomic data sharing policy in general, which is going to be centralized in our Office of Extramural Research.
So hopefully, collectively, this will help to reduce the burden that we know exists with the GDS policy and also not create an additional duplicative burden as folks align these. So stay tuned. There are a lot of details to be worked out here, but we will be producing more guidance and information. Next, please. And again, thank you to everyone who took the time to give us feedback on the policy as we went along. These are all of the links, and you are welcome to the slides, of course. And with that I'm happy to answer any questions, and I again appreciate the opportunity to come talk to you.

Rudy, do you want me to moderate?

It took me a while to find the mute button. Thank you very much, Kerry. Are there questions for Dr. Walinetz? Oh, I see Mark, then Howard, then Sharon. Mark, go ahead.

Thank you for the very informative presentation, Kerry. So I'm wondering about the scope here of what's considered scientific data. Presumably this means data coming out of a wet lab, but what about data that's generated through some computation? And kind of related to that, I'm curious about your thoughts on expanding the scope out to include all kinds of digital artifacts, like software, as well.

So that's a great question, and we got a lot of feedback on this in the policy as well. The scope does go beyond wet lab data into at least some of the preliminary computational analysis, essentially the results that we see out of our federally funded research. It's not meant to incorporate, you know, sort of the preliminary analysis, your initial scribblings. It does not include software, but we do want to see metadata associated with it. So we do include metadata as part of the scope here.
There are a lot of conversations going on, I will say, and I should have said this during the presentation, and I'm sorry, about this intersection between infrastructure and policies, so we want to make sure that these are moving forward hand in hand. And some of those discussions are about how we make sure that we are incorporating conversations going on about expectations for things like software sharing with data sharing, and how we do that in ways where we make sure that we've got infrastructure underneath to be able to support all of that. So I think it is an evolutionary process. We're learning as we go, and it's going to be informed a lot by some of the best practices that are happening out there in the community.

Thank you.

Okay, Howard.

Thank you for your effort on this important area. I wonder if you could say a few words about open access publishing and the linkage between data sharing and open access. Obviously users are going to need the details and descriptions of methods to use any data. And so what are your thoughts, or NIH's thoughts, on the need for open access, and on Plan S and other efforts moving forward in this regard?

Yeah, thank you for that. So we are watching that space very carefully. You know, NIH for a long time was sort of out way ahead on public access, in terms of our expectation that all the publications resulting from NIH-funded research end up in PubMed. And arguably more and more journals have created data sharing and management expectations, again tied to the methodology, so that the data associated with publications is becoming more and more accessible.
We have no current specific plans to change our policy, but as I say, we're watching this closely. We're hearing a lot about that from the publishing community, because that of course is a rapidly changing space as they respond to things like Plan S and other pressures to increase the timeliness and availability of research. I would say it's a bit of a dynamic landscape, to say the least. And so we are trying to make sure, again, that we're moving as much as possible in parallel with that, as opposed to, you know, in obstruction of it. We're talking a lot with the publishing community about this policy as well and thinking about how we align our efforts so that we are increasing the availability of the results of our research overall. So, yeah, we're very aware of and paying a lot of attention to that intersection.

Okay, Sharon, and then Steve, and then Jeff.

Thank you for the presentation; that was really helpful. I do want to go back to the laboratory side. I think this is going to be a huge sea change, particularly when you commented on data that wasn't published that came out of a grant. And yes, there are some existing resources, but I have no idea what proportion of wet bench work, or laboratory bench work, is currently using those, and I think it's going to take some fairly intensive training workshops and really working also with graduate education, because I think it will be a major change in how people currently handle their day-to-day laboratory experiments. And I was a little surprised that you didn't talk a bit more about that aspect. Two years is not that far away from that perspective.

So, a great point, and I will say there's what I really think is a good news story in all of this, because it has been a very iterative process and because we do have this long implementation window.
There are a lot of sort of organic community efforts to really focus on this. So, for example, we've talked with FASEB, we've talked with the Council on Governmental Relations, COGR; there are a lot of self-organized efforts to make sure that we are thinking through collectively what all of that resource-intensive effort looks like. And some of the things that I know are coming up in those discussions are: how do we deal with the variability? Some fields and some labs are very sophisticated in this space and other labs are not, and how we make sure that we're bringing everybody up to sort of a consistent level is something we're talking a lot with institutions and scientific societies about. We are not expecting everyone to be perfect on day one. I do think that there's going to be a lot of field variability. I think that the advantage of having program officers involved in the assessment here is that they know their applicant pool pretty well and are going to have some feel for what's appropriate in that given area of focus, and so that will help account for some of the variability. What I expect will happen over time is this: while that first step may be a doozy for a lot of people, I expect over time we'll see more familiarity with this. And we are not expecting to saber rattle here and say, oh my gosh, look at this, this genomicist is doing a fantastic job; hey, you, Dr. Epidemiologist, why aren't you doing such a great job? I expect that we are going to be able to accommodate those differences in real time, and we are hoping that we can continue to work really closely with the community to develop those training resources, some of which I think are going to have to be pretty tailored, because again there's a lot of variability between institutions, between fields, and between labs, and we want to make sure we take all of it into account.

Steve, go ahead.
Yeah, Kerry, thanks for a great presentation. You've really touched on a lot of things that will really take place in 2023 or going forward. There's a lot of genetic and genomic data published before that, and we've been trying, through putting things on a pretty open site on dbGaP, to publish results and summary statistics, primarily with metadata. And it's really clear that there's really no support for doing that per se. When you go to a study that had published an important paper and you say, oh, we'd like to collect and gather all the files that you've produced on your summary analyses, well, the person who put those together has been gone for three years, and you're trying to figure out where they are, and then once you've put it together, providing the metadata for putting it on a dbGaP open website becomes a bit of a challenge. Do you see that as a next frontier, trying to go into genomic data that's been published already and provides a really great resource if you can get access to it, and putting that in your system so that it becomes much easier for investigators to use and therefore enhances their own research?

Honestly, there are so many next frontiers it almost feels like an endless frontier in a lot of ways. Yeah, I think one of the challenges with data management and access to data overall, and Eric has been even more deeply involved in some of these conversations, is trying to simultaneously do this prospectively and retroactively, which is a little bit like running on a treadmill while trying to, you know, do fine print embroidery. It's a huge challenge. So I hope so, I think, is sort of the bottom line answer. As I mentioned, we are thinking a lot about genomic data sharing in parallel with this, and some of the types of questions that are coming up are: how do we know what we already have? And that is even beyond genomics.
I mean, I think there is a wide conversation about existing data resources and how we improve access to them. Are there opportunities for economies of scale? Are there opportunities for those sorts of retroactive improvements? How do we make them interoperable? We haven't lost sight of any of these. It's just a matter of tackling them all with the bandwidth that we have. So that has been part of the conversation for sure, but it's one of a sort of series of next potential frontiers.

Yeah, Jeff, then Jonathan, but Aaron, I see you've turned on your camera. Did you want to speak to this point or wait in the queue?

I did, Rudy, just a quick follow-up to Steve's question. NHGRI just on Friday released a Guide Notice outlining our expectations for sharing quality metadata and phenotypic data. Maybe I'll put the link in the chat, but essentially we recognize that exact point, Steve, and we're strongly encouraging our community to do better with sharing the metadata and phenotypic data. Within the notice we've got a link to some FAQs to provide further support, and we're also working with our program directors to help come up with approaches to increase the communication about this particular point.

Thank you. Okay, Jeff, go ahead.

Yeah, thanks, Kerry, for all the work by everybody on this issue. I'm interested in the attention to social and behavioral research data sets. A lot of those are going to be surveys, interview data, videos. A lot of that gets fairly granular. So, a sort of general question about that domain. I guess it seems like a motivating principle here is that external recipients of the data should be able to reanalyze it in a way that lets them question the conclusions drawn by the primary researchers. Is that a general principle? And if that holds, then presumably the data sets are going to have to be fairly granular in a lot of circumstances.
You know, the scope, just to read it very specifically, is recorded factual material commonly accepted as necessary to validate and replicate the research results, so depending on the science, that could potentially be fairly granular. We have not, in the development of the policy, focused specifically on social and behavioral sciences, although I am sure my colleagues in the social and behavioral science community have been focused on that. This is where, again, I think we need to work not only within NIH, for example with our Office of Behavioral and Social Sciences Research, but also with the disciplinary groups in the community to make sure they are thinking through what this looks like for those specific fields. There's a lot of deliberate flexibility in the policy. This is why we decided to go with a plan, and we have included guidance on what the general elements of a plan look like. Our expectation, and I don't mean expectation in the grant term but in terms of what we predict will happen, is that there's going to be a lot of engagement with disciplinary societies or areas of research practice to help think through what an ideal plan looks like for that particular area of research. I think behavioral science is an example of an area where there's a lot of opportunity for development of best practices in that space. And so part of that is thinking through what level of granularity is needed to meet that bar of the ability to validate research.

Jonathan.

Thanks for that. I'm sure this is not an easy thing to work through, so I appreciate all the effort that you and the whole group, everybody, have gone through. So I've got a bit more of, I suppose, a more granular logistical question.
So this relates somewhat back to what Steve was asking in terms of retrospective data. Say I now need to go back three or four or five years and try to pull that data together to try to improve the metadata that goes with it. Or I have a grant, it's over, and I have unpublished data, and there's no repository for me to put that into, but you want it out there. So there's a cost involved in both of those: there's a cost in going back, and there's a cost potentially going forward in providing this information not through a repository. Have you thought through at all those costs, who's responsible for them, and how that's going to be handled?

To be clear, this policy is only prospective, right, so we're not expecting kind of retroactive compliance for completed awards. But as I mentioned, the timeliness built into the policy is the expectation that data will be shared no later than the time of publication or, if it is unpublished, by the end of the award, and that's in part in recognition of the fact that once the award is over, we are going to have difficulty supporting the cost of whatever data sharing happens. We are working to develop even more specific guidance, as I mentioned, on this point. But in general we are hoping that by including this up front and having this ability to update plans as you go along, you can front-load the cost, so that you pay in advance in some ways and then are able to absorb the cost of the data sharing on the award. That's one of the reasons why the end of the award is built in as the time point by which we expect data sharing to take place, so that neither investigators nor institutions are left in a situation where they're unable to do that.

So let me just follow up briefly.
The expectation is essentially that these data are available in perpetuity, but my grant is over, and unless I have some way of buying up front, you know, a no-end license somewhere to put that data... I mean, my institution may or may not do that. If I set up a website and I move, the link gets lost and all that kind of stuff. So I'm just a little concerned. This is five or 10 years from now, right, that this could potentially become an issue, but there needs to be some way. I mean, maybe NIH needs to provide an if-there's-no-other-place-to-put-it, put-it-there kind of thing. I don't know, but something to deal with that potential issue.

Yeah, you know, one of the changes that took place between our initial policy proposal and the final policy begins to get at that a little. The draft policy did sort of have this expectation for sharing in perpetuity. I mean, it said something like share for as long as it might be deemed useful by the scientific community, I think was the phrasing. That changed in the final policy in response to comments that, well, that's unrealistic, we can't budget for perpetuity, and there might be limitations based on repository guidelines. The final policy allows flexibility for investigators to propose reasonable timelines as part of their plans, some of which might be guided by, you know, if you're aiming for a particular repository, the repository might have an idea of how long it will exist, or might be based on scientific limitations on how long that data might be useful, you know, how much you can squeeze out of it. So there is now flexibility built into the policy that allows investigators to talk through this in their plans.
And in many ways there's no one right answer here, right? It's going to depend on the science, it's going to depend on the data, and it's going to depend on the availability of the repositories. And again, different institutes, centers, and programs may want to build more specificity into a timeline, but in terms of the overall policy, we shifted to try to provide maximum flexibility in that regard.

Thank you. Last call for questions for Kerry. Seeing none, Kerry, thank you very much for this presentation, and thank you for the last five years of your life working on this. It's a big challenge, but we're happy to have you come here and address the council.

Thanks so much for the opportunity.

Okay. Bye-bye. Council members, we're ahead of schedule; nothing wrong with that. We're going to take a 30-minute break and reconvene at 1:45 Eastern time. Please do not disconnect from the meeting. Silence your mics, turn off your cameras, get your lunch or your breakfast, and we'll see you at 1:45 Eastern when this resumes. Thank you.