My name is Lucia Hindorff. I'm a program director at NHGRI, and I'm joined today by several of my colleagues at NHGRI as well as Dr. Damali Martin from NCI. So, a couple of housekeeping notes before we get started. Please note that this webinar is being recorded so that we can post it on the NHGRI webpage for this consortium afterwards. Please keep yourself muted until it's time for questions, just to keep the noise level down for everybody. And then the way that we're going to run the webinar is to start with about 15 or 20 minutes of overview by NIH staff, followed by questions and answers. So, if you could hold your questions until after the introductory portion, that would be great. Any WebEx questions or issues can be directed to Catherine Solari at the email address you see here.
Okay, so let's go ahead and get started. I'll start by briefly explaining the rationale for this exciting consortium. So, as no doubt all of you are aware, polygenic risk scores are a burgeoning area of research in both epidemiology and clinical implementation. But the motivating factor for this consortium in particular is that there are increasing data that PRS prediction in non-European populations is poorer than it is in European populations. These are some data from Alicia Martin and colleagues on the UK Biobank showing prediction accuracy across 17 anthropometric and blood panel traits. What you can see here is that, relative to Europeans, that black dot, the prediction accuracy over all of these traits is lower in American and South Asian populations, about 0.6-fold of what it is in Europeans. It's about half in East Asian populations and about a quarter in African populations. So we can see here that, based on existing GWAS data, PRS prediction is vastly poorer in non-European populations. And what we would like to do in this consortium is really to accelerate scientific progress by bringing together data sets and expertise.
These contributions include developing methods to improve the applicability of PRS to diverse populations; integrating large-scale cohort data with existing data that are currently dispersed across different resources; increasing sample size to help improve the performance of PRS in diverse populations; and developing approaches that can be applied to multiple health and disease measures. This is a characteristic of many of our genomic medicine consortia, where if one approach is developed to study a particular disease, it can often be applied to multiple health or disease measures. So there is benefit and synergy in that regard. And then we're aware of a lot of other data sharing standards and efforts, including those in PRS, that are being developed in other areas and other programs around the world. And we expect that there will be synergy with those consortia and programs as well.
So then there are a couple of overarching goals for this PRS RFA. First, to leverage genetic diversity to develop methods and to improve the applicability of these PRS across diverse populations and for a broad range of health and disease measures. And then to optimize the integration of large-scale harmonized genotypic and phenotypic data to facilitate collaborative analysis, disseminate PRS-related data, and develop related resources. So we're really focused on convening sites, data sets, and expertise that can help us develop these collaborative analyses and disseminate the related resources that will help the broader community at large.
There are several common consortium objectives. First, to identify and integrate data for relevant cohorts. There are a lot of existing data right now with differing levels of accessibility. And a goal of this consortium will be to identify and bring together these cohorts that will be contributing to different disease studies based on PRS.
Of course, this will involve standardizing the genomic and phenotypic data and mapping the traits to existing ontologies for harmonization across the different sites. Developing and applying methods for generating and refining PRS for diverse populations. Establishing external collaborations for PRS validation and implementation research, including with those who are looking at clinical implementation of PRS, such as the upcoming eMERGE consortium. And then identifying secondary uses related to health and disease research. The idea here is that the infrastructure and the expertise that will be developed in this consortium could potentially be broadly useful for studying other outcomes besides PRS.
One of the concepts that we wrote into the RFA was diversity first, and I wanted to take a moment to explain our approach to this. We are aware that there are a lot of other PRS studies that are ongoing, of course, even in diverse populations. But as you all know, the most accessible data sets are likely to be those that have European ancestry participants. So for this consortium, we would like to emphasize the use of non-EA data until the maximum value has been extracted from them before exploring data from European ancestry participants. And this is true even if the EA data sets are much larger and more frequently utilized. So we really want to focus first on the non-European, genetically diverse data. And so as you're putting together your applications, if you are planning to include large numbers of European ancestry participants, we encourage you to describe the scientific purpose and pitfalls of using data from these potentially larger numbers of EA participants, to think about and justify the resulting biases, and, you know, to think about more than just including these populations for simple convenience or expediency.
The consortium comprises two different RFAs, one for study sites and one for the coordinating center, and I will start by describing the expectations for the study site applicants. So study sites will bring existing cohorts together to maximize sample size and genetic diversity for cross-consortium analysis. They'll collectively address the challenges related to differing availability of clinical data, data use limitations, and availability of summary statistics. They'll identify and harmonize health and disease measures for cross-consortium analysis. Come up with ways to integrate ancestry into the analysis. Identify metrics for improving PRS prediction. Refine PRS based on updated data that are accessible to the consortium. Participate in consensus approaches to developing and applying PRS. This is a really important point I want to focus on, because each of the site applicants will come in with the strengths of their research team and the research questions that they are interested in exploring. But for this consortium, we really want to take advantage of the synergy among the sites. So we're expecting a lot of effort to be put in by each site toward developing consortium approaches to analysis and dissemination. This would of course include contributions to cross-consortium working groups. And then the last point to emphasize is that we do want to form this consortium with the idea that we'll be conducting research, and there's a section of the RFA that includes examples of research that you can think about as you put together your application and think about how this research fits into the broader goal of the consortium.
For coordinating center applicants, we expect that they will provide overall logistical and scientific coordination for the consortium. They'll lead the data science aims of the consortium, which relate to proposing FAIR (findable, accessible, interoperable, and reusable) approaches to data integration and analysis.
They'll work with AnVIL and external standards groups. They'll lead cross-consortium genotype imputation efforts and cross-consortium outreach and dissemination efforts. And as needed, they'll provide and convene ELSI expertise. This is such a hot research area, and there are a lot of other related efforts going on. So we want to make sure that we have our finger on the pulse of where other avenues for PRS research and conversations are happening, which could definitely include ELSI expertise. There are two additional components to the coordinating center to help improve the data that will be accessible to the consortium. The first is to provide limited support for affiliate studies that may not necessarily be funded through the RFA but may be able to contribute data and expertise to the consortium. And then, for the first year only, at the beginning of the consortium, to provide some limited genotyping for populations or participants of unique scientific value who do not have genotyping data yet.
So, because I've gotten a few questions about what the consortium might look like, I thought it might be helpful to show this. This is a purely hypothetical cartoon of what a consortium like this could look like. It shows five hypothetical study sites, a coordinating center, and several affiliate studies. And the point that I want to emphasize here is that the study sites can really encompass a number of different models that meet the goals of the RFA. So it might be that one study site includes only one cohort that meets the goals, and others could include various numbers of cohorts as part of one application. So we're encouraging flexibility to meet the goals of the RFA. And I'm also showing how the coordinating center will be responsible for helping to invite and convene affiliate studies as well. I want to talk a little bit more about the cross-consortium focus of this RFA by giving you an example of how we see working groups coming together.
So working groups are envisioned as the focal point for trait-specific and cross-consortium PRS analysis. The study sites will contribute domain and analysis expertise. And the coordinating center will facilitate the research and the convening of the working group. So here's, for example, a hypothetical coronary heart disease working group, where the cohorts might belong to different study sites. These cohorts, along with affiliate studies, will come together to harmonize phenotypes and do PRS methods development and analysis. So this is sort of separate from each study site coming to a working group. We're looking at the cohorts that would actually be contributing data to the analysis of that working group's trait.
As a consortium, there are several deliverables that are expected. The first is project data sets with harmonized data. These include summary statistics and metadata describing the cohorts. We encourage individual-level data where possible. But if you look at the RFA, there's a whole section on data sharing that describes some of the nuances and some of the characteristics of the data sets that might be valuable to this consortium but may not necessarily be able to share individual-level data. We're expecting consensus PRS models that are developed by the consortium to be shared, including the SNPs, weights, and covariates that are necessary to interpret and apply the models. Tools and resources that are developed by PRS investigators. Policies and standards to enable data sharing, including ELSI. And then finally, any data and approaches facilitating validation in clinical settings. There are a number of other consortia, including eMERGE, and other efforts that are likely to find results and resources from this consortium to be very valuable in clinical implementation of PRS. I'm going to stop here and turn it over to Dr. Ken Wiley, who will talk about the AnVIL resource that's described in the RFA.
Thank you, Lucia.
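To make the "SNPs, weights, and covariates" deliverable concrete, here is a minimal sketch of how a shared PRS model might be applied to one participant, assuming the common additive form in which the score is a weighted sum of effect-allele dosages. The function name, SNP identifiers, weights, and dosages are all hypothetical and invented for illustration; they do not come from the RFA or from any consortium model, and covariate adjustment is left out.

```python
from typing import Dict


def polygenic_score(weights: Dict[str, float], dosages: Dict[str, float]) -> float:
    """Return the weighted sum of effect-allele dosages over the SNPs in a model.

    Assumes an additive model: score = sum over SNPs of (weight * dosage).
    """
    missing = [snp for snp in weights if snp not in dosages]
    if missing:
        # Fail loudly rather than silently scoring a partial model.
        raise KeyError(f"dosage missing for SNPs: {missing}")
    return sum(weight * dosages[snp] for snp, weight in weights.items())


# Hypothetical model: identifiers and weights are made up for this example.
example_weights = {"snp_a": 0.5, "snp_b": -0.25, "snp_c": 1.0}
# Effect-allele dosages (0, 1, or 2 copies) for one hypothetical participant.
example_dosages = {"snp_a": 2, "snp_b": 1, "snp_c": 0}

example_score = polygenic_score(example_weights, example_dosages)
print(example_score)
```

Sharing the SNP list and weights together is what lets another group recompute the same score on its own cohort. Any covariates used to adjust or interpret the score would need to travel with the model as well.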
So I want to take a moment to just go over the AnVIL. This is NHGRI's funded cloud-based resource that this group will actually utilize. The AnVIL is a cloud-based infrastructure and software platform. In this case, it's built on top of the Google Cloud Platform. It's built to provide a shared analysis and computing environment that investigators and users can have for their own work, and that collaborations and consortia can use to work with other groups. The AnVIL will also provide data access and data security. This is at the same level as those provided by dbGaP. As a matter of fact, the AnVIL is considered an NIH-designated repository. In addition, it will house genomic data sets, phenotypes, and their related metadata. Because this is a cloud-based resource, built off of GCP, the Google Cloud, there are costs associated with using the AnVIL. Those costs include egress charges, compute, and storage. However, the AnVIL is also working with other groups at NIH, such as the Science and Technology Research Infrastructure for Discovery, Experimentation, and Sustainability initiative, also known as the STRIDES initiative, which is being managed at CIT, to find ways to manage some of those costs. In addition, the AnVIL is not supposed to be focused on just power users. User training and outreach is actually critical, because we want this resource to be used by the broader clinical and basic genomic science community. The AnVIL is not designed to be developed as a silo. The AnVIL participates in the federated genomic data commons ecosystem, which is basically a set of efforts initiated by different NIH-funded cloud-based resources to make our resources more interoperable with each other. In addition, the AnVIL is focused on incorporating new scientific and technological advances as the community requires them. If you go to the next slide, please.
So I've talked about this shared workspace and shared environment. I wanted to go into a little more depth about what I mean by that. So in this case, the work environments, which we call workspaces, are being provided by Terra. In Terra, you will be able to have a workspace to do your own analysis, bring other group members that you would like into your workspace to do shared analysis, as well as use workspaces at the consortium level to work with other consortia. So the workspaces will provide faceted search and let you leverage established pipelines and workflows, as well as do exploratory analysis and develop the exploratory workflows that you would like for your analysis. Instead of trying to piece multiple development utilities together, the workspace will also provide an integrated development environment. This will allow for improved programmer productivity by providing software for authoring, modifying, compiling, deploying, and debugging software. And because this is Terra, Terra already comes with... You may have to click to the next one, the next slide. You can just click through all of them. Yeah, thank you. Because we're using Terra, Terra will come with Jupyter notebooks and WDL for use. But because NHGRI has funded this resource, we wanted to add tools that are more common for the genomics community to use, such as Bioconductor, with plans to bring the UCSC Genome Browser, Galaxy, and RStudio on board this year. In addition, we have an AnVIL-based Dockstore that will be for sharing containerized tools and workflows. Next slide. You can learn more about the AnVIL by going to our portal page. The address is located at the bottom right, anvilproject.org. There you'll have information about the different aspects of the AnVIL if you just click all the way through.
You'll learn about what the AnVIL is, the data sets that are available, and the tools and resources that are available, as well as training, news and events, and points of contact. The portal page will also explain how to submit data sets to the AnVIL, and it provides a link to the AnVIL-branded Terra page, where you can start your accounts and start developing using your workspaces. And that's all I wanted to share about the AnVIL.
Thank you, Ken. And next we have Dr. Damali Martin sharing a little bit about NCI and their involvement with this RFA. Damali.
Hi, good morning everyone. First of all, I'd like to thank our colleagues at NHGRI for inviting us to be part of this initiative. And so, as many of you know, the mission of the NCI is to lead, conduct, and support cancer research across the nation to advance scientific knowledge and to help all people live longer, healthier lives. And we do this in a number of different ways, including funding cutting-edge research on cancer causes, treatment, and prevention, training the next generation of researchers, and supporting research such as the projects that will be funded under this initiative. NCI is fully committed to addressing cancer disparities, and the lack of representation of minority populations in genomic studies underscores the urgency of ensuring appropriate representation in large genomic initiatives such as this one. Next slide. And so we're really happy and excited to be collaborating with NHGRI on this initiative, because this RFA is consistent with NCI priorities to address cancer disparities among minority populations. We believe that it has the potential to address what we see as a major translational gap in genomic medicine and in the future utilization of genetic information in the prevention and treatment of cancer.
It also aligns really well with our priorities under the NCI Moonshot Initiative, which many of you may be familiar with. The Moonshot Initiative aims to accelerate the pace of discoveries that enable better therapies while improving our ability to prevent cancer and detect it at an early stage. And so under this RFA, the NCI plans to fund one of the PRS centers, and while the RFA is not disease-focused, we will only fund a center that's focused on cancer. And I'll stop there and turn it back over to Lucia.
Great. Thanks so much, Damali. So we're getting ready to wrap up the introductory portion here. I did want to point out some other RFA sections of note. I've covered most of the high points of the RFA, but I would like people to particularly take a look at the data sharing in this initiative section to learn more about the expectations for data sharing within the consortium and external to the consortium. It's always worth exploring some of the more formal sections of the RFA and what we consider to be instructions to the applicants. A lot of this is described in the program formation and governance section, which describes how the consortium components fit together. The PHS 398 research plan has specific instructions for applicants who are applying to each of the RFAs. And then I always like to encourage people to take a look at the application review information. This is what peer reviewers will be provided as criteria for review of your application. And then there is a little bit more about the review and selection process from the standpoint of the funding agency. So we have some criteria in there that describe programmatic priorities for NIH.
Okay, so I'll conclude there, and then we're going to try to open this up for questions in an orderly way. Thank you to everyone who's joined us. We have quite a few people on. Let's start by letting people unmute themselves or type in the chat box, and we'll try to take your questions one by one.
So the floor is open. Catherine, you might have to help me. I'm not sure if anyone's raising their hands. Okay, let's see. I see a question in the chat.
So, should the proposed phenotypes be available across all the affiliate member cohorts in a study site? I'm going to change the wording of this slightly. We have a specific reference to affiliate members in the RFA that has to do with the coordinating center recruiting affiliate members. So I'm going to address this as if the cohorts in this question apply to a single study site application. The goal is basically to maximize the phenotypes that are available across the entire consortium, but of course, as one applicant puts together their application, it's impossible to know who else is applying. So within your application, I would suggest that you maximize the amount of phenotype information that's available across the cohorts, meaning across the study site, across multiple cohorts. This may mean that not every cohort has every phenotype, but you should be very clear in your application about the value across the cohorts and provide sample sizes as appropriate for the different traits, if not everybody has every phenotype.
The second question was, should all the affiliate member cohorts have genotype data at the time the application is submitted? The way that I've addressed this is that when you put together your application, you should be very clear about what your application will be able to contribute in terms of genotype data and phenotype data. Of course, sites that have data sets where the genotypes are already completed will be perceived as being more ready to be analyzed. We do allow for some flexibility if there is going to be genotyping available during the course of the project period, but you would need to describe that in your application, and you would need to describe the readiness of those genotype data to be available.
If the genotype data are potentially available but don't have funding yet, for example, then I think you need to be clear about that as well. So the basic goal here is to provide enough information about the data sets that could contribute to your application that reviewers and NIH can evaluate what that application would actually bring.
Will all the in-person consortium-wide meetings be in the U.S.? This has implications for the budget. Yes, I would make that assumption.
Please confirm that the total budget of $1 million includes the indirect costs. I believe this is for the study site applications, and yes, as described in the RFA, the $1 million is a total cost estimate, which includes indirect costs.
Okay. I see a couple of other chat questions. Let me see if there's any quick follow-up about what I just said. Was that clear? Question. Oh. Yes. The explanations were very clear. Thank you. Okay. Thanks.
Do we need to budget for AnVIL support? The short answer is yes. There is a section in the RFA that describes how data sharing will be conducted and how the data integration will be done on the AnVIL. I might turn it over to Ken to describe a little bit of the nuance about how you might think about putting together those budgets.
No problem. Yes. So as I mentioned before, there are costs associated with the AnVIL, which include egress, storage, and compute. Those are based on market prices. And since it's built on Google, if you go to Google's website, you can get a sense of what those costs are. There are opportunities for us, as I mentioned before, to try to provide some resources to help address that, but these are still in development. As I mentioned with the STRIDES effort, we're beta testing STRIDES in the AnVIL now, and we're still working on that pilot. And so my recommendation is that you budget for the market prices for using this resource.
When we start making transitions to moving data to make them available to the public, then there may be opportunities to have costs offset for that, because we're working with the AnVIL team to help facilitate ways for data that are made available to the public through open or controlled access to have those costs offset. But aside from that, those costs would be applied to the user. So you should budget accordingly for those costs.
Okay. Thank you, Ken. Let's see. I'm going to keep going down the chat. Does the coordinating center have to develop the same set of activities as a study site, or will it have only coordination activities as described in the slides? So the slides are a really high-level overview and very brief. I would encourage CC applicants to read the RFA, but I will answer this question by saying that the overall consortium goals are the goals for both the coordinating center and the study sites. Even though I've highlighted specific areas where we're expecting the coordinating center to take the lead within the consortium, they are full members of the consortium. And so we're relying a lot on the coordinating center for scientific leadership as well as facilitating and coordinating. So I would encourage applicants to read the RFA and, again, some of the instructions about the specific scientific areas that we'd like applicants to address there.
Okay, what's the difference between affiliate study and cohort in the diagram on page 9 of your presentation? Let me just go back there, and we can take a look. Okay, so it's this one. So cohorts are depicted here as data sets that are submitted as part of a study site application. Someone who sends in a grant application for a study site may convene, in this diagram, anywhere between 1 and 6 different cohorts that they're going to bring to their application.
So those applications are funded, but then of course there will be other data sets that aren't funded as part of the consortium study sites but may be valuable to the consortium. So the coordinating center will be tasked with inviting these affiliate studies, which aren't funded as part of the study sites but might provide data sets and expertise. And if you read the coordinating center RFA, there will be some funding provided to the coordinating center to pay for limited personnel for these affiliate studies to participate in the consortium. Okay, I hope that was clear.
Can we do computing on our own local cluster? Ken, I think that's one for you.
So that's actually a very good question. We are encouraging groups to use the AnVIL, but we understand that, due to geographical and other situations, you may have to do some computing on your local cluster. So we understand that, but we are encouraging people to use the AnVIL for this effort. It's not an exclusive one.
Thanks, Ken. Can you take the next one too? The RFA says that the storage costs will come from the CC budget and is ambiguous about computation.
Yes, so the RFA says that the storage costs will come from the coordinating center budget. Okay, so that's a good question. This is something we'll have to work with the coordinating center on, because there are data sets that you're going to be uploading that may be covered by the coordinating center, but separate or ancillary data sets that you may be bringing will be expected to be covered by the site. So we'll have to work with the coordinating center on that, but again, I would have you budget accordingly for the storage. And then we can work with you if those costs need to be modified.
Yes, there's a second part to that question. Can you confirm that study sites should budget for egress and computing? Wait, I'm sorry, I'm kind of lost. Sorry.
Yes, sorry. I saw another one that popped up and I kind of got lost. Yes. Okay, good. So the question is, if the RFA says storage costs will come from the CC and is ambiguous about computation, can you confirm that study sites should budget for egress and computing? The short answer to that question is yes.
Great. How are you planning to evaluate an application if two study sites contribute overlapping cohorts, perhaps with different phenotypes? So, this is a question I've been getting, and the simple answer is that we are starting at the level of encouraging. If people belong to cohorts that are actively analyzing data, and they are active members of those cohorts, we're encouraging cohort leadership to come together and help prioritize the applications that will be submitted to the RFA. That's not always easy to do, because some applications may be including multiple cohorts. So the way this is going to work is that, of course, each application will be peer reviewed, and each application will be reviewed on its own merits. But at the time that NIH is ready to make funding decisions, we'll obviously take a look at potential overlap among different applications that scored well. And we'll have to work with the PIs of those applications to minimize the overlap among the cohorts. So I think we recognize that there are cohorts that are likely to be proposed across multiple applications. But NIH, of course, will not be funding duplicative efforts. So that will need to be evaluated after review, to the extent that cohorts are not able to address this at the time of application. Yeah, I think that's how I would answer that. Was that clear? Are there other questions related to overlapping cohorts? We've had quite a few people send that question to me over email. Okay, if not, I'm going to move on.
Are you expecting diversity within a study site, or can a study site represent a single non-European group of cohorts? So, I might need a little bit of clarification. Diversity within a study site as reflective of multiple non-European groups? Sorry, go ahead. Were you expecting to have European and non-European groups within a single study site, or can they all be non-European? They can all be non-European. So, let me put up a slide. Let's see. This is a visual summary of some of the language we had in the RFA. We had some strongly encouraged criteria, which are at least one non-European group with at least 10,000 participants, or, if you have at least 20,000 participants, 50% or more of them coming from non-European ancestry groups. So you can include only non-Europeans, or you can include Europeans, but we ask that people meet the criterion of having at least 50% of participants come from non-European ancestry groups. And then you can see some additional higher priority criteria here. So we're encouraging everybody to at least meet the strongly encouraged criteria if you're going to apply. And then please take a look at these high priority criteria, as they will be factored into funding decisions as well.
Is there a limit to the phenotype table in the appendix? One table, or multiple tables merged across? I'm not sure if you're asking about a page limit. I don't believe that there is a limit to the phenotype tables in the appendix, as long as you're responding to information that the RFA has requested. As far as how you present the table, whether it's one table or multiple tables merged across the cohorts, I would encourage applicants to think carefully about how reviewers might best be able to get a sense of the value that you're bringing to the table.
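The strongly encouraged criteria described above can be written as a simple check. This is only a sketch of one reading of the spoken summary: an application qualifies if it has at least one non-European group of 10,000 or more participants, or at least 20,000 participants overall with 50% or more from non-European ancestry groups. The function name, the ancestry labels, and the participant counts are hypothetical, and the RFA text remains the authoritative statement of the criteria.

```python
from typing import Dict


def meets_strongly_encouraged(counts: Dict[str, int],
                              european_label: str = "European") -> bool:
    """Check participant counts by ancestry group against the two spoken criteria.

    Assumed reading: (a) any single non-European group has >= 10,000 participants,
    or (b) total >= 20,000 with at least half from non-European groups.
    """
    non_european = {g: n for g, n in counts.items() if g != european_label}
    total = sum(counts.values())
    non_european_total = sum(non_european.values())

    one_large_group = any(n >= 10_000 for n in non_european.values())
    # Integer comparison avoids floating-point issues at exactly 50%.
    half_non_european = total >= 20_000 and 2 * non_european_total >= total
    return one_large_group or half_non_european


# Hypothetical applications; the group labels and counts are invented.
site_a = {"European": 9_000, "African": 6_000, "East Asian": 6_000}
site_b = {"European": 15_000, "South Asian": 8_000}

print(meets_strongly_encouraged(site_a))
print(meets_strongly_encouraged(site_b))
```

In this sketch, the first hypothetical site qualifies through the 50% route even though no single group reaches 10,000. The second does not qualify under either route.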
So, yes, you probably could put together multiple tables if you're, for example, putting together five cohorts, but think about how easy it's going to be for reviewers to get a sense of the value of your data. If there's anything that you can do to make it easier for reviewers or for NIH staff to understand which participants and which phenotypes you're bringing to the table, I would encourage you to think carefully about how to present those data.
Are there funds for new genotyping? So, I did mention in the coordinating center priorities that we would be providing some limited funds for high value cohorts. The way that will work is that after the consortium is funded, those high value cohorts or participants will be identified. So if you're a study site applicant and you would like to pursue genotyping, that is not something you should include as part of your study site application. Don't request budgets for genotyping, because those won't be considered for the study site applications. Some people have asked, if we have high value cohorts but they're not genotyped yet, is it okay to mention them in the application? I would say they shouldn't count towards meeting the goals of the RFA, the common collaborative goals that I mentioned before, because they're not yet available. But if you have ideas for examples of high value cohorts that the coordinating center could be considering for the limited genotyping, I think that's fine to mention.
Okay. Damali, it looks like this one's for you. For the NCI-funded study site, would this study site be expected to only cover cancers, or would cancers be one of several different diseases?
That's a good question. We would prefer grants to only cover cancers, but if a really good grant application comes in that is looking at cancers and other diseases, we will also consider it.
Okay, thanks.
And yeah, I think for a lot of the cancer cohorts, even if their primary focus is cancer, they may have other phenotypes that could be used for cross-consortium analysis as well. So I would encourage people to obviously emphasize your cancer strengths, but also mention other phenotypes that you might have available. Yeah, absolutely. Okay, great.
Suggestions for how the Coordinating Center should budget for storage and collaborative analysis on AnVIL when sample sizes and data volumes are unknown? Is there a floor or ceiling on data size that should be used for budgeting purposes? Ken, I'll start and then I'm going to ask you to chime in. As far as the number of sites, you'll know from the RFA that we have a range of sites that will potentially be funded. And then the study site criteria, which you see here, are definitely sort of a floor for what we hope people will come in with. But I agree, we don't necessarily know what the ceiling will be. So, Ken, I might ask you to chime in to see if you have any thoughts for how people should budget with this uncertainty in mind. I mean, you pretty much covered it. The idea is that you start with what was in the RFA as your foundation and then budget accordingly from that, with the understanding that as we get more and more into the program, we'll get a better sense of what we're looking for. We can work with you to make adjustments as need be, and when I say we, NIH would work with you to make adjustments as need be if the costs are varying from what was originally proposed. But for now, I agree. Start with your initial budget for what you see in the RFA and go from there.
Yes, and NIH is going to have to do some work as well in preparing to fund this consortium, to take a look at not just the number of study sites, but the number of cohorts and the number of places where these data reside, because I think that could have an impact on the budgeting as well, and on how many touch points to AnVIL we need. So hopefully that was helpful, even though we don't have a firm number that you should be using. Those are some first principles for you.
In terms of data sharing, if some studies cannot share individual-level data, could they still be part of study sites by running U01 analyses in-house? So every application is probably going to address this a little bit differently. For people who are only proposing to use individual-level data that can be shared with everybody, I think that's a much more straightforward model. But if you read the RFA, we do allow for some flexibility. For example, if some data sets aren't able to contribute individual-level data, then describe the way that the summary statistics can be used. So when you're thinking about limitations of the data sets that you're thinking about using, the key thing is to tie them to the goals of the RFA and be very clear about how those data will be used. I think you could still propose to include data sets where summary statistics could be shared, but the onus is more on the applicant to describe how those data sets will be useful to the consortium and to propose ways that they could be shared with the consortium, if not through individual-level data.
Cohorts may have hundreds of phenotypes available that could be used to evaluate new PRS methods. Should the study site proposal select a specific number of phenotypes to evaluate as part of the project? Oh, this is hard. Yeah, hundreds of phenotypes is a lot.
I mean, I think it's going to be a judgment call in terms of how you describe the value of your application. So, I'm guessing hundreds of phenotypes could be like ICD-9 codes, for example. An applicant is going to have to describe to a reviewer and to NIH what the primary strengths of their cohorts are. So I would suggest, again, going back to the goals of the RFA and then really highlighting the primary value of the cohorts, but also describing the potential breadth, because I think we will be looking for some flexibility once the cohorts are convened to include phenotypes that maybe a site didn't think about making a primary phenotype but that would still be available for cross-consortium analyses. And I think that will probably complicate that appendix table that we spoke about earlier. I don't know that including hundreds of phenotype tables would be helpful. But again, I think you need to use your best judgment as to what phenotypes, what traits, and what participants are going to contribute the most to the overall goals of the RFA, and then tailor your application that way. And if you have specific questions about your dataset, feel free to follow up with me over email.
Okay, next question: can one propose to use public data only, e.g., dbGaP? So I have gotten this question about what datasets are appropriate to propose, and whether we can propose datasets that are already available. I'm going to refer back to the study site criteria, and I'm assuming this is for the study site. So, think about how you're going to meet these criteria. It could be a combination of public data as well as other data that aren't publicly available, but consider that if you're only going to be using public data, public data are available to the scientific community at large.
So the onus is going to be on you as an applicant to describe how you're going to use those data to meet the goals of the RFA and what specific strengths your research team will bring to using those public data. So, short answer, yes, but I think there needs to be a little bit more thought and more work done in the application to explain how those public data will be used to meet the goals of the RFA. And then also, if you're planning to use public data, I think there is still the onus to describe in the application the ability of those public data sets to meet the data sharing section of the RFA. So, you need to have some familiarity with how those data could be used within the consortium as well. We're not just looking for lists of data that could be used from dbGaP, for example.
Okay, we're at the end of the chat. Let me see if anyone wants to chime in with a question. It looks like we got one more. Okay, thanks. Please clarify how many affiliate members the CC should plan to support for travel to in-person meetings. The RFA suggests that the CC supports up to 10 affiliate members traveling to three meetings a year, i.e., 30 trips annually to support. I'm sorry, I'm not sure I understand this question. Are you saying that the RFA language about supporting up to 10 affiliate members for three meetings a year isn't clear? This is Terry. I guess I just wanted to confirm that that interpretation was correct. Yeah, I'd have to go back to the RFA, actually. Did it say 10 affiliate members traveling to three meetings a year, or traveling to annual meetings, or something like that? I think whatever we put in the RFA is the guidance to follow, but I don't have that number off the top of my head, so maybe we can follow up on that if it's not clear. My apologies. Thanks so much. Any other questions? Any comments from NIH staff? Let me go to the final slide here.
Any other questions or comments from NIH staff about the RFA? No, nothing from me. Thank you. Okay, thank you. So, I will just leave you with this final slide here. If you have additional questions, feel free to contact me for NHGRI and Damali for NCI. Let me mention one thing, which is to keep an eye out for any additional notices related to this RFA. This webinar, of course, you all found from the notice, hopefully, so thank you all for joining. I also want to let people know, since I've gotten some questions about the due date for the RFA, that it's currently June 23rd. If that due date gets extended, you will see it posted in a notice in the NIH Guide, so you might want to keep an eye on that. There's no official word at this time, though. If there are no more questions, we will go ahead and thank everyone for joining this applicant webinar. We thank everyone for their interest. We're really excited about this consortium and the applications that we'll be getting. So thanks, everyone. Have a good day and stay well. Bye. Thank you. Bye. Thank you. Thank you.