All right, and it's funny that Julie would raise that, because that is one of the things to address. I can assure you we're on the home stretch. I have eight slides to show you, so it shouldn't be that painful. Hopefully not.

I wanted to start with the overall goal of the workshop. The reason I'm up here, as opposed to sitting back there, is that I can actually edit from here, and I can't do that there. So please give us some revisions. What we have is about three slides' worth of guidance that has come up as we've discussed both last evening and today. Ignore the numbers that are shown here. I don't have a pointer, but the 6 and the 15 are just for me, so that I know what part of the discussion each came from.

We had two somewhat conflicting views. One was that we may need finer divisions of population-specific reference populations; I think that was particularly in the discussion about imputation from exome chip data to sequencing data, etc. Versus the comment that came up a little later in the day, that if we do enough people, we'll basically get enough reference so that everybody's genome will be covered. Do we have a consensus on this point? Are both things true, and they're just for different purposes? Or comments on that? So this was mainly, I think, Maynard and Lynn, perhaps a little bit with Peter, in the discussion. In terms of imputation, I think, it was the population-specific reference samples. Would anyone argue we don't need more?

My takeaway from that discussion is we need a very large sample. We need a very large sample, but not really weird people. We don't need... no. I think it would be a very big mistake to basically sequence Rhode Island, and only sequence Rhode Island.

Yeah, I'm sorry. I should... Rhode Island.

I think we need to have a very large sample; then within that we can do either nested St.
resampling or adjustment. But I think we run the risk, if we try to design whom we're actually going to sample, that we're going to basically exclude certain ethnic groups because we want this pure, homogeneous population. I think it was Maynard that suggested, I forgot the word, you know, we're mongrels, and we need to embrace that history and integrate it into the science and not run from it.

Steven? Use your microphone.

This is a very important point, but I think it's going to be driven by the two critical decisions that take place before that: what disease, and what cohorts or studies you're using to do this. As you're looking at the sample sizes, how good the phenotyping is, and the availability of things, this would be, in my mind, a second-tier assessment of, yes, we need to be sure that we have addressed this in one way or another as we try to assemble as large a sample size as possible. But I'm not sure the population sampling would be the first thing that I would think about, because a lot of this is going to come down to what's really available and what we actually want to study: something in heart disease, or something in cancer, or something in diabetes. Looking at that, what studies are going to have those things is going to inform us as to how we need to address these questions.

Peter?

I just want to remake Rory's point, which is: given that we all think diversity is important, you can do it two ways. You can either do it by having a large diverse sample, or you can have several different samples, each of which is homogeneous, but you cover the diversity across the samples.

So we'll ask for both. How's that? And so we're needed. I think it's probably, I think, what people are saying.

What about all?

What do you mean, all?

Well, instead of a large diverse population and several...

How would you... you could do one or the other, or both?
Well, I guess I was hearing that we really wanted a large diverse population. We really wanted large. I guess that's not good, "large," but we didn't want Rhode Island. So we don't want Wales. We don't want Kenya. We want diverse, or...

Right, exactly.

So we want both, Peter? I guess I'm lost, then, when you said one or the other. I guess it is late in the day. Okay.

Small...

All right, but a terrier. Yes?

I also think that even with the sequence data that we have, we could do some very interesting analyses and explorations to look at the effects of population diversity.

Oh, could you use the microphone, or lean into it a little bit?

I thought I was. No, I think that even with the sequence data that we have, existing and that will be coming in, well, this year, we can do some interesting analyses that will inform us a little bit better about the effects of population differences on the distribution of rare variants and their relationship even to phenotypes. And I think that would help to guide a strategy, once we have better knowledge of that genetic architecture.

So it's probably fair enough to say this is one where the jury's still out? Is it that the Supreme Court has yet to rule?

Yeah, in my opinion, we don't know enough yet. Let the data speak for itself, as opposed to judicial...

Yes. And I would point out these are not prioritized; these are in temporal sequence. And this again is sort of guidance.
This isn't how we're going to select. But, you know, one of the reasons we're here is to give NIH institutes ideas about what they should be doing in sequencing. So another that Chris had suggested was combining exome chip data on rare variants to improve assumptions for power calculations and other things. Any disagreement that that is a useful thing to do, and not an easy thing to do?

No, it's a useful thing, but I think that's a secondary analysis to the sequencing. I wouldn't let that tail wag the dog of what the decisions are for sequencing. It's going to be a natural outgrowth of that.

I fear I may have started in the wrong place. So let me start instead with the science, because those really were the secondary things. Those were the guidance, the advice to NIH on that, but that had been our primary goal. So let's go to the scientific questions that Eric displayed previously, at the beginning of the day. What I've shown here are his top, highest-bullet-level questions; then he had several sub-questions. What's shown in white are his top-level questions, and the yellow things are things that came up during the discussion, so in addition to what he had added.

So we were hearing, in the genetic architecture, that we would like to learn about the spectrum of phenotypes with a given mutation. This was particularly evident in some of the Mendelian discussion that Mike and others gave us, but others as well. Potential specificity of treatments with specific mutations, again, an example in Marfan's or in CF. And identifying all the mutations in patients with a classic genotype: so they have the delta F508, but you want to know what else may be there and may actually be underlying. Are these somewhat along the lines of what was discussed? You have a furrowed brow, Steven.

I think... do we also have the point that Maynard had made last night and again today about looking
at phenotypes a bit differently on the basis of the genotypes, the discovery of the people that don't necessarily meet particular criteria of disease? Is that what you mean by spectrum of phenotypes with a given mutation?

Yes.

Okay, just want to be sure.

Yes, June?

So, since I was responsible for this topic: regarding your second bullet point, I think what we talked about, and emphasized more last night, was that this decade was devoted to the biology of discovery. There's maybe a de-emphasis today of the omics, but recognizing that's going to be an enormous challenge, and that it's going to require modernizing the discovery process, it's going to have to involve a new kind of discovery teams, research teams, and a realignment of incentives.

Okay. So that would be in the second bullet here: needs new discovery models, perhaps. Oh, under the... not one. Okay, sorry. Okay, new discovery models, would you say, or teams? Yeah, okay. Okay, great.

Didn't do much with pharmacogenetics. What I mean was that the things that came up in the discussion were, I think, pretty much covered by what Eric had previously said. Yes, Rory?

Just a comment on power calculations. I know power calculations can be done, and of course they were done to justify all the GWAS studies that were done, and then they were found to be wanting. Really, they're done to justify something based on assumptions that you don't know are valid, and they're really driven by how much you can afford rather than by any real science. That's what I was trying to get at with this point: the next few years should not be based on the expectation of getting results about variants for disease, but on learning how to use these data, with the expectation that in 10 to 15 years, when you don't have to determine the study size based on economics but can just do as many as you've got, then, uh,
then you know how to analyze the data. I'm not at all sure that power calculations have much real value.

Well, they're entirely dependent on the underlying assumptions, which have been shown repeatedly over the last five years to be completely unrealistic. Felicia? Trisha?

I think, actually, part of the reason we were talking about case-cohort is a little bit of splitting the difference. Maybe this is a case where I shouldn't argue with myself, but rather say that in a perfect world, I think we would do exactly what Rory said. We would do a cohort and time would pass. We would do some work cross-sectionally, and in a few years events would occur. But I think that, for enthusiasm and political support, we couldn't do just that. So let's just say, without knowing power, we would still probably put half of our investment in sequencing a cohort, an interesting cohort. But we'd take the other half and try to pick some diseases that would yield enthusiasm in the meantime, and for those we would have to do power calculations, and they would probably be wrong. But some of what you're hearing, of "let's do some disease-specific work and let's do some general work," I think we all recognize: it's the NIH, and we have disease-specific institutes, and people also want to see, couldn't this work for a disease? So I think you're getting a little tension between "let's sequence a bunch of people," which is what we should do, and "in addition, let's go after a couple of promising diseases."

Does that help a little bit with, I think, the spectrum of phenotypes point there?

Mm-hmm.

On the power calculations, my point was just: if we have data that's relatively available, or will be available in the next six to 12 months, why not use it?
Because, I mean, Rory, in the clinical trials, many of those trials that you led were based on power calculations based on real trials that had been done previously. And they weren't always correct, but at least they were based on data as opposed to no data. So that was the point. It may not be perfect data, but at least it's some data.

Yeah, I think this just brings us back to the point of thinking about pilots, plural. We're going to look at different things. I would be really worried if we put all our eggs in one particular basket here, because I think there are enough competing and very important scientific hypotheses and ideas here, and the question is how we come up with a suitable hybrid of pilots. So, having the disease-specific issues and the modifiers and the EMRs and cohorts, I hope that we're looking at some kind of combination here, a mongrel set of pilots, as opposed to just attempting one particular hypothesis.

So I think we had Peter and then Nancy. Or Peter, are you not... you have your microphone on. Sorry. Nancy?

So the other thing that I would bring up, that we haven't gotten back to, relates to something that Paul said and things that Peter mentioned in his talk. That is, it seems that we also want, in the pilots, to set ourselves some goals that we know we can achieve; that is, either by way of choosing phenotypes or choosing things that we're going to look at, we will get an answer.
So whether it's, you know, we will learn how dynamic the genome is, because we will have samples that were collected from some cohorts over time, and we will see how dynamic the genome is; or if it's epigenomics; whatever it is that we know we will get answers about, so that there won't be people dissatisfied at the end that we didn't discover anything new. We will formulate hypotheses or goals that can be met in some of the early pilots, so that there's good consensus about going forward. I think that might be important. There really is this pushback around genetics and genomics. I mean, if one of the goals was "we are going to solve the genetic architecture of," whatever it is, diabetes, obesity, that's probably not a goal for the pilots.

So one of the things we didn't talk about, and we probably can't solve this afternoon, is what the pilots would be. But it's probably something that we do want to address for the summation that comes out of this. So I'd ask people to be thinking about that, in terms of what we might want to do or propose in terms of pilot studies. I think the loud and clear message that we've gotten is not just a single pilot, but pilots that address multiple questions. Fair enough? Okay, all right.

So we didn't really fiddle much with pharmacogenetics. We did hear a little bit in terms of longitudinal change, in terms of being sure that we include both disease progression and lack of response to proven therapies. I think Trisha came up with some good examples of that. And an interesting question, and one that we might want to address, is how would we tailor, and remember, these are the questions, these are not the answers, how would we tailor the age that is most appropriate for cancer screening? A very useful question.
And one that could be addressed in longitudinal data.

One quick thing there, please: in your progression there, one of the things would be multiple samplings. I think longitudinal change means not just on the phenotype. It would be very, very useful to be able to look longitudinally at the genome itself, since we know about mosaicism and epigenomic changes and the like. That's at least something to consider, that would be embedded in that, or the somatic genome, for sure.

Yep, okay. Then health disparities, epigenetics, annotation of the genome: pretty much everybody seemed to agree those were useful things. Surveying a large data resource: Dando came up with a couple of nifty questions that I think are good examples of how we might use a resource. What are the missense variants in a gene of interest to a particular biologist? What are the phenotypes that are associated with those variants, for a particular clinician? What are the variants in my patient's genome that are associated with disease? Etc. So those are useful example questions that one could use with a large resource. There are many, many others.

We also asked the question, in what instances are genetics powerful for predicting undiagnosed diseases? I have "in what instances" in parentheses because I think the person who asked this question actually said "are genetics powerful," and I hate yes/no questions. We would never assume the answer to this could possibly be no. So in effect we would say, in what instances, or what would make them more powerful?

Okay, so I don't think that we've exhausted the scientific questions. I don't know that we need to exhaust them. But when we get to this issue of crosswalking the criteria with the questions, we do need to identify those questions that need specific cohorts or specific kinds of sample collections that we haven't really been describing here.
So, yeah, please.

Let me ask Maynard a question. I believe it was the Abramson report. Did the pilots tend to be technical feasibility, or were the pilots baby steps on these bigger, broader questions?

I think it's a mix of both of what you said. They were intended to stress but not break the technology available going into them, and that kind of set their scale. Where they really differed from the way the NIH usually does business, and I think the same tension is here now, is that the choice of them was more technology- and infrastructure-driven than scientific-objective-driven. We were mostly talking about model organism sequencing, and there was an initial sort of hierarchy of genomes: E. coli, yeast, worms, and human chromosomes. It didn't play out exactly in the planned order, but it actually came fairly close. But the criteria were very much, I think, in the sense that Rory was saying: are we going to really know more about how to do this hard thing after we've done this pilot project? As opposed to, are we going to have learned about worm development, or learned about this, or learned about that?

One thing I just would say; I'll say two things. Listening to this, it's probably not what one wants to hear at the end of an exhausting workshop exercise, but as essentially an outsider to this group, I don't think there is actually strong consensus here...

I think you're right.

...trying to accomplish.
Yes, and that if we try to fit the diverse views around the table into one kind of policy package, it's not going to be a very pretty one. Just do keep in mind that this is the NIH; life will go on. We have an uncountable number of disease-specific institutes. There is no risk, over the next five years, 10 years, 15 years, that there will not be many projects funded by different institutes to study particular diseases and particular science-driven questions. I think what is more up in the air is, will we actually make a major change in the discovery model? Or perhaps we could look for other ways of phrasing that, but it's this more long-term view that we're going to try to change the way we do business, so that down the road the many different kinds of questions studied at the NIH will be studied differently than they are today. That really takes a different mindset. It's where the NHGRI came from. This was not going to happen, I guarantee; I was there and I watched this dynamic, and it was not going to happen unless very special mechanisms were set up, with a different way of setting goals and so forth. It can coexist. Only a small fraction of the money that went into DNA sequencing and related technologies was being spent through the old NCHGR and NHGRI, all through that and until the peak phase of the Human Genome Project. Lots of, you know, all the other institutes.
They remained very active using these technologies as they saw fit. So I'm not suggesting forming a new institute, but I think that there is a fundamental tension here, and it's best to recognize it and find mechanisms whereby the different major branches of these interests can play out.

Right. And I think it is important to recognize we had multiple goals here. One of the goals, actually the stated goal, was to give NIH institutes, which are many and diverse, and many are represented here, advice on what to do with sequencing, because, as Church said, people are coming up to say, "Sequence my cohort; here's why you should." So that was one set of goals, and I think that's where we have the most diversity. Then some of the folks at that end of the table were saying, let's design a great big cohort for a whole variety of reasons, a large million-person project. That's kind of another issue, and one that we can address here but certainly aren't going to solve, and we can't necessarily put the two of them together. Stephen, did you have a comment? Okay.

We did promise people that if we gave up the break, we'd get you out by 4:45, so we're trying to do that. Again, you will see these questions again. You'll have the opportunity to comment and rip them apart and that sort of thing, but this was the start, at least.

We did get a few process questions. One of the issues that came up is how we can improve the signal-to-noise ratio in whole-genome sequencing data and human disease. What are different strategies we might need to use for population stratification in rare variants?
And the suggestion last evening was more specific reference populations. Rory had raised the issue of whether we can use the lessons learned from the GWAS era, and several people echoed this, especially from the multiple meta-analyses that basically smashed together all these exquisite, uniquely phenotyped groups into one lowest common denominator. And what do we actually want in 10 to 15 years? Then there was also the question of what data and biospecimens prospective birth cohorts, for example, should collect that might be most informative for future research. This is another one that came to Eric, and I would say, for the purposes of this group, for sequencing studies. So, something to consider in terms of process: prospective birth cohorts will continue to occur. Does sequencing change what they should be doing in terms of data collection? Okay.

The criteria for selection we've talked about. Eric actually gave a lovely summary, which I've put into just the first bullet here. I think everybody was in agreement. They should be large. There should be broad phenotyping. There should be longitudinal data. All of these are desiderata: ongoing contact, adequate consent for recontact. There should be diversity. And to the degree that we can, we should include isolated, consanguineous, or otherwise unusual populations. There was discussion about variable disease progression being part of longitudinal data. People to be sequenced should have EMR data available.
That's not the only place one would look for phenotypes, but if you have a choice between the two, it seems logical to try to use as much passive data collection as you can. We did hear that it would be important, in whatever groups we choose, to have the capacity to go deep in phenotyping. You don't have to start deep, but you need the capacity to go deep when it's needed, recognizing that you can't be both deep and broad initially. But as Rory mentioned, it would be good, in the multiple disease outcomes, to be able to go much deeper than we have in the past in assigning disease outcomes, and perhaps deeper than one might have initially gone in the phenotypes. Did I get your point right?

Essentially.

We heard from Maynard that perhaps minimal pre-selection is needed, as we'll almost certainly need some follow-up phenotyping. So you could start with simple EMR or survey data. You might even be able to start with 23andMe data, that's questionable, or questionnaires. We would like to link to family information or to robust concurrent family studies, and we'd like to be able to have other omic data. Does anybody disagree with any of these as things that we would like to have in large collections that we consider for sequencing?

One of the things we didn't do was prioritize these.

I think that's... yeah. So one of the things we didn't do is prioritize these. I might suggest, just for the purposes of Friday afternoon, that these seem to be our highest priorities, and these were lower priorities. Would people tend to agree with that?
Yes. I think we also, you know, look at this list and say this is where we'd like to be. But I would want to be careful not to say that in order to start the discussion anyplace, you have to meet every one of these criteria, because there is a natural evolution to this process, in both formulating what we do and the opportunity for people to go back and get other things. I mean, there's a huge investment, what, five or six million people in different cohorts across NIH, where you have opportunities for these synthetic cohorts and opportunities to go find other phenotypes, so to speak, in people who are already enrolled in these large studies. So it shouldn't be a sine qua non criterion that you have to meet all of these. It's, can you do that over a period of time, with not too much of an expense to get there?

Okay.

Yeah, I don't know where you drew the line with the light pen, where the upper and lower...

Oh, I'm sorry. So the first bullet; those were all of the things that Eric went over.

I would like to... the multiple disease outcomes one could be a really important one. That is, if you're looking for genes with pleiotropic effects, that came up earlier, in terms of looking for factors that influence risk or protection for multiple diseases, given especially the presence of multiple morbidity that accumulates with advancing age. I don't know if it's number one, but I just think that's a fairly high priority one that should not be forgotten.

Okay. Would anyone disagree with that as being a high priority? Okay, good.

All right, and then we're just about done, but we're not quite done. We need to crosswalk the questions and the criteria. What we haven't really done is take some of those slightly more exotic criteria and link them back to some of the more exotic questions.
We may not need family studies for run-of-the-mill complex diseases. We will definitely need them for rare diseases or other kinds of studies. So we need to try and do that, and Eric and I are going to give a shot at that and send it to you. We need additional questions that may require different criteria. So if you come up with additional scientific questions that still fit the criteria that we've identified, that's interesting, but it's not as important to our purposes, I think, as identifying additional questions for which you really need a different set of criteria in order to be able to answer that particular question. We would like to place some priorities, as we mentioned. We will draft a thousand-word commentary and distribute it very soon; that way we'll get the diversity of opinions. We'd like to aim for two-week turnarounds on drafts, and our hope would be that all the participants and moderators would be co-authors, to the degree that the journal that is lucky enough to get the submission from us will permit. But you must respond on the drafts in order to be a co-author. So that's kind of the plan going forward.

Okay, and it's now 4:44, at least by that clock; 4:44 and a half. So maybe I can, at least for NHGRI, thank everybody for the exquisite work that you've done and the time that you spent on this. Thank you very much for sticking it out. Eric, any closing comments? Great. Okay. Thank you.