So, [inaudible] we'll go through the four different presentations, but in many ways this is the chance: if there was something that was a burning question at the time, or a burning point that you wanted to make, or you wanted to challenge or expose an assumption in a talk, this is the time to state that. So, starting with Mike's talk. Mike gave a talk mainly about justifying, from my perspective, a high-sample-size approach to common disease discovery. He gave these two things about the opportunities. It's kind of interesting, Mike, here, because in fact the real central point was that we need 25,000-plus samples; at least that's what I took out from it. So do people have any comments, or want to challenge some assumptions, or want to make some points? Mike? Maybe to make one point, or maybe two. One was, I really think, and I'm naive in this field, but I get the sense that the key issues aren't more and more and more samples, but rather better phenotypes, better phenotypes, better phenotypes. That came out in this presentation, but it was a lower bullet, and for me, that's the number one bullet. I think you can match this information a lot better if you have a very well-phenotyped cohort. And these complex diseases, autism among them, are all heading this way; they're all being stratified into sub-diseases. In my opinion, that's certainly one thing we want to pursue. So I think having a really, really good cohort is the key to all this.
So does anybody else want to weigh in here on this? There's a phenotyping question; quite whether it's a trade-off or not, I don't know. I think setting it up as a trade-off is the wrong thing. Heidi? So one thing that I see as a challenge in the deep phenotyping is that if we're really going to get a lot of phenotypes and get them longitudinally, we have to engage the physician population. There have been tools set up, like PhenoDB, PhenoTips, and other tools, but we've been trying to pilot those with actual physicians and hit some major brick walls in actually setting them up in a usable environment for the physicians to use on a routine basis. So I think it's not just have we developed the tool, but have we really figured out how to implement it so we can actually take advantage of it in a broader, cross-sectional way? Okay. So it's Evan and Steve. Let's get Steve. Go ahead. Sure. I wanted to just raise the point that no matter how deep we phenotype, as we go to rare or less common variants we face a statistical problem: at what point can we agnostically say that this is indeed causal? And I think if we just look at the examples that have been reported in the literature, a high fraction of them have a fair amount of corroborative laboratory data that's really necessary to help sort out and choose among the last 15 that are left standing when you look at them statistically. And so, looking at melanoma with MITF and POT1 and now others, I think it's really crucial that whatever strategy we have goes back to the functional talk that we had a little bit earlier. It's not just the annotation, but how and in what way you can use that additional information; it really will stress us clinically, but I'm afraid we're going to have to use that until we have 500 million people sequenced worldwide and then the numbers fall out obviously. Okay. Evan?
So maybe just to follow up on the point that was made before about engaging more clinicians in this process. I think the other aspect that's going to be critical, which we've already seen with autism, is engaging the families. So for example, there's a lot of phenotypic variability, there's a lot of heterogeneity, and there are diagnoses being made all the time in these families that are being identified with rare variants. Being able to actually engage those families directly in the research programs that are being done is absolutely critical. Second point, coming to the issue of whole genome sequencing and whether it's time to do more: I'm just going to make the point that it's not just about doing more. I'm going to argue that we need to do it better than what we're currently doing. And I think one of the things that hasn't been discussed, but that many of us know, is that not all the variation is single nucleotide variation, and having been working on structural variation for a long time, we have not solved this problem yet. There's a large fraction of genetic variation, maybe not in terms of numbers, but I would argue in terms of impact, that is still being underrepresented, in large part because we are not sequencing to do de novo assembly; we're sequencing for the purpose of mapping right now. And we have to access not just the difficult regions of the genome; I think we have to access the complete spectrum of genetic variation if we're going to get to Mike's point about the genotype-phenotype correlations. Right. So I would just comment that I'm all in favour of broad phenotyping, but we've crossed a certain threshold where we can get the genotypes very inexpensively and do a very good job of stratifying people now based on genotype, and then devote our deep phenotyping to people who are concordant on genotype.
And autism and many other diseases, I think, are replete with examples where we spend our money most wisely when we're phenotyping people who we know share a common genetic cause of disease. Well, that goes to Mike's point that recall cohorts, the ability to do recall, are very important here. I still feel there is quite an important trade-off here, which is: do we want to focus more on case control, which obviously has some power and selection advantages, or on quantitative traits and phenotyping? People have talked about deep phenotyping more in a clinical sense, but what about phenotyping in terms of quantitative trait discovery in normals? Is it better to do it that way around? Does anybody want to expand on the idea that we should do quantitative traits of human biology first and then look at those associated with disease, or is this disease-first focus the best way to go? Rex? Well, the two aren't mutually exclusive either. So, you know, there is a lot of quantitative data in electronic health records that you can get. In fact, some of the best statistical power, even for small sample sizes, comes from quantitative traits that you've gotten out of electronic health records data. So, you know, I think one of the things we should probably do is clarify what we mean by deep phenotyping. You can think of deep phenotyping by imagining that you're going to do, you know, 47 different types of imaging on every member of the cohort; or you can think of it, which is how I tend to think about it, as deep phenotyping based on what we capture in the course of clinical records, supplemented by other appropriate things, and then certainly by the ability to bring somebody back in. And most, not all, but most of the big biobanks, and certainly most of the people associated with eMERGE, but not all, have that ability to bring people back in based on their participation. David?
I actually think that, before we get to what we should do or what approach, those are all tactics, okay? It seems to me the question is what's the goal, or goals, and then what's the strategy to achieve those goals. And if we start this discussion from one of the tactics, I think we're not going to have great success. What I mean by that is, I think one of the things that makes this discussion often rich but also challenging is that there are many people using the same general technologies to do very different things. They're very mutually reinforcing things, but they're quite distinct. If you have a child or somebody with an extreme disease or phenotype, how best do you understand the molecular cause of that? If you want to figure out a therapeutic, what's the best way to get a target? If you want to understand non-coding functions of the genome, how do you do that? They will eventually all come together and be synergistic, but there's no one tactic that's going to do all those things. And I agree with Rick that, just to pick one example, if you want to understand the phenotype of a mutation carrier, or the perturbation of that gene and what it manifests, you're going to do much better phenotyping people who share those variants that are strong in effect. If you wanted to understand how to interpret that variation in an unselected individual from the population, as an incidental finding, you need a totally different approach in my view. Or you might need a different approach. So we can't just dive straight into tactics, I would argue. Okay, so would you like to give your goals in this area, therefore? I'll buy that, I'll buy all of that. So do we need to discuss goals? Because, I mean, we could go up a level and come back down in, as it were.
Obviously I think so, and I'd be happy to articulate some; I'm not sure, that might be going too far for this first comment, so let's see if others maybe agree. But no, I'm happy to articulate what some goals would be. Ultimately, just to say this: it's connecting human biology, in the clinic and the population, to the underlying cell and molecular biology in a way that's actionable, whether for personal interpretation or for public health or for understanding biology or for therapeutics; but there's not one design that's going to get you all those things. So David's made a comment that might overarch all the different areas here, and I don't know if we want to spend time at this level. Who wants to have the discussion on the goals level? Does anybody want to respond to David? Does anybody disagree with the goals as stated by David? Okay, there's a hand at the back there. Hard to disagree with; it's such a nice umbrella that covers everything. But I think one thing that comes with that umbrella is that you really need to get a more systems-level understanding as well. I mean, you're really only putting a foot in the door if you're just doing genome sequencing. You really have to do a lot more, whether it's molecular phenotyping, you know, RNA-seq or, since it fits with the sequencing theme, epigenomics, that sort of stuff. But I'm also a believer, of course, that you do as many phenotypes as you can. And metabolomics, as was said earlier, is rather inexpensive. You can get some very interesting phenotypes, and pretty broad ones, from that sort of information, and certainly get a better systems understanding. The other aspect that I'd like to have in here, which wasn't quite captured, is that I really do think you capture a lot of useful information from longitudinal studies. I know Heidi mentioned it.
And so if some of that theme can somehow pervade this whole discussion over the ensuing two days, that would be nice. Okay, Mark has a comment. I was just sort of following on from that comment. Obviously I think deep phenotyping is a great thing, but I think we should realize two things. First of all, deep phenotyping is probably a goal of medicine in general, in total, beyond just genomics. And I think we should think about the aspects of phenotyping that really lend themselves to relating back to genomics, which could be systematised and which we can correlate with the genome, as opposed to phenotyping in general. Okay. We've spent some time on deep phenotyping. Oh, Mike, sorry. I was just going to respond to a couple of things. One comment I'd make is that not all diseases are the same. And so in terms of whether we want to do case control or cohort, it's a different answer depending on the common disease, because there's a range of frequency for common diseases. Bipolar, I think, would be silly; well, I don't want to say that. I think it would generally be the wrong choice to study bipolar disorder on the basis of cohorts, unless the cohorts were very, very large, because we're talking a frequency of 1%. If we're talking type 2 diabetes or obesity, where you might have 20 or 30% of the people in the cohort who have those conditions, a cohort could be a really good choice. Particularly if you have a really well-phenotyped cohort, you can take advantage of not just the disease diagnosis but a wealth of quantitative traits. I think that can be an extremely good design, actually. But it depends, again, on frequency.
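Mike's frequency argument comes down to simple arithmetic about expected case counts. A back-of-the-envelope sketch, purely illustrative (the function name, the cohort size, and the prevalence figures are assumptions chosen to match the numbers mentioned in the discussion, not anything presented in the talk):

```python
# Rough expected number of affected participants in an unselected
# population cohort, for conditions of different prevalence. This is
# why a cohort design suits common conditions (type 2 diabetes,
# obesity) better than a ~1%-prevalence condition like bipolar disorder.

def expected_cases(cohort_size: int, prevalence: float) -> int:
    """Expected affected participants: cohort size times prevalence."""
    return round(cohort_size * prevalence)

cohort = 25_000  # the sample size floated in the discussion
print(expected_cases(cohort, 0.01))  # ~1% prevalence (bipolar): 250
print(expected_cases(cohort, 0.25))  # ~25% prevalence (obesity): 6250
```

At 1% prevalence, a 25,000-person cohort yields only a couple of hundred cases, far fewer than even a modest case-control study would collect directly; at 20 to 30% prevalence, the same cohort is case-rich and also carries quantitative traits on everyone.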
In terms of phenotyping, of course we want as much phenotyping as we can have, but if a gun is pointed at my head and I can get 90% of the phenotyping I want for 10% of the money, and that is often the case, then I'm going to take the ten-times-bigger sample with almost-as-good phenotyping, and then, exactly as Rick Lifton has said, we go in for the people who really look like they'll be informative and get more phenotyping on them. Is that the perfect thing to do? Of course not. We want all phenotyping on all people, but this is a good strategy that comes in between. And I totally agree with the comment I think Mike made in terms of longitudinal data. One of the reasons I love working in Finland is that on my cohorts, on my sets of individuals, I can check every other year. Well, I can't; my colleagues can check every other year to see who's developed diabetes, who's developed different endpoints. And that's an incredible richness and an opportunity to do science that just isn't easily done otherwise. I thought it was just the saunas, Mike. I thought it was the saunas. I just thought you liked going to the saunas. So we've spent a lot of time here talking about deep phenotyping. I'm conscious of the time, to make sure that we explore all the things. A couple of other things that Mike said, which are partly on these slides, are the benefits of sample size through meta-analysis and consortia, broad consents, and good data access. So he mentioned all four of those things. Now, are we just going to say that we agree that many of those things are good things? There are lots and lots of details in making this work. I have a particular, admittedly selfish, plug to put in about bioinformatics infrastructure: this phrase that we need to do more than depositing is an important thing. Does anybody want to talk about the pluses and minuses, or the relative emphasis here, on those four things? Debbie? Of course I do. Whether they're really consented or not is an issue.
And how broadly they're consented is an issue. And that is a problem in all the genetic studies that we're doing today. To get them broadly consented, they almost need to be reconsented, which groups are doing now. But the existing cohorts don't always have consents that are broadly applicable. So then they end up in dbSNP, and they can't be summarized in the way that you want, right? So we need to really tackle this problem. That's what made the 1000 Genomes Project so powerful: anybody can access those samples, right? Get that data, use that data. And that's been the mantra of genomics, you know, being able to access everything. Well, like many people, I feel we've had this conversation for 10 years. We have made progress on it, but I don't think that means we're at the apogee of success in this area at all. I would say that we do need to get better at this and do it more systematically; I mean, there's the mundane business of putting money towards bioinformatics infrastructure, which is very selfish of me, but it's still enabling. Sharon? Well, I was just going to comment on the broad consent. I do think we also have to be cautious, because we don't want databases full of highly educated, predominantly white individuals who may have secure health insurance and employment and who may be more comfortable with broad consent. Whether that's true is actually a hypothesis. But I do think, when people talk about broad consent, we have to look at whether that will impact our ability to really have a representative population. And one thing that didn't come up in general is that, if we're talking about these very, very large cohorts, we really need to invest the effort to have these cohorts represent the United States and not continue to be predominantly European Caucasian. Well, I can't see Carlos.
I know Carlos made that comment as well, on the importance of the shift in allele frequencies between different ethnic groups. And Debbie, do you want to comment on this? I just want to comment that we have to have people in science doing the science who are not like us, too, right? And that really helps actually get groups integrated. And we really need to pay attention to this. We do in genomics; I'm not saying we don't, but we need to do it more, everywhere, every time. I mean, I think that's a big topic, and I think we should acknowledge it, but I'm not sure we're going to explore it in this 15 minutes. No, but you could target funding at groups, for example colleges or medical schools that have populations and scientists involved that want to integrate into genomics. That's not been done either. Okay, that's true. Richard, I want to move on soon. So, last person here on this kind of large scale. Well, before you go ahead, I just want to say that I think the presentation so far, whilst appropriately focused, lacked a little grandness, that kind of explosive scale that we sometimes get to, which actually ends up being the driver of the next projects, the ones we look back on and say, we don't know how we did that; and then we forget to be that explosive for the next planning phase. So I just want to encourage everybody to bear that in mind, because 1% of 10,000 is not much, but if we're moving into the millions, 1% of a million is actually quite a bit, and enough to drive some of the design. So be mindful of this while we think of the design. I mean, I thought what Mike suggested was pretty ambitious numbers, but justifiably ambitious, and quite whether one needs more or less ambition in this game, I don't know. But I thought it was pretty high. Everybody would like 25,000 sequenced people for their disease cohort; that would be delightful. Okay? Sorry, I don't know your name.
So perhaps this is a little taboo to mention, but clearly there is an initiative in the US, the Million Veteran Program, which will bring a lot of data; I'm sure people will get frustrated about the ability to participate and openly share that data. But I think that's the type of grand scale where, if somehow we could think about how to change the mindset there, in an organization that has now collected over 300,000 DNA specimens and will probably do well over, you know, 30,000 exomes this year, that would be an interesting... Okay. Does anybody want to talk about that opportunity? It sounds good. Has somebody from the genome side explored this and thought about it? Okay? Terry? Yeah, as Patrice said, a major challenge with the Million Veteran Program is data sharing. And for a whole variety of reasons that we won't go into, the data are not accessible, and that's a non-starter for us. So we're really kind of stuck. I think if parallel programs develop, they would be open to the idea of cooperating and collaborating, using similar protocols, that sort of thing, and maybe we can move them along, but it's been a real challenge. I see. Finland and Denmark, I think, are sweet spots as well. Okay, let me just try and close this off. Now, if there's anybody who really feels there's something about the large-scale common disease stuff, I want you to feel empowered now to talk and get it off your chest. So if you just want to rage against something, this is the time. Okay, you're raging quietly. All right, so we'll move on a little bit to the Mendelian genetics. Again, the goal here is to get more things out on the table. So these are the final slides from across Rod's talk.
I think drawing an analogy between the mouse program and the yeast knockout programs and Mendelian genetics, really taking that same process to humans, taking advantage of the human situation of the phenotyping in hospitals and the size of the population, but of course also having this very direct impact on the patients, was very powerful. So what do people think about the area of Mendelian genetics? Do people have thoughts about whether this is something that's obviously good to do, whether it's a mistake, whether we should do lots more of it? What are the challenges? I haven't stimulated you enough. It's good to do, says Eric. Tick. Okay, a harder question for Mendelian genetics: how many Mendelian disease genes are we really going to find? I wanted to know what the rate across the program was, and whether we need tenfold more centres. Would it be better to have tenfold more centres, or to have all the centres fivefold bigger? Okay, Tom? Well, it's not about centres, but to a question I asked Rod this morning: I think that even in Mendelian genetics we can learn a lot about modifiers, and again, once you start talking about modifiers for Mendelian diseases, whether it's CF or anything else, it becomes complex traits again and needs large numbers. So I think understanding how phenotype differs in terms of severity. I see; that's almost looking at existing Mendelian diseases and asking more in-depth questions around well-established ones, rather than discovering new Mendelian diseases: looking at modifiers around Mendelian diseases. There are a lot of kids with existing diseases that have been called, but that we still don't understand. Plus it could also be a way to discover new targets for therapy. Does anybody want to talk about modifiers for Mendelian diseases? For any of the Mendelian centres, is that part of your plans at all? Richard?
I think this is the same point: we all have the experience of going to talks about common disease and hearing, by the way, these can be informed by Mendelian examples. Then we go to talks about Mendelian disease and hear, by the way, there are instances here where this is important for common disease; but we never really get to live in that middle ground that Tom is pointing to. I would like to see, out of our discussions here, some design for projects that are large enough to capture the modifiers. So Mike pointed out this is a false dichotomy, and we will now acknowledge that, but we have yet to live it. Yeah, well, that's a good point. Is there a good design, Rick? Just one thing that may be obvious about why this can and should be done at scale, which Rod kind of referred to: these are rare, so you find one kid or one person with a variant in a gene, and you might even be sure that it's the cause, or pretty sure. But when you get 100 with different mutations in the gene, then genotype-phenotype correlations and prognosis and things like that become possible. And I think that might be true in complex disease too. I mean, it seems like it would be. Okay. Related to this, or somewhere in this discussion, I feel we'll come on to phenotype penetrance and incidental findings, maybe in the CISA discussion. I just want you to file that away, because I think that variable penetrance is the other side of modifiers. I mean, if we all like it, why don't we have, you know, twice or three times as much focus on the Mendelian disease area? I'm trying to be provocative now to get stuff out of you. Nobody wants to be provoked. So... Yep, okay, here we go. We've got Dave and we've got Heidi. Two things. One, I think the question is not whether we need 10 times more centres or whatever; it's what limits it at this point. Okay. Good question.
And a rate of 40 genes, you know, that rate of course will change with time as the low-hanging fruit, so to speak, gets cleared out; but nonetheless, why does it need to be done in an organized way? I'm not saying it doesn't. The other piece I guess I would say, to be a little provocative, is that the historic separation, which continues to this day and has sort of been instantiated a little bit in structures, like separating Mendelian disease from the rest of disease, leads to the problem of overestimation of penetrance. If all that would ever be done is sequencing other people like the ones in the discovery cohorts, it might not be a problem. But if the same genes and variants are going to be seen as incidental findings in other people, or if you want to understand that link, you're going to have to look at people at the extremes and at people at less extreme; you probably need to look at unselected people if unselected people will be sequenced. And so one recommendation is that, however it's organized to happen, which could range from maybe it happens on its own to centres or whatever, it comes back to that data integration piece: we won't have a knowledge base otherwise. And that connects to what Richard was saying as well, about trying to find designs that span, or bring this more cohesively together. I mean, I feel like everybody is going to agree that we should all hug each other and get on and do that well together as a group. The practical problem is actually putting structures in place that make that achievable over time. And I think that's worth thinking about, because again, a bit like the previous point about data sharing, there's been progress over the last four or five years, but we're nowhere near the top of that progress curve. Heidi, sorry, I missed you out. Heidi?
So just to add on to that: I think the successes to date have largely come when one or a few sites have been able to amass multiple cases with a similar phenotype to enable that solving. But I think we're now hitting the point where a lot of the other Mendelian diseases are just too rare for any one site to amass more than one case, sometimes. And it's going to require a fair bit of infrastructure to enable the onesie-twosies around the world to come together, whether through patient registries or matchmaking efforts or things like that, so that we can really create an organizational structure to bring these incredibly rare cases to fruition. So there's a real international aspect to this that is necessary, in a way that for common diseases is just kind of a nice-to-have. Yeah. Okay, at the back there; I'm sorry, I don't know your name. I have to come off the back bench for this, sorry. Okay. Ken Offit, from Sloan Kettering. Yeah, yeah, I know. Maybe our reticence is because all the brilliant comments have already been made, you know. So I just want to reinforce two things that David Altshuler said. First, on the penetrance business: I have to come off the bench because our group described the common modifiers of BRCA2 as part of the large international consortia, mostly UK-organized, but our grant actually carried that out. So we have a tremendous opportunity here. You know, we've identified breast cancer penetrance ranging from, you know, 15% to 90%. The mechanisms to follow that up won't come from the international consortia or even the NCI. So the functional exploration of the mechanisms underlying this is an interesting question. Should we, as David said and I think we should, take the extremes in these groups? And these are not rare.
So there are thousands of individuals in the world carrying highly penetrant cancer mutations who have not been affected, and they've not been sequenced, and there's not an opportunity for that to happen currently. There's an enormous insight that can come from that, and we need, I think, to pursue it as a high priority. And then the final point, just to underscore what David said earlier: you know, we found a PAX5 mutation in childhood leukemia, but how do you follow up that single mutation across the rest of the phenotype? It's the functional follow-up of the single-mutation, Mendelian-family observations. All these points have been made already; I'm just underscoring them. I think your BRCA2 example, for me, was a good case of Richard's desire for a study design that stretches across this space. And I take, again, what Steve said about this functional follow-up and this need for understanding at scale, which we'll come to again at the end. Now, somebody else had their hand up over there on the left-hand side. Debbie, come on in again. I should keep my mouth shut. No, I just can't help it. Not on the Mendelians. I just want to say that, at least in the Mendelian programs, we get parents coming to us with their child, and they want to know. So we should also think about how to engage the community more, because there is great interest in this. I'm not sure we can, but some model of engaging the community more in the research that we do in genomics would be really good. Okay. Yeah, Harry? I guess I've been quiet. It seems to me we're being too pedestrian by putting these four talks in four silos. That was my charge; I'm sorry about that. I'm trying to stick to the rules that I was given. Well, it's not like you've never broken a rule before. But it seems like we could capture the excitement here by putting the talks together and thinking about this in a much larger way. Albert Camus, the French philosopher, said we're all special cases.
Do a large enough sample size and you're going to have both case-control and cohort studies combined. And I think we need to think about sample sizes that large. And I think about spanning this Mendelian-oligogenic-polygenic range, again with a very large sample size. I agree with Heidi's comment, by the way, about the very rare: I think special considerations are probably needed. Okay. Let me keep going and then I'll be quiet and you can have the whole thing back. I also think we need to come back to the translational implications. Despite what Jim Evans said, I think this is a grand opportunity for novel target discovery and prediction. We can't give up on prevention. And finally, or not quite finally, there's the need to bring together these functional pipelines and these large-scale discovery studies, especially as we tackle whole genomes. I think discovery and functional pipelines will probably come together to help us interpret the data. And then, now finally, it's an opportunity to recapture the international excitement that was there in the old genome project days. We need, I think, to make this more of an international effort: not worry about what the UK is doing or the US is doing, but think about what we can do together to push this agenda. Now I'll be quiet. All right. That was a bit like the Spanish Inquisition: you know, there's only three things. No, four things. Yeah. Okay. So what I think we should do is reserve a bit of time, or maybe blend in a bit of time at the end, to try and be big. Both you and Richard and David have said we should be bigger, bolder, more strategic; that's what I take from this, and we should know what we're trying to do in this bigger, bolder, more strategic world, and make sure there's a binding-together description of this whole integrated space. If I've got that summary right. So let me just make sure.
So if you've got something burning you want to say about Mendelian genetics and stuff like that, now is your time; make sure you state it. This may be pointless of me; I will stop doing this in the other two if it turns out to be pointless. Okay. And I encourage people who haven't spoken out to feel motivated, in the next two, to press the little red button and come up here. So if I could just take the last thing you were saying and suggest, just to boil it down: all the groups we're going to have should report out, separately, goals and tactics, and we should have a very bright line between them. Because the goals, what we want to get done, whether it's common or Mendelian disease or anything else, should be clear, almost timeless things. The tactics are going to be a moving target, and it would be good if each group really made separate statements around each. Well, I mean, I think that's mainly David's point and some of that, but I just want to summarize what you were saying as: goals and tactics, separate. Yeah. But I think there's a big desire here, from the people who've spoken so far, for goals across the whole thing rather than siloed goals, if I get this right. So, for example, from these first two you might articulate as a goal: to understand the contribution of genetic variation to phenotypic variation in humans. That sounds good. Then we can talk about tactics: common disease versus rare disease, fully utilizing the allele spectrum. Okay. Richard, you're smiling, do you want to come in here? Just happy. Okay, good, that's what I want to see. So, so Ewan. Ewan. Yeah. So, so David. Okay, David. So if that's the highest, you know, a high-level goal, then we say we want to functionally annotate variation, first probably experimentally, later computationally, or somehow becoming predictive, right?
We want to be able to use this information to align laboratory investigations with prediction of clinical responsiveness in some way. Then, when we go to the tactics, and just to close off what I was trying to say before: we're going to have to talk about which tactics address which element of that big vision, because it would be a mistake to say there's one study design that will answer all those things. So we have to have that intermediate level, saying: if you want to discover the most functions of knockouts, you've got to go and access knockouts and phenotype them. But if you want better risk prediction in the general population, you need a different design than that. And if you want to know the mechanism, you need to do the functional follow-ups. Yep. Okay. I still feel like I should do justice to all four talks, because I think it's useful again; the goal here is to get more ideas out and on the table so that people can hear them. So let's go on to Dan's talk, the CSER programme. It was very interesting listening to this talk for me, because this is the point where, coming from a European perspective, you start to see more diversity, which is just triggered by the way healthcare is wired. And I think it's just fascinating, and it's interesting to watch how this plays out. So what are people's thoughts around the CSER programme? I certainly feel that we should have some discussion about incidental findings, for example, and connect the phrase incidental findings to the previous two discussions about understanding genotype-to-phenotype relationships. Come on, somebody, comments on this? Yep, Heidi, wonderful. I mean, I just think it's an incredible opportunity to do genotype-first and look at penetrance.
And so engaging the incidental-findings environment to do a lot more follow-up and phenotyping, so that we can really start to engage those questions of penetrance, which I think will also inform modifiers and other studies, by taking a sort of genotype-first approach. And the way I think about this is that this line also plays towards quite large sample sizes, because for you to understand that incidental finding and that variable penetrance, you need to find these variants, which are at low frequency, in people who don't have the phenotype. So that also plays to large sample sizes and cohorts. Have I got that thought process correct? Somebody's nodding; why don't you press the button? Do you want to comment, Dan? No? Okay, you're shaking your head. Heidi. I was just going to add that the problem right now is that we're mostly looking at a single point in time in most of the studies that have been doing this, and we haven't really been able to follow these patients out longitudinally, and I think that's going to be incredibly important. Okay. Are there comments on this? Sean? With regard to large sample sizes, the CSER projects are not that large, because of the extra work that's actually involved in doing it clinically and reporting back to patients. And so what we're getting is a sense of what happens clinically when they go to the cardiologist and get a work-up. But to really know the penetrance, you will need a study design that does allow for much larger sample sizes if you want to address the incidental finding. And that has to have this recall component, basically; otherwise it doesn't really make sense, because you want to learn something. Okay. Are there comments around this? Mike? Well, maybe just to stimulate thinking: perhaps we should be thinking about a world where lots of people are getting their genomes sequenced before birth, and how we are going to deal with that. Because it does seem like that is the future.
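To make the sample-size point concrete: penetrance here is just P(affected | carrier), and a genotype-first design estimates it from carriers found by sequencing rather than in clinics. The sketch below, with entirely hypothetical numbers, shows why rare variants force very large cohorts: the confidence interval is driven by the carrier count, not the total cohort size.

```python
# Illustrative sketch only (hypothetical counts): a genotype-first
# penetrance estimate, P(affected | carrier), with a rough 95% Wald
# interval. Real analyses would use better intervals and adjust for
# age, ascertainment and phenotyping depth.
from math import sqrt

def penetrance(n_carriers_affected: int, n_carriers_total: int):
    """Point estimate and approximate 95% interval for P(affected | carrier)."""
    p = n_carriers_affected / n_carriers_total
    se = sqrt(p * (1 - p) / n_carriers_total)
    return p, (max(0.0, p - 1.96 * se), min(1.0, p + 1.96 * se))

# A variant at 1-in-10,000 frequency yields only ~10 expected carriers
# even in a 100,000-person cohort, hence the call for large sample sizes:
p, ci = penetrance(3, 10)
print(p, ci)  # the interval is very wide with so few carriers
```

The design choice the speakers are circling is visible in the arithmetic: halving the interval width requires roughly quadrupling the carrier count, which for low-frequency variants means quadrupling the sequenced cohort.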
And I'm not saying when all that information gets returned; that's probably the controversial point. But it just seems like we're heading in that direction. We don't have to talk about it now, but if we don't get started now, we'll just be a little bit behind the game, if you ask me, because people will take this into their own hands regardless. And I don't think we're talking about small numbers of people; I think we're talking about large numbers of people who want this. There was a good discussion between Dan and Eric about regulation and regulators here. That's something that is in some ways unique to each healthcare system and is very important to tackle. Do people here feel that there's the right connection between the genomics community, this discussion, and the regulators? I don't know. Is there somebody from the FDA here? Probably not. So, Dan, do you want to comment on that? Well, I advise NHGRI and I advise FDA, so I guess I'm as good as it gets for now. That's pretty good. I will take exception to what you just said, though. I would be surprised if the EMEA doesn't view what the FDA is doing and take it to even the next step, because the EMEA is as aggressive about regulating as the FDA is, and then some. So I think that we're fortunate, though, that at least at the very, very top of the leadership structures of both NIH and FDA, we have scientists who are interested in developing those kinds of collaborations and understanding the issues involved. They made a tiny step: they approved the MiSeq platform for a small number of CF variants, I guess. But that's a step in the right direction, and they'll work with us. I don't think we want to set up an adversarial relationship; I think we want to set up a collaborative relationship. I'll just say that again. Okay, we've got some comments here. At the back first, and then we'll go forward. Yeah, I'm Marc Salit from NIST.
We've been working with FDA a bit in trying to identify what infrastructure they need to do regulatory oversight. I'd say, with my finger on the pulse of the Office of In Vitro Diagnostics, they don't want to be left flat-footed, without knowing how to do science-based regulatory oversight of clinical sequencing. So the question is: do the programs out of NHGRI acknowledge what infrastructure is going to be needed so that there is a reasonable path forward for regulatory oversight? What tools does FDA need so that they can do their job? I think you'll find great willingness, and I think it's far from adversarial, but put yourself in their shoes: it would be great to make it easy for them to understand what's safe and efficacious. I understand; that's a good point. Yeah. Yeah, Robert Green from Boston. I think that the CSER and related programs are really now entering the arena of public health, and whether it's the opportunistic screening of incidental findings that happens when you're doing sequencing for one reason, or whether it's the very next step, which is going to be the question of population screening, either at the newborn level, the pre-birth level, or in the many adults who want this sort of thing, we're going to have to take this on, and I think think bigger. We're going to have to take this on and consider whether this is an appropriate public health initiative, and exhibit evidence-based outcomes in a way that even the newborn screening programs we have today didn't go through at their time. So I think that's going to be a big challenge for us, and an important one. Yeah; people who know me know I've just had the experience of getting to grips with health economics, which is quite interesting. They use hidden Markov models, which I got very excited about. But I think bringing in health economics, public health thinking, and the screening methodology is quite important here. David?
Can I just note something that happened in that discussion that is one of these examples we should pull apart? So there was a discussion of FDA and regulation of, say, cystic fibrosis variants, which are diagnostic and also now being used to gate a $300,000-a-year treatment. And then there was a discussion of babies being sequenced in utero with no clear goal, but it might just happen, and what do we do about that? And then there was the public health demonstration of value. Let's not munge these things together. There are things where there's clearly diagnostic value, where there's no question as to whether or not there's value, and the issues are the regulation of what it would take to have a test, and also reimbursement. And then there are things that we, with our technology imperative, say people are going to do anyway, and what's the role of a regulator, or of a field, in that? We have one, but I'm just saying I think these are quite different scenarios, and we do well to separate them rather than have them in one discussion. I think it's about, you know, an actionable mutation, where it's just a question of how to deliver the test, versus broad-based population screening, where there's not a clear value proposition yet. Isn't that a good way to characterize it? Okay, Mike? Well, yeah, I agree completely, but I think both have to be discussed; it would be crazy to just pick one. We focused a lot on the actionable sorts of things through this morning's discussion; I know Heidi's emphasised this. I'd just like to see this other side. You know, I think we just have to shift medicine to being more proactive and preventative than what we're doing now. I mean, these are bigger topics than this particular thing, I think. So, Dan, do you want to come in there? I'm not going to get down into the details like you just asked us to; I just want to reiterate what David said and agree with him.
And I think the word actionable has this sort of quality: you throw it around, but no one actually knows how to define it. When I'm in my congenital arrhythmia clinic and refer a patient for sequencing of ion-channel genes and get something back, that might easily be actionable. If I find exactly the same variant in some neonate whose parents decided to sequence his genome for the hell of it, that might or might not be an actionable variant. And I think that's a really important area that we need to incorporate into the thinking somehow. Okay, there was a hand at the back. Find a microphone, ideally. As we venture into talking about public health and large cohorts, I wonder if we also need to consider deep exposure measurements, at least for common risk factors for the big diseases. I think that's something where the geneticists have lost a little bit of focus, in looking at gene-environment interaction. And, you know, if we want to have a public health impact, we may want to consider smoking, alcohol, diet and exercise as things that we must measure along the way. I noticed that was a question in one of the breakout groups; I don't know whether it sits under "how do we instrument the environment", which is how I think about that problem, but it's an important problem. Let's move on, because I also want to give us five to ten minutes to be global at the end. So let's just go to Joe's talk. Joe was asked to give bullets; I noticed that he abused that quite effectively. So we have a lot of things. Joe took us through the different functional approaches. There's quite a strong focus on non-coding function, which has changed a lot over the last five years, and things like CRISPR technology, which opens up a way of being very proactive, or very directed, about how we go about these sorts of biochemical and cellular-based assays. And I've felt for a long time that we needed to fuse the functional approaches with the variation approaches.
If anybody knows, I've been haranguing NHGRI for about seven years on this topic. There is this GGR programme coming up, which is more about prediction of gene regulation, but less about this. So, I think this is all good; are there comments on this? I can see some people already going for it. So, I wanted to try to tie together the clinical talk with this talk. In the experience with the cancer-susceptibility genes, what is actually critical is aligning the functional assay with the clinical outcome of that assay. And what I mean is really powering your assays with enough patients with variants in that disease who have a phenotype, so that you can then determine whether that functional assay has any relevance. So I agree about CRISPR-Cas: we're going to be able to do a functional assay on every potential variant in a gene. But unless we actually know that that functional assay is a good mimic of the clinical scenario, that data is not nearly as useful as it could be. Right, I understand that. Manolis? So, I just want to build a little bit on my earlier comment about the epigenomic variation between individuals. What CRISPR allows you to do is basically ask: if all the context is the same and I perturb this one variant, what is the effect on the regulation, the cellular phenotype, and ultimately the organismal phenotype? However, it might be more cost-effective, if the samples are available, if the tissues are reachable and accessible, to actually test these regulatory predictions directly in the individual, in the context of everything else that's going on in the genome.
In other words, if we're making a claim that this particular variant is having a regulatory role, then let's look, let's basically see whether, indeed, accessibility changes, or histone modifications change, or TF binding changes, and so on and so forth, in the relevant tissue, in the relevant individuals. Because ultimately that's going to get us not just the intermediate phenotypes between genotype and, ultimately, organismal phenotypes, which basically tell us what the gene-expression changes and the regulatory changes are, in the specific elements and the specific cell types, but it will also, I think, get a little bit more at the rare variants. In other words, if there are rare variants in the region that are changing the regulation in each of these individuals in a different way, this will actually be visible in the ChIP-seq profile of that tissue, of that individual. Are you focusing... because Joe pointed out the power of having phased information and the use of allele-specific information; in those scenarios you have this kind of matched control, and that really is very useful. Are you talking particularly about using phasing and allelic biases in the ChIP-seq? So, for heterozygous individuals, absolutely, you can actually ask whether the different alleles in fact correlate with activity, but you can do that in many different ways. You can do that allelically in each individual across a cohort. You can also do that genotypically across individuals, in both your disease cohort as well as control cohorts. So what GTEx has pioneered is this large-scale collection of tissues from healthy donors to basically ask about these intermediate effects at the gene-expression level. I think extending that, big time, into the epigenomic space is very, very important.
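The allele-specific logic described here reduces, at its simplest, to a counting problem: at a heterozygous site in a phased individual, reads carrying each allele act as matched internal controls, and a binomial test asks whether they deviate from the 50/50 split expected in the absence of a regulatory effect. A minimal sketch, for illustration only (real pipelines must also correct for mapping bias and overdispersion):

```python
# Hedged sketch of an allele-specific imbalance test: exact two-sided
# binomial p-value for observing k reads of one allele out of n total,
# under the null of equal representation (p = 0.5).
from math import comb

def binomial_two_sided_p(k: int, n: int, p: float = 0.5) -> float:
    """Sum the probabilities of all outcomes at least as extreme as k."""
    pk = comb(n, k) * p**k * (1 - p)**(n - k)
    total = sum(comb(n, i) * p**i * (1 - p)**(n - i)
                for i in range(n + 1)
                if comb(n, i) * p**i * (1 - p)**(n - i) <= pk + 1e-12)
    return min(1.0, total)

# Hypothetical site: 18 of 20 ChIP-seq reads carry the alternate allele.
print(binomial_two_sided_p(18, 20))  # small p-value suggests allelic imbalance
```

The same test applies per site and per individual, which is why aggregating across a cohort, as the speaker suggests, adds so much power: each heterozygote contributes an independent matched comparison.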
And matching it with disease cohorts is also very important, because every single time we make a regulatory prediction, the validation is right there in the tissue of interest. I see. Okay. Are the... yep, go ahead, Appie. Three things that tie in with what was said before me. First, the importance of a readout, in the sense that a lot of these technologies currently work well with readouts that are incredibly simple: the cell is alive or dead, or the GFP is expressed or not. And for a lot of phenotypes of interest, and this ties back in with that, these readouts might be too far removed, or we don't know the correct one to monitor. The second is about combinatorics: at least when we're closer in, within the cell, and we're thinking about these systems with a lot of moving parts in them, singleton perturbations might be far from sufficient to actually discover functionality for a lot of things. That is different from the genetic variants that we often map in the common-variant studies for complex phenotypes like Parkinson's disease, but it is definitely true within the context of the cell. And the third is which samples to do it in. I think we're very tied to a very small number of cell types that are very convenient to work with but are often quite far removed from the biology that we're interested in. Okay, that's a good point. I notice something that joins here: this desire to look at modifier studies and the like. The CRISPR-style technology allows you to think about introducing an allele across a panel; then, if you believe your functional assay is correct, you can go and chase down modifiers with a different approach. Richard's nodding now. I don't know if anybody has any comments; there are many other people here I know who work in this space. Did anybody else want to make any comments about functional stuff? There's another business here about... go ahead.
Eric said this morning that he wanted us to continue to be genomics trailblazers, and I think if one goal is to relate genotype to disease phenotype, and another is to advance translation, then we have to pay much more attention to gene regulation. I think that is true for two reasons. One is that I believe it's a mistake to think variation in transcription-factor binding has a small impact; there are many instances where that is not true, and it's not true in domains that represent large regulatory regions. Secondly, there's a huge effort to develop drugs in the transcriptional space, drugs that hit transcription, and they tend to hit these large domains where variation creates a huge impact. It was mentioned this morning that there's a $50 billion value of drugs that could be applied to new targets identified by these programs. Think of the CFTR programme at Vertex: Eric Olson is the guy who did that; Vertex now has a $22 billion market cap, and Eric has moved into transcriptional drugs. So I think we would be well advised to pay a lot of attention to transcription. Fair enough. Was there somebody else who popped their hand up? Did I see it? David? I would just say, if we summarize the first part of this, the first two pieces: you know, understand the relationship of genotype and phenotype to human diseases. The second part, just to summarize the comment that was made, which I think is right: to have assays that read out those relevant functions, in the way that it was said in the corner, I couldn't see who said it, but that triangulate on the disease. Those are simple deliverables, like the Human Genome Project was, first for a set of disease genes and then becoming broader, that would be truly empowering to the field. So, in this area, people haven't touched on the idea of using epigenetics almost as an alternative molecular phenotyping technique on humans, sort of in the way that you measure people's blood chemistry and stuff like that.
You measure people's transcriptome, you measure people's epigenome, as routine phenotyping, thinking of it as a phenotype rather than as an input. Does anybody want to pick up on that? Is that a good thing? Am I talking bollocks? Does somebody want to say yes, no, maybe? Manolis? So, I said yes earlier, so now I'm going to switch to no. Keep us on our toes. To basically say that we have to think of functional genomics and epigenomics as intermediate phenotypes. In other words, if there is no consequence on gene expression, on cellular phenotypes, and ultimately disease phenotypes, then I think it's just part of the natural fluctuation that our system is robust to. So we have to think about the consequences of non-coding variants in the context of organismal phenotypes, cellular phenotypes, and so on. I think simply cataloguing all of the non-coding variation for these intermediate steps, without the larger context, will lead us to way, way too many false positives, just because our system is tolerant. I see. Yeah. Yeah, so absolutely. That's why I like to think about it in the context of these disease cohorts, because very often you will see, again, the reason why I was talking about rare variants is that all of these inputs, from common variants, rare variants and environmental effects, are ultimately converging and changing regulation before having a cellular and organismal effect. By directly measuring the epigenomics you get a sum of all of the above. Some of that variation, some of the regulatory epigenomic variation, you'll be able to account for based on common variants; for some of it you'll be able to go and sequence more deeply and look for rare variants; and for the rest you will basically have to say, well, that has to be exposure, that has to be something that accumulated through the lifetime, that has to be environment, that has to be other cells signalling to that particular cell, and so on and so forth.
So I think, when we think of them in that larger context, they give you a handle, a point of convergence of these weak effects; when you find an association you can then track it, as opposed to simply saying, well, there's something changing here, we don't really know what, when you could just go and measure what. So, I'm going to ask us to think big in a moment, and Rick and David, because you expanded this idea, in a moment I'm going to ask you to give us some overarching goals, because you were so eloquent, the pair of you together; so in a moment I'll ask you to do that, and to decide who's going to talk. Some words I was surprised weren't mentioned whilst we were here: the term Mendelian randomisation wasn't mentioned throughout this time, and I was wondering whether that was a good thing or a bad thing. Is there an epidemiologist who wants to get excited about Mendelian randomisation? Should we talk about this here? Okay, Nicole has got her hand up, and then Manolis. Nicole. It's just spooky, I feel you've read my mind; I was about to say exactly that. I think there are huge opportunities, not only in traditional Mendelian randomisation, but also in applying epigenetics, applying molecular phenotypes, in two-step Mendelian randomisation. And it's already been shown that Mendelian randomisation can predict the success of drug trials before these are actually done, so we should certainly use the richness of phenotype collection to do exactly this type of study. Okay, Manolis, do you want to comment on that?
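For readers outside epidemiology, the causal logic being invoked can be sketched in a few lines on simulated data (the numbers are invented for illustration): a genotype G is used as an instrument for an exposure X, such as a methylation level, so the G-to-X-to-Y estimate is protected from confounding between X and Y, under the usual and easily violated assumptions (no pleiotropy, a strong instrument, no unmeasured G-Y paths).

```python
# Minimal Mendelian-randomisation sketch (Wald ratio / two-stage logic)
# on simulated data. All effect sizes here are assumptions for the demo.
import numpy as np

rng = np.random.default_rng(0)
n = 50_000
g = rng.binomial(2, 0.3, n)            # genotype, the instrument
u = rng.normal(size=n)                 # unobserved confounder
x = 0.5 * g + u + rng.normal(size=n)   # exposure, confounded by u
y = 0.3 * x + u + rng.normal(size=n)   # outcome; true causal effect is 0.3

# Naive regression of y on x is biased upward by the shared confounder u.
naive = np.cov(x, y)[0, 1] / np.var(x)

# Wald ratio: (effect of g on y) / (effect of g on x) recovers ~0.3.
mr = (np.cov(g, y)[0, 1] / np.var(g)) / (np.cov(g, x)[0, 1] / np.var(g))
print(naive, mr)  # naive estimate is inflated; MR estimate is near 0.3
```

This is exactly the "transfer of directionality from genetics" point made in the reply that follows: the ratio only identifies a causal effect when the instrument assumptions hold, which is why the caution voiced here matters.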
Yeah, so I think it ties into my previous comments, which is why I'm raising my hand again: the moment you start measuring all these epigenomic variables, you have to worry about causality; in other words, many of them are just going to be a consequence of whatever phenotype you're measuring. I think the beauty of Mendelian randomisation, and the reason why I think all of these have to be thought of jointly, is that it allows you, in some cases, when the statistics all work out, to really argue about causality: to basically transfer the directionality information from the genetics to the epigenomics, and argue that whatever variable you're measuring is in fact causally associated. But I have to caution everyone that this holds only in very special cases; in other words, there are thousands of exceptions where Mendelian randomisation will not work, and a few beautiful cases where it will. So when we think of it at the scale of the whole genome, with the number of multiple hypotheses that we're testing and the sample limitations that we'll have, I think we can't treat it as a magic bullet that we'll use everywhere. I think it's a great statistical tool, but we have to use it with caution; even when it does work, we have to consider that there can be additional, unobserved variables modulating whatever intermediate variable is associated. So: use it with caution, but always use it. Before I go to Rick and David, are there any other words that have surprised people, along the lines of "why didn't somebody say X"? Does anybody want to come out with one? Yeah; just to follow up, I thought that was a great statement, but the term I was surprised by is n-of-1 studies. I think, as we're talking about huge, enormous datasets and combinatorial axes, we're going to have to come up with new outcomes methodologies to explore the impact of personalized medicine. And it's not just words; I actually agree with David that this needs to be rigorously approached and
appropriately done, but I think we're going to have to explore novel approaches to public health outcomes research as well. All right, are there any more words people are surprised weren't mentioned? Yeah. So, at different points people have talked about registries for identifying and keeping track of individuals with loss-of-function mutations in genes, sort of the human knockout catalogue, as it were. And I think considering the white space around Mendelian disease is also important: that is, what part of the genome can we knock out without serious phenotypic consequence? Because, from the perspective of more nuanced analyses of rare-variant studies, we could use that information very effectively. Really, Mendelian disease genes are very special: we know that rare variants at those genes that have a functional impact on the protein affect human health. And there are lots of other genes where we know about rare variants that impact protein function in ways we can measure, where we don't necessarily discern effects on human health. So getting a better handle on that, I think, could help us a lot with downstream analytic approaches that would be much more effective. People have talked about that at other meetings, not so much here today, and I was a little surprised. So that registry also goes to this recall component; I mean, all of that is really connected to a recall component. Okay, let me go to David and Rick: give us a big-picture view and some big-picture goals, and I hope what you're writing down will bind across all of them. Okay, go for it. So, given five minutes, or whatever: to define the genotype-phenotype relationship underlying human disease and healthy traits, both as a means to illuminate pathophysiology and the organismal consequences of gene perturbation; develop laboratory assays that report out the relevant functions of these disease-causing mechanisms; create, and make widely available, the knowledge base needed to interpret variation in health and in
disease; and do so across human populations, to expand discovery, to broaden access, and as a matter of social justice. Oh wow, that's not bad for five minutes; congratulations to that man. So there's a straw-man aspect to this. I thought that was excellent, by the way, but one of the things which really appeals to me about it is that it tries to pull across this broad area and not see these things as siloed. It's a theme that's come out again and again, not just from Eric; Mike said this, many other people said this, that these things shouldn't be siloed out into different sections, which is a good thing. There's a business of making that work tactically, not just at the level of goals. Does anybody feel like there's a word missing from that? Maybe NHGRI can use it again as a straw man; do people have a reaction to some big goals here? Eric. So, I like this statement a lot. I do want to note that, in the service of pulling it all together as one integrated thing, there is an aspect of functional genomics that's left out, which is the way genomics, and the sort of things that NHGRI has done with some of its large-scale activities, have informed basic biology, where the human is the best organism in which to look at weird new functions in the genome and detailed aspects of regulation, which in the limit fall under that, but aren't going to fall under it in the relevant next five years. And so I would endorse that as the key overarching flagship here, but I think an important other ship in the armada might be to make sure that we don't lose those other aspects of the biology that can be extracted through functional genomics. Evan. So, you asked whether there was any word not mentioned that should have been mentioned by now. I'm actually struck, having been at these meetings for what seems like forever, that I don't think I've heard once the term evolution, anywhere, in any context, which fundamentally underpins everything that we do in this room. And I'm actually shocked that it hasn't come up yet, and I think that in
some respects, and maybe I'm in the minority here, we've somewhat lost our way, in terms of, and I know I'll be in the minority, the over-emphasis on translation and discovery at the expense of fundamental biology, which is understanding genetic variation, genome organisation, and how we relate to all the other organisms on the planet. So, that was a great topic to bring up; unfortunately I'm at the closing end of my time slot for trying to expand on all of that. But I know there are some people; let's have a comment there at the back. Yeah, I just wanted to expand a little bit on the last section, where we were talking about broad populations, because I think to really reach broad populations we're going to have to go beyond the sequencing that's done by NHGRI. So I think we need the words implementation science, or translational research, to expand to very broad populations. Okay, I'm pretty sure Adam's going to tell me to shut up and close this. This was trying not to find conclusions but to bring ideas back out on the table, and I think, in fact, that last statement was a really important one; it echoes a little of Eric's statement about the impact on basic biology, as well as joining up this big business about human as the model, and human genetics being very appropriate for healthcare and clinical work. This means NHGRI has a headache. And with that, I thank you very much for participating; I'll hand over to Adam for our marching orders.