So thank you very much to everyone for an incredibly productive and thought-provoking meeting. I'm well aware that all of our brains are fairly fried by this time of day, certainly mine is, so I don't think I can talk for that long. What I wanted to do was go through some of the critical points that have come up again and again in the course of the meeting. I'm not going to rehash the points that emerged from each of the separate working group summaries; the documents that were circulated beforehand give a pretty good guide, and in the very near future we'll be providing what we saw as the key points emerging from each of those discussions.

In terms of the general points that I saw coming up again and again in the discussion today, the first was the idea that we need to start adopting an explicitly statistical approach to analysis across a wide range of different diseases. I think Mark made that case very well, along with the point that this is certainly possible even in the case of rare disease. It's not the case that very small sample sizes prevent us from at least attempting to define a rigorous statistical framework for analysing the probability of causality of variants seen in those cases. Here, the idea of considering patient sequence in the context of very large sets of reference sequence data does give us the possibility of approaching this rigorously.

The second point, which came up in multiple groups' presentations, was that experimental or computational support for a variant's impact on biological function is not in any way a replacement for compelling statistical evidence. It should perhaps be weighted as contributing to the level of evidence, but it's not sufficient to push weak statistical evidence over the line. And maybe there is still some controversy
about that point, but that seemed to be the general consensus of the group.

A key issue that I think is important to highlight, if only because it's not widely appreciated by many groups entering the next-generation sequencing space, is the degree to which sequencing technology is still immature. We heard some examples of problems that arise from NGS, and Gonzalo and I can both certainly testify to the challenges we've had with small insertions and deletions predicted to be highly functional variants, and the issues with validating those. The key points here are that there is a strong need for stringent quality control at the outset of a study; for applying variant filters that are appropriate, obviously balancing false positive and false negative rates; and for independent validation, which is very familiar in the clinical context but not always thought about in a sensible way in a research setting.

We also talked a number of times about the dangers associated with cryptic multiple testing. As we enter the world of rare variant analysis in an aggregate setting, it's not clear what the unit of aggregation or the frequency cut-offs should be, or exactly which functional criteria should be used; and in the context of functional testing there are many different possible combinations of experiments and parameters that could be used to define functional support for a variant. It would be relatively easy for researchers to simply try every possible combination, come up with something that fits, and generate some high-impact results as a result.
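To make the cryptic multiple testing concern concrete, here is a minimal sketch of how quickly the implicit number of tests grows when every combination of analysis choices is tried, and what a very conservative Bonferroni-style correction over all of those choices would look like. All cutoffs and filter names below are invented for illustration, not taken from any particular study.

```python
from itertools import product

# Hypothetical analysis choices in a rare-variant aggregation study.
# Every combination a researcher might try is, implicitly, a separate test.
freq_cutoffs = [0.0001, 0.001, 0.01]                 # allele-frequency thresholds
functional_filters = ["LoF only", "LoF + missense", "all coding"]
aggregation_units = ["gene", "pathway"]

combinations = list(product(freq_cutoffs, functional_filters, aggregation_units))
n_tests = len(combinations)  # 3 * 3 * 2 = 18 implicit tests

alpha = 0.05
# The conservative option discussed above: correct for every choice tried.
bonferroni_alpha = alpha / n_tests

print(f"{n_tests} implicit tests -> per-test threshold {bonferroni_alpha:.4f}")
```

Even this toy example shows the significance threshold shrinking nearly twentyfold once the full space of analysis choices is acknowledged, which is the spirit of the "set very conservative thresholds" suggestion.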
I don't know that the group came to a clear consensus on how that problem should be addressed: whether it comes down to defining a relatively small number of sensible statistical approaches that could be tried, or, as Gonzalo suggested, setting very conservative statistical thresholds that allow for the fact that researchers will try everything and see what works. Perhaps that's something we could discuss before we finish today, if there is any time at all, but it seems like a critical point to consider, at least in the follow-up discussion.

In terms of outcomes from the meeting, or things we really need to think about as we put the paper together: the first was an acknowledgment of a clear and urgent need for improved databases in at least two areas, or at least two broad types of database. The first is very familiar to everyone here, and that is the need for a single, comprehensive, and really thoroughly annotated database of human disease mutations. The model for that, how it might be conceptualized and put together, was articulated extremely well by the known-variants group.
David Goldstein also highlighted the idea of potentially extending that to some kind of wiki-like database of functional and phenotypic information associated with every base in the human genome, allowing for the fact that clinical variants are just a subset of that overall database. That's obviously a much more ambitious scheme, but you can see it being a staged approach.

The second class of database that was discussed was a repository of aggregated sequence data and, critically, controlled and detailed phenotype data, from both healthy controls and rare disease families. It would be ideal if there were the ability, at least for some fraction of the samples in that database, to recontact participants for more detailed phenotyping; that won't always be possible, but it would be ideal. I think there was consensus that these shouldn't necessarily be two separate databases. That will need to be discussed, but there would certainly be huge advantages to having control over the quality and format of data that spans both the control set and the patient data set. Obviously there's a need for this group as a whole to determine the extent to which these needs may or may not already be met by ongoing efforts such as ClinVar or the NHGRI data aggregation project. Some of these needs will certainly be met, though perhaps not all, so there will be a need to identify exactly what the space of unmet needs is.

Finally, an idea that was raised, perhaps not discussed at huge length but I think pretty compelling, was that we need to come up with, or at least assess the possibility of producing, a consensus pipeline for evaluating the probability of pathogenicity for novel variants or novel genes observed in a patient. We
obviously need a system that works across all classes of variation, and it's important that the validity of the system be assessed in a strongly standardized, prospective study, ideally in the same way as many other medical prediction tools are assessed. Heidi presented a model in her talk for how this could potentially work, but she also noted that there is relatively little consensus between clinical labs as to how these issues are approached. So getting together a group of people who understand each of the pieces that would be needed to pull such a pipeline together would be phenomenally useful.

Those are the areas I saw as the key overall themes. There will be individual recommendations and guidelines that emerge from each of the individual working groups, but as I said, I won't go through those at the moment.

The only thing that's bugging me, and this is really a question, is the asymmetry between compelling statistical evidence and compelling experimental evidence. I can imagine compelling experimental data that should be regarded as a replacement for the lack of compelling statistics, in situations like variable penetrance: in the penetrant case you have a really great story, and then the statistics don't bear it out because it actually depends on other things. I really can't decide whether this asymmetry, that statistics should trump experimental evidence, is right. I mean, even saying it seems like anathema.

I agree.
I mean, that's my bias, and I felt there were probably more people leaning in that direction than otherwise, but I'd obviously like to discuss it with the group.

I wanted to comment on this because it returns to the question I posed in a much more open way earlier today. I'm also intrigued by this, and it bugs me, perhaps for different reasons: I sit, as it were, on the other side of the table, representing this community within a journal. But there will be other editors handling other papers in which the predominant part of the story comes from the functional world, from people who don't have the insight into the statistics that most people at this table have. In the extreme scenario, if the paper or the story is deemed strong from a functional perspective, the weak statistical evidence that remains gets removed completely. So I keep thinking about it from the perspective that this is an informed group of people, but the message has to go out to a much wider community. That's my perspective on the same problem.

I don't think we mean to imply that functional work is not publishable, even in the highest-impact journals. It's just that people often put the human genetics in as a way of saying: this really proves that this is the gene responsible for this phenotype in humans. If that part of it is weak, then, people are arguing, the functional data can't be used to buttress that part of it.
So yes, I think it's critical. Here we're talking in the context of a study that says we believe this variant is causal in this disease, and I guess there will be some cases where you can come up with really compelling functional support, but in most cases you really need the statistical data to show that initial association.

But what happens if there are genomic backgrounds that allow high penetrance in some cases and no penetrance in others, even where the allele is 50-50 in cases and controls, but all the cases have some unmeasured environmental or genomic background, and it really is causative in those individuals? With an asterisk that there are a lot of other things that also have to be checked. That's why I'm not sure, because maybe that asterisk means you can fall back on the statistics if you have enough of the conditional variables measured.

It all depends on the claims; if you appropriately caveat it, then perhaps that is perfectly sensible. If it's 50-50 in the cases and the controls, but there's a bunch of other dependent variables that determine who has the outcome, then those other things are what need to be measured and associated. The 50-50 variant isn't causing anything by itself.

I just come back to, and resonate with, the point that Joel made: there's a distinct difference between two types of papers. One is where we started with a patient with a severe phenotype, or with a disease that we're focused on, we screened the genome, and we're trying to prove that this is the gene for that disease. That type of paper needs to rely on statistics.
There's no question about it. There is another type of paper, where groups that have a keen interest in a certain piece of functional biology, or in a certain gene from a functional standpoint, sometimes wrap in some human data in an effort to make their functional story or their gene seem more exciting. I think this is where a lot of people get into trouble, and where it's a little harder to judge. There could be a lot of stories, and a lot of functional data and models, that are very much worth publishing because they enlighten us on the biology of certain genes, and the human genetics may not be the leading part of that; but we also can't allow the human genetics to be used as a throwaway just to elevate the profile. I think that's where the difficulty comes in. And I apologize, I do have to leave.

I just want to say that 50-50 is interesting if it's a necessary but not sufficient variant; but if it's necessary, it needs to be in excess in cases. I just don't understand the model otherwise. Except, I mean, there is this strange model in epistasis theory that allows things to be 50-50, if there are other factors which promote disease in the presence of that allele and protect in the presence of the same allele and others.

But one thing I did like is David Goldstein's suggestion of having some set language where you say how compelling the statistical evidence is, so you don't have to say in every paper that the statistical evidence definitely implicates this variant or this gene.
You could say the statistical evidence makes this gene interesting, or says this gene is worthy of further consideration. You could have different levels, and you could then build on that and do other things. It is nice to be able to separate these, and that actually lets people do whatever they want: you could still publish a paper where the statistical evidence is completely not compelling, you just have some phrasing to describe what it is. And I'm sure that within the creative set of people we have here, we could find a phrase for "not compelling at all" that actually sounds not too bad, so someone wouldn't feel too offended writing it in their own paper. So we can devise our secret code for labeling variants.

I would just urge, since everyone's here, that it not be a secret code. If people want to create a bar for the word "causal", or a bar for any particular word, they should actually define it, because having an actual practical definition is really useful. That's actually where the problems all lie: one person thinks "suggestive" means this, another person thinks "associated" means that. If things can be defined, a lot of the problems disappear.

Second to that: one very useful thing that might come out of this paper might be some sort of mini-glossary. "Causal" comes with certain consequences in terms of burden of proof, and it needs to be clear where that proof is coming from in your mind; "damaging" means this, and "deleterious" means that. You could imagine trying to lay this out, obviously not as a comprehensive listing, but as a concise and useful list of what we should expect a certain term to imply. And for "causal", there's causal for a gene and then there's causal for a variant; those two lines of evidence should be treated as
being separate. Other comments on this?

Back up a moment. Maybe this is the consensus, or the emerging consensus, but I do think we should back off that statement a little bit. It still needs statistical evidence, including negative experimental results or whatever; somebody has to tone that down. What we heard all day is that there are different kinds of evidence to support significance, and that we would like to see papers address each category of information. Any given category may not be absolutely required, but it should be stated explicitly: we have strong functional evidence, we have strong genetic evidence, we don't have great statistical evidence but we don't have the sample size, or something like that.

I think statistical evidence is the best evidence, but there may be situations, well, there are scenarios where you simply can't get the statistical evidence. Take the paper we were supposed to read: I think those findings were pretty clear, but the implication of those two genes was based on one patient, accompanied by some pretty powerful experimental results. So it's hard to set these two things up as opposing each other. The biological evidence could be right; it could be that if you knock out this gene you get the disease. But that doesn't say anything about what happens if you have a non-synonymous variant, or even a particular premature stop, because the premature stop might not knock out the function.
That's key. And likewise, even more clearly, you could have very clear genetic evidence alongside a series of functional assays that show no function.

In terms of statistical versus functional evidence, and in terms of separating statistical evidence at the level of a gene or locus from evidence for an individual variant: at least currently, statistical evidence is obtainable for a gene or locus. In GWAS we have association; if we're talking rare variants, we have burden or dispersion tests, and collapsing variants together can bring sufficient statistical evidence. However, individual causal variants are a completely different problem. For rare variants, if the variant is a singleton, there's no way to provide any statistical evidence; in GWAS, LD complicates fine mapping and the identification of causal variants. So if we'd like to discriminate between causality of variants and causality of genes, then the value of statistical evidence relative to experimental evidence is very different.

I was just going to say that if you're going to claim that variant X causes phenotype Y, there has to be a bar that we have to live with, and it's going to produce some false negatives. But at the same time, I believe we should probably be able to conclude that a singleton is causative on the basis that it's not inconsistent with some genetic model and is supported by overwhelming other evidence. I think we have to agree that causality is a tough bar, and we're not going to get there in a lot of cases, but genetics should be the first, best line of evidence for calling something causal, right?
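The collapsing step behind the burden tests mentioned above can be sketched in a few lines: each sample is scored as a carrier or non-carrier of any qualifying rare variant in a gene, so that variants too rare to test individually can be compared collectively between cases and controls. The genotype matrices below are invented toy data; a real analysis would use proper association statistics and far larger samples.

```python
def is_carrier(genotypes):
    """Score a sample as 1 if it carries any qualifying rare variant in the gene."""
    return int(any(allele_count > 0 for allele_count in genotypes))

# Rows are samples; columns are rare variants in one gene (allele counts 0/1/2).
# Individually each variant is a near-singleton; collapsed, they become testable.
cases    = [[1, 0, 0], [0, 1, 0], [0, 0, 1], [0, 0, 0]]
controls = [[0, 0, 0], [0, 0, 0], [1, 0, 0], [0, 0, 0]]

case_carriers    = sum(is_carrier(s) for s in cases)     # 3 of 4 cases
control_carriers = sum(is_carrier(s) for s in controls)  # 1 of 4 controls

print(case_carriers, control_carriers)
```

In practice the collapsed carrier counts would then feed into a formal test (Fisher's exact, or a regression-based burden or dispersion framework), which is what brings the statistical evidence to the gene level rather than the variant level.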
The highlighted statement, to me, needs to be broken apart, unless it's covered somewhere else in the document. We want to say that experimental evidence indicating that a particular variant has a functional consequence is useful; it does not necessarily prove that that functional variation causes the phenotype. That's the problem. To make that leap, we can then use statistical evidence, or genetic evidence, or evidence from model systems, or whatever.

I think one way to capture that is to try to describe precisely the proximity of the assay to the phenotype. If you have a patient in whom the metabolite upstream of an enzyme is high and the metabolite downstream is low, and your variant causes that same exact biochemical effect in vitro, you're very proximate to the defect you're observing in the patient. Whereas something like mutating a transcription factor that only acts in a three-day window during first-trimester development is a much harder one to connect the dots on. So there's that proximity consideration.

The second thing is that we talked about statistics in two ways here. I thought folks were also talking about having some statistical way of thinking about functional assays: the specificity of the assay, and how likely it is that the assay generates that output as a non-specific rather than a specific effect of the variation. So there are statistics there, and then there's the statistical genetics data; those are two different kinds. Absolutely, great.
In the interest of time I haven't gone through the points from the individual groups, but I think an absolutely critical point made in the functional data discussion was that specificity argument, along with the fact that there needs to be a strong emphasis on multiple independent lines of evidence leading to functionality. We have Susanne, Ewan and Shamil next.

I think with statistical evidence you have to be a little careful, because you could get a very magnificent p-value that is just the result of all kinds of bias introduced into your analysis. So even a very significant p-value is definitely not enough on its own. You would like to see replication, and some other evidence would be nice as well. Replication is quite key, although of course in some cases it's difficult, I realize that. But we have to realize that a very small p-value by itself doesn't necessarily mean that that really is the p-value.

It's a similar point, but I've been in rooms with basic science colleagues having the same discussion the other way around. They say that an association, no matter how good the p-value, is never causation; it doesn't matter how strong the association is, until you have mechanistic data from several different places you cannot decide causality.
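The replication point above can be made concrete with Fisher's method, a standard way to combine p-values from independent studies: a finding that replicates yields a stronger combined p-value than either study alone, whereas a biased single-study p-value gets no such boost. The p-values below are invented, and this stdlib-only sketch assumes the studies are truly independent.

```python
import math

def fisher_combined_p(p_values):
    """Combine independent p-values via Fisher's method.

    The statistic -2 * sum(ln p) follows a chi-square distribution with
    2k degrees of freedom; for even df the survival function has the
    closed form used below.
    """
    x = -2.0 * sum(math.log(p) for p in p_values)
    k = len(p_values)
    return math.exp(-x / 2) * sum((x / 2) ** i / math.factorial(i) for i in range(k))

# Two hypothetical studies: an original association and a replication.
combined = fisher_combined_p([0.01, 0.02])
print(combined)  # roughly 1.9e-3, stronger than either study alone
```

SciPy users would reach for `scipy.stats.combine_pvalues` instead; the hand-rolled version is shown only to keep the sketch self-contained.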
So they would have this exactly the other way around.

I wanted to comment on the combination of functional and statistical evidence, because they shouldn't necessarily be considered separately. One example, similar to how we calibrate computational methods: if you have a gene in a Mendelian trait, and a number of variants with a lot of segregation data and full statistical support, and you know that a particular functional assay gets all of them right, as opposed to variants which do not segregate, then maybe for the n-plus-first variant you can trust the functional assay even if you do not have enough statistical support. Another consideration comes from early papers on burden tests, where multiple variants in the same gene are associated collectively with a complex trait. Some of those papers use the following approach: you do a functional experiment, stratify variants based on the in vitro results, for example into neutral and functional, restrict your statistical test to the functional set, and your p-value drops significantly. So you combine the two iteratively: first you use statistics to guide you to the original association, and then you show that putting the functional data into the statistical test provides much more compelling statistical evidence. So there's this idea of a combined functional and statistical approach.

I was just going to say that I think we will all know what we mean by statistical evidence sitting around this table, but it's actually a very hard term to understand, particularly for somebody who doesn't live in the
So I think we need to actually think about our better Expression for that and I don't want to split hairs, but I just you know Because we have statistical evidence that the functional studies proved something So I think we just be a little bit more clear and actually think about what we What is a better term that will will make sense to a wider audience about a sequence-based Statistical genetic approach or something like that. I'm my brain's a little fried, but I think we need a better term for that I just wanted to answer I do think that it's you know, we could we sort of take two perspectives and coming Sort of formulating things one person one extreme perspective. We'd say well, there's a lot of You know salty and ambiguity and things and we shouldn't have a very strong perspective another Another would be that we should take a very strong spectrum to be very prescriptive and you know, we should do this But I just want to point out having a conversation with magdalene about this is Imagine we take the former position and we don't come out with a very strong position on things What tends to happen is the community just de facto makes these decisions, you know papers will be published You know, they they will set the standard, you know people will refer to them So I think that it's actually intelligent now if you can make us a potentially strong statement to do that Not um, I may Daniel. Did you have any other questions? 
No, just to say that I will circulate the full list of guidelines as soon as possible to the whole group, so there is a chance to feed back quickly after the meeting, rather than letting everything evaporate.

I did have one slide to show on publication standards, on what to do in terms of publications, just to give some advice. I think we heard a lot last night, and again today, about tightening standards, or at least making them far more explicit: both what the criteria might be within a journal, and what it is that the author is proposing. I think we also heard that people should mostly be describing associations rather than causality; there are overblown claims for causality, and one needs to be very careful. I spent the first, I won't tell you how many, years of my career in epidemiology knocking down causality claims; we almost never could conclude causality. You essentially need a randomized clinical trial, and even then it's questionable.

Then there was the issue of higher and clearer thresholds for causality from the authors themselves: what is the threshold they are using, why, and specifically what are the criteria, so that they're very explicit about it. We heard that we want a structured proof of why the authors believe a variant is causal: OK, you're saying it's causal, you've defined your threshold; what's the evidence within your set of experiments that proves it? And then, of course, deposit the data on which that is based in a queryable database, which we've heard multiple times. So, in terms of what should be in a paper and what should be judged in a paper, does this capture it?
I mean stringent thresholds, being very explicit, providing basically a logical proof. Or are there other things that we need that are not captured here?

I'm not 100 percent sure I agree with tightening publication standards. Some journals have very tight standards right now. I think we should say that standards need to be stringent, but I don't think we should say that all journals need to increase their standards; I would even argue that some journals, a few of our very best journals, may have a propensity to conservatism. We don't need to be telling people who already have very high standards to raise them; they may be perfectly appropriate, but they should be high. And we would love it if they were uniform, though of course they never will be.

Much more consistent, then. There being a lot of published things that are not necessarily slam dunks, I guess less important than the stringency of the standard is the consistency of the terminology, and the layout, or whatever it is.
I think the claims should be consistent with the content, basically.

Terry, there's one thing missing up there, which is what Mark talked about in his opening talk: correcting the publication record. That's something we need to address somewhere. I'm not sure how; maybe we won't worry about how, and just say that there should be some mechanism, and the magic happens here. The beauty of science is not its consistency but its correctability, and I think we count on the peer-reviewed literature to do that.

So are you saying that we need to encourage people to pick away at existing findings more, or to be bold about publishing when they have found that something isn't true? I think so, yes. And perhaps that's something we need to build into the database: if something has been shown to be disproved later, or has not held up, that might need to be incorporated into the database.

Yes. I'm just going to make the point that we should try to be clear that this is not necessarily the same as what you would recommend for a clinician. Obviously clinicians have to use judgment: the risks of not acting on a real variant have to be weighed against the risks of acting on a false positive. These standards might help clarify the terminology, but they don't necessarily give you any sort of clinical guideline on what you should or should not do in any given case.

Right, and I think we did hear from David Demick that while the clinical labs depend on the research, and obviously have to interpret the research, we may provide a broader brush that they need to choose from. That's right.
I think, as I already mentioned, the ACMG guidelines will provide pretty stringent guidance on that front.

I almost wonder if we could do with an intermediate term. I don't know whether "implicate" would be a good word, or something else, and we can probably argue about that, but it might be nice to have a way to say: we think this leads to this disease, but we don't meet the standards of causality. A punchy single word that you can put in a title that says we've got some degree of proof, but it's not all the way there; an accepted word that everyone knows means we've done a lot of work, this is very exciting, but it's not yet a slam dunk. Maybe "likely candidate" or something like that. It's no good when it's too wimpy, but you don't want it to be too strong either.

The problem with those things is that someone who reads the paper is not necessarily going to think about the extreme care you took in choosing the exact words; they'll just say, ah, this is the gene for such-and-such. So I'd be very careful about that. Jeff, are you arguing for a numerical score that we should apply across the board?

Having said that these are research standards, it might also be worth putting in there that although these standards are designed for the research endeavour, they're useful things to think about when interpreting variants in the clinical arena. And we'll obviously wordsmith this a little so that when you see it, it looks a little better. Other thoughts? Any thoughts from the editors as to whether this is useful to you? Do you need more from us? You always want more from us.

Let's change "higher and clearer thresholds for causality" to "high and clear"; some journals actually don't need to raise that bar any more.
I think we're just about finished, so just to let you know the next steps. As Daniel said, we'll try to take at least these summary points, and maybe this slide, and get them out to you within the next few days or so, so you can take a look and make sure we haven't completely missed the boat. We would then hope to have a draft manuscript out to you very soon, recognizing that we all have a couple of big meetings coming up just after this, as well as other things, but we'll do our best to get it done quickly, so we don't forget.

In terms of authorship, you really do have to respond in order for us to meet the ICMJE criteria, so be sure that you get back to us. We know that all your emails work, at least as of today. If we don't hear from you, we'll have to conclude that you're not interested in participating, and we'll acknowledge you but won't be able to list you as an author. We want everybody to continue to participate, so please do respond. Anything from the planning committee? Yes, David.

So, are the slides available? They are and will be, though not immediately; they will all be posted on the NHGRI website. I'm not sure exactly when that will happen. I know the videos are available now, so you can watch them, but for the slides themselves, we can probably put them on a SharePoint. Well, as we said at the beginning of the meeting, if you don't want something shown, don't share it, don't show it. So I think we have all of the slides. Yes. And then we will talk, because that could be a problem. Okay.
So our plan is to post them all very soon.

I would like to thank everybody on the planning group, who did a fabulous job of working on this; the working group chairs, who also did an incredible job; and everybody around this table, who really were involved in these working groups, which was great. Also many thanks to Ian McCrory, who kept us all organized and ready to go. So thank you again.

I'd just like to finish by thanking Terry and NHGRI for actually funding this whole enterprise. It's been very useful, and thanks to you all for participating so enthusiastically in a long day. It's been great. All right, thank you. Safe travels; bye now.