So, I'm going to talk about statistical analysis, and as you'll see, this section really isn't independent of study design; we're going to revisit many of the things that came up before and put a bit more of a spin on actual statistical testing. I would like to draw a commonality between complex and Mendelian traits and show that many of the problems we have with Mendelian traits are also true for complex traits, and that although we can perhaps implicate a gene, it's often more difficult to say anything about the causality of individual variants, in particular those variants that are very rare. I think one advantage we have right now is that we can look at data on a genome level; we don't have to focus only on individual genes. This is also very true for Mendelian traits, where often people would start choosing their favorite candidate genes within a linkage region, and once they thought they found something they would stop. So now we're at a huge advantage in that we can look at the entire region and don't have to focus on a particular set of genes, our favorite genes. So, I would first like to start with complex traits. The rare variants for complex traits are going to have effect sizes ranging from large down to odds ratios approaching one. I don't think we can be as optimistic as many people were in the very beginning, that these rare variants are going to have huge effects and we only need a small sample size to detect them. I think it's already extremely clear that these effect sizes are not so large and we do need large sample sizes. Because we need such large sample sizes, it's really very important that we have international consortia that are able to share data, not only on the traits, either qualitative or quantitative, but also control data; that's really going to be the only way we can reach sample sizes large enough to detect association. It will also be very important to have publicly available cohorts for investigators. One thing we have to keep in mind is that for very rare variants, we will not be able to test individual variants even if we have very large sample sizes. So what we have to do in order to detect association is analyze the rare variants in aggregate, and what we're usually doing as of late is aggregating the variants across a region, which is usually a gene. This is going to be more problematic when we want to look outside of gene regions, because it's very difficult to know which rare variants we should aggregate. I don't think that's clear at all; at least I don't have an answer for what we should do when we want to start looking outside of gene regions. This is also a somewhat problematic approach because people tend to select different types of variants to test, and then they keep changing the set of variants they're testing or the tests they perform, and they tend to forget everything they did before. So even if they have a very small p-value, that p-value is not adjusted for multiple testing. That's also a problem that we run across.
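To make the aggregate testing just described concrete, here is a minimal sketch of a gene-level burden test, assuming hypothetical genotype, phenotype, and allele-frequency inputs rather than any particular study's data: rare variants in a gene are collapsed into a single per-sample carrier indicator, and carrier counts are compared between cases and controls.

```python
# Minimal burden-test sketch (illustrative only): collapse rare variants in a
# gene into a carrier indicator and compare cases versus controls.
from scipy.stats import fisher_exact

MAF_CUTOFF = 0.01  # assumed rare-variant frequency cutoff for aggregation

def burden_test(genotypes, is_case, mafs):
    """genotypes: [n_variants][n_samples] allele counts (0/1/2);
    is_case: per-sample booleans; mafs: per-variant minor allele frequencies."""
    n_samples = len(is_case)
    carrier = [False] * n_samples
    for maf, row in zip(mafs, genotypes):
        if maf >= MAF_CUTOFF:
            continue                      # only rare variants are aggregated
        for i, g in enumerate(row):
            if g > 0:
                carrier[i] = True
    case_carrier = sum(c and y for c, y in zip(carrier, is_case))
    case_noncarrier = sum((not c) and y for c, y in zip(carrier, is_case))
    ctrl_carrier = sum(c and (not y) for c, y in zip(carrier, is_case))
    ctrl_noncarrier = sum((not c) and (not y) for c, y in zip(carrier, is_case))
    table = [[case_carrier, case_noncarrier], [ctrl_carrier, ctrl_noncarrier]]
    odds_ratio, p_value = fisher_exact(table)
    return odds_ratio, p_value
```

A gene-level p-value of this kind still says nothing about which of the aggregated variants are causal, which is exactly the limitation described next.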
So even if we do the right thing when we're performing these tests, adjust for multiple testing, require a very small p-value and replicate the region, we still have the problem that, because we analyze these rare variants in aggregate, we may be able to say that the gene is involved, but for the very rare variants you really can't say whether they're causal or not, because within this aggregate test you're of course going to have both causal and non-causal variants. We also have to be very careful when we're performing our tests. One particular problem, probably much more important for rare variants, is that we see a very large difference even between very closely related populations. Even before we started looking at next-generation sequencing and analyzing rare variants, we already had a very strong clue that rare variants would be very different in populations that are otherwise very similar. If we look at the Ashkenazi Jewish population, the spectrum of rare variants is quite different from other European populations. So even if we're analyzing very neighboring populations, this is still a problem. Today we're basically using the exact same methods to control for population substructure that were used for common variants, and it's really not clear whether that's adequate for rare variants; I would say we don't know. Some of the ways we can avoid false positives, which I mentioned a moment ago but will revisit: if we do perform multiple testing, we need to control for it, and we need statistical tests that are significant, but those significance levels still have to be determined. I don't think we really know yet what level of significance we need, and that's something that really needs to be investigated. It's also necessary that the findings are replicated in an independent sample. Here we do have some questions about how we're going to replicate those findings in an independent sample. One thing to be aware of is that you would probably have to make sure you draw your new sample from the same population, because if you go to a neighboring population, the spectrum of rare variants can be quite different. What if you look at a neighboring population, resequence that gene region, and you're looking at a new set of variants? Did you really replicate it? What is the level of significance? And if you didn't replicate, is it because the spectrum of rare variants is different, or was it really a false positive finding in your first study? These are all new kinds of caveats that we have for rare variants in complex traits that weren't such a problem with common variants. So although I think we can have statistical evidence for regions or genes, or for higher-frequency rare variants, we still have the problem that we cannot have statistical evidence to determine the causality of very rare variants by testing individual variants, because you just won't have the power to detect association. So how about Mendelian traits? I think we really need more evidence than a single variant in an affected individual that is not observed in databases. Also, if we have a small family segregating a rare variant, that alone is not sufficient evidence that the variant is causal; you're going to have many variants segregating in these small families just by chance.
So that's not a very surprising finding at all. One thing I think is really very helpful is if we go back a decade or two and use linkage analysis to implicate a region. We can implicate a region either using large families or multiple small families, and we don't really have to reinvent the wheel here; how to go about doing this is in the literature, so we can just go back to these old methods to establish linkage to a region. What's very useful in implicating a gene is having multiple families with variants within the same gene. These variants can either be the same variant or different variants, and it's also helpful if you see that these variants are absent, or present only at very low frequencies, in controls. Of course, if you have reduced penetrance, you would expect to see these variants in controls as well. And you can perform statistical tests to show that there are actual differences in the frequency of these variants between cases and controls. So this can provide evidence that a gene is involved in disease etiology. However, we're still back to the same problem: what happens if we only see that variant in a single family? That's still not proof that the variant is involved in the disease etiology. And if you know the region of linkage, it's not surprising at all that the variant segregates with the disease within the family; it's going to be on the same haplotype as the disease variant, so of course it's going to segregate, I would say. So that's not really evidence that the variant is involved in disease etiology. In many cases we do have these single individuals; maybe we know they have a family history of disease, but we don't have any other family members available for study. We can also study these Mendelian traits and perform association analysis using these individual cases, and we can go about looking for association using the rare variant aggregate association tests that were developed for complex traits. However, this can be very problematic for diseases with locus heterogeneity. If you have high levels of locus heterogeneity, there will be many individuals who don't have variants in the same gene, or very reduced penetrance, and here you would need a very large sample size. But if you're looking at a rare Mendelian trait, it's very difficult to get these very large sample sizes. So that is a very big caveat for this particular approach; it's going to work in some cases, but not always. So what is some additional evidence we could use? I would say seeing variants at higher frequencies in controls than in cases can help rule out that a variant is causal. However, we can't use the opposite logic and say that if a variant is not seen in controls, that is evidence of causality. We have to remember that due to recent population growth there are many extremely rare variants, which in some cases are private. So even if you had an extremely large database, even of ethnically matched controls, and you don't find that variant in your database, that is not evidence of causality. So how about de novo variants? De novo variants, especially those that are non-synonymous, are very rare. But just by chance, if we look at enough trios, we're going to have variants that fall within genes for which it's easy to build a story about that particular gene being involved in disease etiology.
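As a worked illustration of the "just by chance, across enough trios" point, here is a minimal sketch of testing whether the total number of de novo events observed in a set of trios exceeds the null expectation. The per-trio expected count used in the example is a made-up placeholder; in practice it would come from a calibrated mutation rate model.

```python
# Minimal sketch (illustrative only): is the observed number of de novo
# mutations across a set of trios in excess of the Poisson null expectation?
from scipy.stats import poisson

def de_novo_excess_pvalue(observed_count, n_trios, expected_per_trio):
    """P(X >= observed_count) under a Poisson null with mean
    n_trios * expected_per_trio; sf(k - 1) gives P(X >= k)."""
    lam = n_trios * expected_per_trio
    return poisson.sf(observed_count - 1, lam)

# Hypothetical example: 120 coding de novos observed in 100 trios against an
# assumed expectation of 0.9 per trio (i.e. about 90 expected overall).
print(de_novo_excess_pvalue(observed_count=120, n_trios=100, expected_per_trio=0.9))
```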
So I don't think that when you have a de novo event, that evidence on its own is enough to say that a variant is causal. So how about experimental support? I think that experimental support is complementary to statistical support for a phenotype, and it definitely can say something about the causality of that particular variant. But we have to remember that just because a variant is functional, it does not prove causality. It's really unfortunate, because for many years we've used the words functional and causal as though they mean exactly the same thing, which they clearly don't: a causal variant has to be functional, but a functional variant does not have to be causal. So how about some statistical tests for very rare variants? One thing we could look at is whether we have a class of variants that are only seen in cases but not in controls; that could provide additional evidence of causality and could also be held to high statistical standards. However, in many cases this is not going to work. One thing we have to be very careful about when we perform our statistical tests is that often we peek at our data and say, well, if we just test for a difference of this and the other thing, then we'll find something. If you hadn't taken a peek at your data, that might not be the obvious test to perform. So you are actually performing multiple testing without actually doing the tests. We have to be very careful about that. So here, of course, there are many other discussion questions, but here are a few to get us going. Great. Thank you very much, Suzanne. So yes, please. One thing I'm very curious about is that increasingly, with whole genome sequencing, we're going to be seeing that a lot of the variants are more complex events like CNVs and so forth. And I'm really interested in how people are going to think about the statistics of these, how they're going to aggregate them together, or how similar they have to actually be to be considered one thing, and so forth. Just curious if there are any thoughts on that? So that may not be a statistical question, perhaps more of a structural question. Do those of us who do structure want to comment on that? Nobody does. Oh, come on. I think it's very problematic. And also, right now we still have the problem of just accurately calling indels and larger copy number variants. So I think how we collapse them is also going to be a problem. Yes, please. I'll just say, I think it's a problem that's going to occupy us for the next four or five years for sure. But I think the best first-pass approach is just to translate all forms of variation into some sort of common space, which is the predicted functional impact of that variant on a gene. So Daniel's work, and other work by the loss-of-function group in the 1000 Genomes Project, has already started with maybe the easiest case scenario, which is mapping all types of variation, all classes of variation, into that loss-of-function space. Now, less pathogenic types of variants will be harder to interpret in terms of indels and CNVs. But that'd be my first-pass suggestion. David? So I think there are a lot of technical challenges. There are technical challenges, for example, in identifying whether a set of CNVs includes some subset that are in fact the same or not the same. So how do you group them?
There are technical challenges, even in what you're suggesting, Don, for sure, in trying to assign whether variants are in the same class or not; that's usually very difficult to do. There are all sorts of technical challenges, but maybe our biggest problem is actually just intellectual integrity, I would say, because even within a class of variants, there isn't a consistent effort right now to correct for all the tests that would be represented within that class. If we only did that. Take de novo mutations: they're always there, and so the presumption that when you get a de novo mutation it's causal is patently absurd. And yet here we are in 2012, where this is happening. But it's actually a trivial exercise to say, look, we're going to have to account for what the null distribution of de novo mutations is and ask whether we have an excess beyond that. And Mark has very rightly emphasized that there's often some little signal in the data, but maybe not even a significant excess beyond the null expectation, even when you know or strongly suspect that some of the de novo mutations are causal. So I guess what I would say is that maybe the first step is simply to require clarity about what the hypothesis is: that I'm going to test all variants of class X treated as a group, with some kind of prior hypothesis, and that this is clearly justified. The idea that we do away with that doesn't make any sense, because clearly loss-of-function mutations are different from ones that are not assigned a function, and so on. But within identified classes of variants, we have to insist on absolutely strict rigor in doing the statistics. I would say that's probably the most important first step we could take. We have, Jeff, did you have a comment, and then Shamil? Just a brief one, specifically about structural variation and the change in technology. At Sanger we're involved in a project called Deciphering Developmental Disorders where, as a collaboration with the regional genetic services, they submit undiagnosed cases and we do both very high resolution array CGH and exome sequencing on the trios. And we just have a huge problem, essentially, because the array CGH we designed has dense coverage of all the exons, so we tend to find quite small insertions and deletions of one or two exons, or even a single intron, say. And the problem we have is, if you think about the more traditional, very large copy number events in developmental disorders, the functional question you're asking is: okay, it deletes eight genes, so obviously that is having an effect on those genes. What we're dealing with now is obviously much more difficult to assign any causality to, especially because we don't have breakpoints in any sense with the array CGH data. And we also have this thing where we're building up a pretty decent and detailed map of copy number variants in controls, both with published data like the work that Don and the 1000 Genomes Project have done, and also from running our specific technologies on control populations in the UK. But you get this situation where the boundaries of the variants are very fuzzy and they have varying overlap. And how do you decide? This thing has been seen in 10% of the population but overlaps by, like, 46%. And at the moment it's a completely unanswerable question, how you interpret that.
Because we make up some rules of thumb, basically, but I think it's a massively non-obvious problem, and it might not even be possible until we do these things entirely with sequencing and can routinely get breakpoint definition. Great. Shamil? So I have a very technical statistical comment on what David called "let's test together variants in functional class X", and this is how most groups do the analysis. At the same time, if you read the methods literature, there are many methods suggesting that you group all variants together and use functional weights, some sort of probabilities reflecting how much we believe one class of variants is more functional than another, weight every variant, and run one aggregate test, rather than testing this functional category, then doing a separate test on another functional category, and then a third test on yet another functional category. Surprisingly, in real studies, as soon as you're on a conference call discussing a real study, even though all this literature is available, people are not using these weighted tests. People think in terms of this functional category, that functional category, and the other functional category, and do the aggregate analysis separately for each. And I'm not sure that is the optimal way to go forward. Before I call on Daniel, does somebody want to respond specifically to Shamil's point? I think it's a great point, but I actually do think it depends on the context. In some diseases, there's clarity that there's an important contribution of de novo mutations just from the pattern of disease presentation, and in that case I think it's entirely appropriate to have a focused study on de novo mutations. So there are circumstances under which you pluck out a class of mutations and say, okay, I'm going to analyze that class of mutations and ask for evidence within it. I think there are contexts in which that is appropriate. Although obviously in other settings, looking for epilepsy risk factors in general or schizophrenia risk factors in general, what you describe is perfectly true. People say, okay, I looked for CNVs of this sort and here's my p-value for those CNVs, and then you look for CNVs of a different sort, and then you move on from CNVs to something else; certainly that is happening. We had Daniel, then Beth, then Mark. No, no, sorry. I just wanted to say that I think one issue with this functional weighting is that it tends to very much overemphasize genes and things we know. Anytime we start to think about regions of the genome principally outside of genes, which we don't know as much about, they always just get down-weighted, and I think in a sense maybe unfairly or incorrectly. I think we have Daniel, then Beth. I wanted to just say a few words about the New England Journal of Medicine protocols and our requirements of authors of reports of clinical trials. This partly goes to your point, David, about pre-specifying study design. We require, in order for a paper describing a clinical trial to actually arrive on an editor's desk and be considered, that the trial has been registered prior to the enrollment of the first patient. If it's not, then the paper is not considered; it doesn't go out for review. And when we publish these trials, very often we'll publish the protocols alongside them.
And before a paper goes to press, in fact before we really seriously consider accepting it, we ensure that there's accurate correspondence between the trial protocol and the report. I do think some kind of documented pre-specification could be helpful. The extent to which the community wants to model this on that kind of system would require consideration. There are also other ways, I think, that journals can consider helping to ensure that study design is clearly described. These would include codified descriptions of clinical phenotype, criteria for labeling variants as causative or of unknown significance, and easy mechanisms for correcting previous claims of causality that have subsequently been determined not to be correct. And I think the easiest way for editors to do this is to have dedicated sections in methods and appendices for these types of information. No, that's an excellent point. In fact, if you wanted to show those so that we could get them all down, I think you can ask him to link to your machine. If you have them on your machine; you may have them just in your brain. We'll get them from you later, Daniel. So I think both Suzanne and David Goldstein mentioned these opportunities for massively parallel post hoc analysis, and I think it obviously is a critical issue in this complex space at the moment. In the GWAS era, as Mark mentioned yesterday, there were clear standards established for what genome-wide significance is and how it can be defined, and in fact one of the reasons that GWAS have been so successful is because everyone takes a very tightly standardized approach. How close are we to being able to define that in the rare variant setting, though? Is there any move towards consensus? Is there a process for building consensus? Can we reach a GWAS-like stringency of approach? I think we will, but I think, if we want to be truthful about it, we still don't really know what we're doing as far as analyzing these rare variants, and we're trying different things. But I think we will reach a consensus, not only for analysis, but also for data quality control, another issue which I didn't bring up and which is also extremely important before even beginning to analyze the data. We do have Greg, but I might just point out that a lot of that clear consensus we had in the GWAS era came out of a meeting very much like this. So we're hoping before you all leave at five today that we'll have that consensus. We may not, but certainly in the manuscript drafts that go around, we need to come to it, because I think this is our chance. On that point, I would just be a little bit pessimistic, in the sense that we really don't know: part of it depends on the fact that the whole idea of exome sequencing rests on an implicit assumption that things that disrupt proteins are a priori enriched for disease causality, and we don't know the extent to which that's true. But getting that quantification right would be important for calibrating the false discovery rates of the p-values you get on the back end. And we're a long way from knowing what a promoter annotation is worth, or what a non-synonymous annotation is worth, or a hypersensitive site, or any of these things, in terms of how much it enriches for causality versus not. So it's unrealistic, I think, to expect to get a p-value threshold, for example, that we can apply.
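Shamil's earlier suggestion of a single weighted aggregate test, and the point just made about what different annotations are "worth", can be made concrete with a small sketch: every rare variant contributes to one per-sample score, weighted by functional class, and the scores are compared between cases and controls. The class weights below are illustrative assumptions, not calibrated values.

```python
# Minimal weighted-burden sketch (illustrative only): one aggregate test with
# per-class functional weights instead of separate tests per category.
import numpy as np
from scipy.stats import mannwhitneyu

# Assumed weights; in practice these would have to be learned or calibrated.
CLASS_WEIGHTS = {"nonsense": 1.0, "missense": 0.5, "synonymous": 0.05}

def weighted_burden_test(genotypes, variant_classes, is_case):
    """genotypes: array [n_variants, n_samples] of allele counts;
    variant_classes: per-variant class labels; is_case: per-sample booleans."""
    genotypes = np.asarray(genotypes, dtype=float)
    is_case = np.asarray(is_case, dtype=bool)
    weights = np.array([CLASS_WEIGHTS.get(c, 0.1) for c in variant_classes])
    scores = weights @ genotypes          # one weighted burden score per sample
    _, p_value = mannwhitneyu(scores[is_case], scores[~is_case],
                              alternative="two-sided")
    return p_value
```

Whether a rank test, a logistic regression, or an overdispersion-style test is applied to the scores is a separate choice; the point is only that the weighting replaces category-by-category testing.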
So David has a comment, but I might just stimulate our colleagues here from ENCODE, who are obviously interested in things other than protein coding genes. I realize it's still 6.30 in the morning on the West Coast, but if you could have another cup of coffee and join in the conversation. I was just going to say, I also am pretty pessimistic about our being able to use the basic model from GWAS to figure out the right way to analyze sequence data. I think what we can do is use that experience to identify things that are clearly wrong. So I think we can really find bad practice, but I don't think we can define good practice with that example, and the basic reason is that I myself, and there may be variation of view on this here, have an orientation that we can't escape the biology when we're interpreting sequence data. And for that reason, it'll be very difficult for us to establish a standardized statistical framework, whereas in GWAS the biology was entirely escapable, because the appropriate way to deal with the variants being interrogated was to treat them all the same. That fundamental distinction means that we're not going to be able to establish best practice in the same way. So we have Shamil and Nancy. Let me just respond to that point. I wasn't suggesting that we take GWAS as the model. What I was saying was that five years ago we were having the same conversation: oh my God, what are we going to do, nobody knows how to do this, et cetera, et cetera. And we came out with some things. Okay, Shamil. So I think there are two different problems here. We're saying it's difficult. One problem is implicating genes through collective analysis of rare variants in a gene for a complex trait. The second problem is finding individual variants. For implicating genes, I don't see why following the GWAS paradigm is impossible. We have burden tests and overdispersion tests, and we run them. I think there is reasonable agreement on what constitutes an exome-wide p-value, and even if we see some deflation, you can do permutation on the whole exome. So statistically, the picture is reasonably clear. As soon as we're talking about the function of individual variants, then I think I agree with Suzanne: there is no way, in the purely statistical realm, to tell which of these variants exactly is functional and which is not. That's a much harder problem. But for implicating genes, I think we can simply follow the GWAS paradigm. On Greg's point, we're only looking at protein coding genes in all of this, and I think many of us in the room, certainly I do, believe that the majority of functionally significant nucleotides lie outside of protein coding genes. It's just that I have no idea how to build burden or dispersion tests to take a similar approach in non-coding DNA, and I think we're lacking ideas for how to proceed here. Nancy and then Mark. I want to come back to a point that Suzanne made very well, but that I think we need to keep in mind. We're used in statistics to recognizing missing data problems and trying to quantify what that means. Let's for a moment imagine that we actually know and understand everything in the genome that has functional consequence; that it's not a mystery anymore what variants can have an effect on a gene's regulation or a gene's function, and that we even know about all of the other functional units. It's still a big problem.
We have to recognize that for any given phenotype we don't yet know what functional variants can actually be contributory, and it will be different for different phenotypes. We will often get positive results from burden tests for two different diseases where the contributory sets of functional variants may be non-overlapping. So it's a hard problem even if we actually knew all of the things we don't know today. And so I think we also have to be careful about our language with respect to the word functional versus something like contributory. A lot of the variants that will be tested, or that we'll use weighting factors for, we really are weighting using information that we know about: these variants do have function with respect to the gene, but they still may not be contributory to any phenotype we're studying. So language and precision also matter, particularly in a publication context, but I think even in accurate communication with each other. So I want to agree very strongly with Shamil. What's not possible for us to do is to come away with a single number like we did in GWAS space. We all recognize this is a much more complex problem, and there are, as David suggests, a number of different types of studies that can be embedded in a genome or exome sequencing study. We don't yet know how to read and weight the non-coding component of the genome as we move from whole exome to whole genome. But at the same time, I think there is no reason why we can't outline the statistical principles by which we will analyze these data and deliver those concepts to ourselves, our community, and journals as to how one should approach the problem of evaluating things. Much of the time, as David says, this is just eliminating stuff that's patently wrong. But at the same time, the genome is finite. I'm very comfortable that, with access to very large exome sequencing data sets, we now begin to understand the content and the frequency distribution, in each functional category, of how much variation there is in an exome, or in any number of exomes, and can develop statistical inferences for particular questions that we might want to ask from that. There's not just one question and not just one analytic approach, and I think that's right; as Jeff pointed out, depending on what the underlying architecture is in a particular gene and a particular disease, the best approach is going to be different. So it is much harder, but I think there are common principles, as Shamil has outlined, that we can all aim for, and as we learn how to interpret the rest of the genome, we can modify those and so forth. Jeff. I just wanted to make an observation: if you read the literature of the last couple of years or so in journals like Genetic Epidemiology and Bioinformatics, there's a huge flourishing of different statistical approaches to combining and collapsing rare variants in sequence data. And it reminds me of the pre-GWAS years, when there was a huge flourishing of haplotype-based association tests and very complex and esoteric epistasis tests. As it happened, the problem at that point wasn't that we needed a different statistical test; it was that the dataset wasn't really there yet. And in fact, almost all of the GWAS findings were based on statistics that were invented eight years ago.
And I don't necessarily think that's going to be the case now, but I do have this sense that the process of exploring the different statistical ideas is a useful one, but that my guess would be that eventually we'll get to the point where there's an agreed way to do it. As Mark said, there are various different analytical questions, but we will eventually settle on a reasonably straightforward statistical way to ask each of those questions. And it's going to be a matter of accumulating the right types of data to be able to start, because right now there haven't been many published actual associations from any of these 200 methods. Yes, Jay? So I guess it seems like, if one were to publish a rare variant complex disease study, one would presumably, or a reviewer would require one to, go through the exercise of trying to calculate a p-value for those things. But for Mendelian disorders, I think people often, maybe even typically or nearly always, don't calculate a p-value. You say I have de novo mutations in six to eight of my probands, that's a clear result, but no one goes through the exercise. I don't know how people... maybe the guidance should be that people always calculate a p-value. Even the sort of vague standard of a second family: a second family and you're done, but what does that actually mean? A second family in a gene that has a high mutation rate? So, yeah. Right, so again, drawing the analogy to where we were with GWAS: there was actually a pretty big debate between people who wanted to set a universal threshold and a kind of Bayesian type of argument, and the universal threshold essentially became the standard because it turned out that it probably was more or less right to treat a lot of the variants the same; we couldn't identify a class of variants which were more likely to be associated. It's not totally clear we're in the exact same situation, so until we at least have more experience under our belt, we may end up being more in a Bayesian world. The problem is, of course, that without the experience it's very difficult to write down exactly what those priors are, but I think it was, for a while, a useful framework to think about it. In some sense, calculating p-values from multiple families can also be thought of in a Bayesian context: you have some prior probability that the gene is involved, and even for known variants, I think in a lot of the things that will come up, that may be a useful context to say, here's a prior probability, then what's the probability of observing that data under the model of this gene being causal versus random chance, and then what's the posterior probability that this is actually valid, either that the gene is associated or that the variant is actually contributory. I think a critical point that's come up a couple of times now is this idea of some sort of weighting approach. But I haven't yet heard how we actually go about empirically calculating those priors for the different classes of variants. What exactly is the right approach to get those weights? How should we approach this? I think that's extremely difficult.
First, because you would have to work on functionality, and the probabilities would have to be about functionality, so there you're kind of equating functionality with causality; and we're in an even worse position than that, because a lot of these priors for functionality are not all that good. We could also think of a variable selection approach, which could help us implicate a specific gene. Again, we would still have the problem of saying something about individual variants, even if we were just selecting out a subset of the variants. Of course, then we would pay the price for multiple testing, because if you're using variable selection, you have to consider that you didn't perform just one test; but that might be another approach. And Shamil might want to comment on that some more, because he's done some work on that. So I'm happy to comment here. What I think we do not know about the functionality of rare variants in complex traits is whether these are the same types of variants we see in Mendelian traits, because part of the literature suggests that these are all heterozygous carriers or something like that, and some literature assumes that these are functional hypomorphs. So if there is a gene involved in a Mendelian trait, for example in hypercholesterolemia, and then you see an impact on general variation in cholesterol, people would say these are hypomorphs, not the same exact mutations that cause the Mendelian phenotypes. However, we can take HGMD, and we all heard how suboptimal HGMD is, where we know mutations involved in Mendelian traits, and Heidi may come up with a better database. And what we saw in Heidi's slide yesterday: there is a gene where we know how many nonsense variants versus how many missense variants in this particular gene were confidently implicated in a Mendelian phenotype. And in a Bayesian sense, this would give you a prior, for example, for the relative weight of a nonsense variant versus a missense variant in this gene. So I agree with Suzanne that the problem is very hard, but there are some potential ways to train the system and learn our priors for various functional categories. Thanks, we have Greg and Ewan. Yeah, I was just going to follow up on Shamil. A lot of this is going to have to be driven empirically, right? We just collect knowledge about variants that we believe are causal, variants that we believe are not causal, and everything in between, and try to learn what their properties are. We'll talk about this later in a few of the annotation working group slides. And the reality is that we're going to have to live in that sort of empirical world, where it's going to be based on permutations and simulations, because we're not going to have true biologically defined and quantified weights for a while yet; but we can do things empirically to get useful measures if nothing else. So, a related point from the other direction, touching on a few of the ideas to do with Bayesian analysis, is the unit of collapse. We haven't really talked about this here; we've just sort of assumed that the gene is the unit of collapse, and we've talked about the thousands of collapsing tests. I think it's a very difficult problem, but maybe a comment on: at what level do we collapse? Do we collapse at the exon, at the transcript, at the gene, at the pathway?
And if we want to talk about pathways, we're talking about years of biology that have potentially fairly securely tied one gene's functional effect to another. So do we leverage that, or do we take more of a GWAS approach and step back and be agnostic? Yeah, I just do want to add to that comment. I do think people sometimes also don't appreciate the transcript complexity of the average gene. The average gene really has many, many different transcripts, and some of them are only very rarely seen and some are very commonly seen. I think people really have to think about how to weight all the transcripts properly, and how to exclude some and so forth, to really think about how to aggregate things properly for a gene. I don't think it's going to be possible to come up with a single sensible approach to weighting, and it will obviously vary from phenotype to phenotype. But I think it'll even vary within a phenotype from gene to gene, in that for gene A it may be that you need some loss-of-function mutation to confer risk, and so the most powerful weighting there will be essentially to put all the weight on loss-of-function mutations, while for gene B it might be that it's something to do with different transcripts, so it's a set of splicing or expression-regulating SNPs, and the weight should go there. So I don't think it's a realistic goal to say, oh, we can come up with the right way to weight different coding and functional variation; I think it's going to have to be a collection of things that are appropriate in different situations. So let Mark respond to that and then Greg. Well, I just do want to respond. I do think there is a natural way to think about the entire genome uniformly, and that is in the framework of selection. Different bits of the genome are conserved to different degrees and are under different degrees of selection, and that's a fairly uniform thing you can use: throughout the entire genome it's kind of a universal standard, to some degree, of how important something is. So we have Greg and then Gonzalo. Right, so just a comment here: in terms of trying to predict from a phenotype what underlying functionality should be enriched or not, I really think that's unrealistic. Should diabetes be more sensitive to splicing versus missense versus something else? That's really difficult. But there are different modes of annotation. The comment was brought up about what to do if you suspect a de novo origin for a phenotype: that's a genetic argument that you can build from family history, and you can build a better prior that weights that accordingly, but from the point of view of function, CNVs versus missense, it's really hard to do that from the disease point of view. There's no way to predict that. Yeah, so on this idea of predicting weights, I actually am very close to Jeff's point that it's very hard. A bit of data that's out there: some of my colleagues in the exome sequencing project, looking at LDL cholesterol levels, where there are a number of genes in which rare variants have been implicated, actually applied a series of burden tests and asked which of those signals can be picked up, and the optimal test was very different for each one of those known loci.
It turns out that, for example, in LDLR for the LDL phenotype, almost all the variants are important. They're singletons, often premature stops, with very obvious functional consequences. If you look, for example, at PCSK9, there's a variety of relatively frequent variants there that are important, and so the optimal way to identify one or the other is different, even though it's the same phenotype and these are very well-established genes. If I were to try to speak as a statistician, and you asked me what the way is in this context to be strict and avoid making false discoveries: you set the threshold assuming that people have tried all possible weights, and it's going to be a more stringent threshold than if you used any one set of weights. It's very hard, at least where we are now, to believe that anyone started with a single set of weights, did just that, and took their study through to completion, and that if the single set they started with didn't give a result, they wouldn't have modified it. So I think what you want is to figure out, if you used all possible weights, what the threshold would be. There will be a threshold; it'll be more stringent than if you used no weightings. I mean, are there effectively infinite ways to perform this analysis, though? Wait a minute. Is this the same as with the variable threshold test? In fact, those tests are actually very helpful in thinking about, should I define rare variants as below 5%, or below 1%, or just singletons? You try all the possible cut points, and it actually is just a slightly more stringent threshold, but much easier to interpret than looking at all the tests individually. Yeah. No, I have struggled with the idea of how to employ weightings, because in spirit I would like to embrace Shamil's model wholeheartedly. Looking at the sequencing we're doing, mostly focused on childhood neurodevelopmental and neuropsychiatric outcomes, we know the most severe genetic cases we look for involve a host of genes for which the mechanism of action is clearly de novo heterozygous gain-of-function missense mutations, and we have another whole host where the mechanism of action is primarily loss of function, either in a recessive or a haploinsufficient fashion. And a single statistic that weights all the variants can't be optimal in that setting, when we're looking for so many different genes that are going to have different modes of action. But maybe a higher-level set of analyses, with different categories in it that we might assign weights to collectively, could achieve this. It doesn't bother me very much at this point to have a small set of sensible categories that we explore, and we might learn how to do this better; I think we'll obviously have to as we move to non-coding variation, where there are not likely to be such clear-cut, discrete categories as gain of function and loss of function. But it's something we wrestle with right now. So, a small comment on Gonzalo's point: when you're using several tests, and this is similar, I agree, to the variable threshold test, what's helping you is that these tests are highly dependent.
So the multiple testing penalty that you pay is much smaller than the true number of tests would suggest, because different weightings and different frequency cutoffs produce highly dependent results. These are not truly independent tests, and when we're talking about this multiple testing, it's not as scary as it may have sounded. Yeah, just on this: this part of the conversation is, I think, incredibly important, and, just to state my bias, I'm a huge fan of weights and a huge believer that Bayesian reasoning is going to be mandatory here. In response to your question, which I think was very pertinent, about whether there is an infinity of models: in principle there is, because you can change every weight along a continuum, but if you write down your model and you get a hit, you can do a sensitivity analysis to ask under what range of changes to my priors I still get the same result. Then a reviewer or a skeptic could look at that and say, okay, is that range of priors, which are all in an equivalence class, reasonable? And if it is, then maybe you've made some progress in terms of the hits. I'd just like to say, if you're not Bayesian, which maybe you don't have to be, then looking at one of these studies without priors or without weights is the equivalent of saying all previous biology is irrelevant and I want biology to bang me over the head with this signal, otherwise I'm not going to believe it. Sometimes biology will do that, but in many interesting cases it won't, and so you need to use previous biological knowledge as the context for looking at the data. Although it is worth noting that GWAS was successful despite not requiring biology to bang people over the head, and in fact we were much more successful than in the candidate gene era, where we did weight in some way. And in fact, biology didn't bang us over the head: 80% of the hits were in non-coding regions. Now that's a biological statement, but for the first two years it was like, well, these must be wrong. You have to consider the possibility that what we know now is negligible compared to what we don't, and in that case you're very comfortable with throwing things away; when microarrays came out, David Botstein said he was willing to throw away the textbook of biology and just base his new understanding of biology on what the microarrays told him. Just to follow up on that: if we weren't Bayesian, we wouldn't bother exome sequencing, right? The whole enterprise is predicated on a prior assumption about where function lies. So I would add to the Bayesian camp that when we say we don't know which variants are important for a particular disease, that's true, but that's the whole idea of Bayesian statistics: you put weight on what you know and then you learn from the data. So I guess we could ask, are we gathering the data we need to learn from the data? And I guess what we've been hearing is that there is so much that goes unreported, that could be interpreted or could be useful, that we may not be capturing it. Oh, yeah. No, I mean, being the person who opened this particular can of worms, the point I was trying to make is, first of all, that in GWAS it really was not the case that being a Bayesian particularly helped.
And I don't know how much it is going to help here, and I think my main point was that it might, but I don't think we yet know. We're getting a little bit of a sense of whether it might or not, and some of the cases that Gonzalo and Mark alluded to would suggest that if you group all genes together, maybe Bayesian approaches might not even be that helpful. But I agree, I think we probably need to get a better empirical sense before we decide whether these sorts of approaches are going to be advantageous, and I would say, even though I brought this up, that in the absence of knowing whether they're advantageous, the safest approach for not making false discoveries is to assume that you don't really know that much and use a sort of uniform statistical approach. Again, Bayesian approaches can actually incorporate a range of priors, and you may actually be able to have a principled universal approach that effectively incorporates some unknown weighting. The place where I think it might actually be more helpful is when you're looking at an individual gene, and I guess we'll get to that section later. I just wanted to amplify Greg's comment that essentially, when you're doing an exome experiment, you already have a prior. Likewise, I think people are not taking into account structural variants and whatnot; there's essentially a prior there that they're not as relevant, and so forth. So if I could just add a point to that: it means we have to be really careful about being sensitive to our implicit assumptions when we think we're being unbiased and just letting biology hit us over the head, because we haven't looked at 97% of the genome, since it's not exome. So that would be a very good thing for us to capture in a manuscript: it's really important to be extremely rigorous about what your implicit assumptions are in the study design from the very beginning. And so, yes, sir? Okay, so actually just a balancing point. There are different layers to this question. One is what should be published, and I think we could actually be very liberal in saying that many of these things can be published, but there is a different layer of evidence for what you use to counsel a patient and what establishes some truth of nature that you believe as a fact. It's reasonable to say that in any one study you could decide to use whatever weights you want, come up with a very interesting story, publish it, and use the biology and whatever. But we should also be clear that very likely that single study doesn't get to the level of evidence where we have a truth of biology and should go and counsel patients on that basis. So I think you could imagine a range of levels of evidence for these questions, and for the ultimate level, where we think this is a definite true finding, it's important to be quite strict, or we will just have a jumble of things.
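The Bayesian bookkeeping and the prior sensitivity analysis raised a few comments back can be written down in a few lines. This is a minimal sketch with invented numbers: a prior probability that a gene is involved is combined with a Bayes factor (how much more likely the data are under the causal model than under chance), and the prior is then swept over a range to see which priors still leave the posterior above a chosen threshold.

```python
# Minimal sketch (illustrative only) of a posterior from a prior and a Bayes
# factor, plus a sensitivity sweep over the prior.
import numpy as np

def posterior_prob(prior, bayes_factor):
    """Posterior probability that the gene is causal."""
    prior_odds = prior / (1.0 - prior)
    posterior_odds = prior_odds * bayes_factor
    return posterior_odds / (1.0 + posterior_odds)

def prior_sensitivity(bayes_factor, threshold=0.8, priors=None):
    """Range of priors for which the posterior stays above `threshold`."""
    if priors is None:
        priors = np.logspace(-5, -1, 200)
    posteriors = np.array([posterior_prob(p, bayes_factor) for p in priors])
    supported = priors[posteriors >= threshold]
    return (supported.min(), supported.max()) if supported.size else None

# Hypothetical example: a 1-in-2000 per-gene prior with a Bayes factor of 500
# gives a posterior of about 0.2; a Bayes factor of 2000 keeps the posterior
# above 0.8 for priors down to roughly 2e-3.
print(posterior_prob(prior=1 / 2000, bayes_factor=500))
print(prior_sensitivity(bayes_factor=2000))
```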
Yeah, I would echo that idea, as well as what Mark was saying about being very clear about what hypotheses you're asking and pre-stating those up front, because there are clearly going to be cases where you can make a strong case given segregation and given how you understand the phenotype; that's going to carry evidence, and you're going to be able to construct really clear tests of those questions. Those very refined, specific, well-constructed questions are different from other kinds of well-constructed but more challenging questions, which use less of the information available to you and therefore require more evidence in order to control, in a strict statistical sense, the degree of false positivity, but which allow you to make more general claims or expose yourself to more aspects of biology that you didn't know about. When we go into this space where we're Bayesian and frequentist and we're trying to reconcile these things, which I feel are conversations that happened 150 years ago in completely different contexts, I think a lot of these things are really reconcilable, and the key is just discipline about stating what those things are and trying to contextualize what that question means and what evidence goes into it. For every question you pose in a frequentist context, I bet you can write down a prior hypothesis, calculate power, and calculate the Bayes factor that is the appropriate evidence for the model, and it could be that phrasing the question in a frequentist context or phrasing the same question in a Bayesian context is more or less convenient, but as long as we have really clear ideas about the way to translate that information, I think we will probably be okay, as long as we're disciplined in doing it. To follow up on Gonzalo's point, I agree we shouldn't aim to keep findings out of the literature that are not statistically completely convincing. Instead, what we should aim for is to have every genetic finding in a paper include the statement that the genetics alone either does or does not prove pathogenicity of the variant or variants in a study, or of a gene, because what we have right now is a situation where a whole bunch of different kinds of evidence are pulled together and you don't really know which evidence is supposed to be primarily carrying the story. If you actually required the author to say, I am asserting that the genetics alone is proving the case and here is the reason, I think that would really help in the interpretation of the papers, and you can often do that with absolute clarity. As Shamil says, it's straightforward to ask, if you're taking a gene-based approach, whether you have evidence considering the number of genes: you have de novo mutations in X out of Y patients in the same gene, and you need to have the Y as well as the X, which you don't always, and then you can actually put a p-value on that without much difficulty. So I think that if we simply required that statement, the genetics does or does not prove the case by itself, before you get to interesting stories about the biology, that would be really useful. Well, and David, it might be not only does the genetics prove it, but prove to us that it does prove it; lots of people say sure, no problem, but you really want to ask, okay, what's your evidence that it does prove it? Yeah, yeah.
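David's "X de novo hits in the same gene out of Y trios" calculation can be sketched directly. The per-trio probability that a de novo mutation lands in the gene of interest, and the number of genes scanned, are assumptions for illustration; in a real analysis the gene-specific mutation probability would come from a mutation rate model.

```python
# Minimal sketch (illustrative only): p-value for seeing X or more de novo hits
# in one gene among Y trios, corrected for the number of genes scanned.
from scipy.stats import binom

def recurrent_de_novo_pvalue(x_hits, y_trios, p_hit_per_trio, n_genes_scanned=20000):
    """Binomial P(>= x_hits) in this gene, Bonferroni-corrected across genes."""
    p_single = binom.sf(x_hits - 1, y_trios, p_hit_per_trio)
    return min(1.0, p_single * n_genes_scanned)

# Hypothetical example: 4 de novos in the same gene among 1000 trios, with an
# assumed 1-in-10000 per-trio chance of a de novo hitting this gene; the
# corrected p-value comes out around 0.08, suggestive but not definitive.
print(recurrent_de_novo_pvalue(x_hits=4, y_trios=1000, p_hit_per_trio=1e-4))
```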
I mean, I think that's an extremely good point, and there can be a real benefit to the entire field in adopting that kind of approach in how you write your publication. I think those four autism de novo papers are a great example of that, in that any of us can now get access to all of the de novos that were found in those thousand samples, and the next person who sequences a hundred trios might be the first one who, if they get lucky and are the third hit in one of the genes or something, can for the first time have a p-value that lets them confidently say we now believe this is the gene. That's obviously only possible because those earlier projects generously put those data, which are not yet genetically bulletproof, in the public domain, and so I think that's definitely something to be encouraged, because it will much more rapidly advance the ability to make those genetic proof statements. I totally agree with this point about making sure we do get things into the literature, and Joel has made a very similar point as well, without hitting that high bar. But I guess a key point is that we also make sure that as those literature results then go into the databases, that level of confidence is carried along with them, and that's not straightforward. It's not easy to see how that would happen, but I guess in the known variants session we'll be talking to some extent about how you might design a database that actually carries some of that information along. Yeah, and I just wanted to chime in with agreement with David's point. I think we have really backslid, if that's the appropriate past tense, in the last few years as we've gotten into sequencing, because we had moved to a model where, all right, genetic findings are something we're going to lean on, we're going to analyze them very clearly, we're going to prove that they're significant and replicated, and then we're going to move on to the functional studies. And we've slipped into a mode where, well, here's some genetic data, and then we look over here and here's some functional data that says these are good variants, and then we move the curtain back over to the genetic data a little bit, and in a lot of cases, as David points out, it's very difficult to actually follow the thread as to what's the tail and what's the nose, or who's wagging the dog, or whatever, because you look at these papers and you don't actually know what's the most important, most compelling piece of data or evidence that is making me excited about this work or this relationship to disease or what have you. Whatever it is, it needs to be very, very clear, and as Daniel said, it becomes even more important as these things push into clinical databases for interpretation. Good, so it seems like we've had a very good discussion of this. Did we get your last comment? Yeah, okay, I haven't missed anybody. Great. We did pick up some extra time and we're trying to spread that over the day rather than use it all up in one session, so what I might suggest is that we take our break now. It was a 20 minute break, so we come back at 10.30, but we're going to start at 10.30, so be sure, David, that you're here, and we'll get going at 10.30. Thank you very much.