We've had good discussion this morning already about how to design and analyze genetic studies and then how to use computational methods to annotate and help prioritize variants. Our group was charged with how to take the next step: how to apply experimental methods so that investigators can test whether their candidate variant might have a biological effect, and what the considerations are for implementing these types of methods. I think we're all clear on the different motivations for doing functional analysis. Example scenarios include when you have a broad GWAS peak and you need to narrow down the potential candidates within that region; in the clinic, trying to understand the consequences of variants of uncertain significance that show up; and then, moving beyond variants, working to understand the function of genes that haven't previously been characterized and developing functional assays for those genes. What we believe is needed here is a set of generalizable, readily accessible, and high-throughput methods and resources that will enable the community to do functional analysis for both coding and regulatory variation. A wide range of experimental methods exist, and they vary both in their ease of execution and in the strength of evidence provided by the method. So how do you select the most appropriate method? What should you consider? One important factor is the type of variant that you're studying. Is it a regulatory variant, a non-coding variant, or a coding variant? Is it a loss-of-function variant, for example? Another important thing to consider is the context within which you will be testing.
It's known that the cell context, the genetic background of the cell in which you're testing, and the developmental stage of the organism or the cell that you're working in may influence the outcome of your test. With that in mind, one of the challenges for doing this type of work is access to the appropriate tools. Do you have samples that have the right genetic background, or that adequately imitate your cell context in vitro, for example? Can you get customized reagents that carry your variant, or the right gene? A lot of the techniques that we're discussing are fairly specialized; do you have access to collaborators who can do them? I think for a number of genomics investigators in particular, who have been more focused on computational and DNA-based research, moving into model systems and direct functional testing is a bit of a scary prospect. In all experimental designs, the obvious considerations are throughput, time, and cost, but these can have particular influence here. You can imagine, for example, that in the clinic you're not going to be able to take the time to create an entire model system for a given variant to help you interpret it. And in the case of GWAS, if you have thousands of different variants that have come out, it's just not pragmatic to test all of them using an expensive and laborious assay. Another thing that we really want to emphasize is the weight of evidence provided by the test. There are lots of different tests; some of them give suggestive evidence in support of your hypothesis, while others give more demonstrative or strong evidence. And finally, this has come up several times today already, but we urge caution when interpreting the outcomes of these tests. This is important for both negative results and positive results.
In the case of a negative result, as we talked about with context, you may not be seeing the effect, but that doesn't necessarily mean that your variant is not contributing to your phenotype. It may be that either your test or the setting in which you are testing is just not conducive to seeing it. Similarly, with a positive result, being able to show a biological effect that results from a variant doesn't necessarily mean that that biology is influencing the disease state that you're studying. So here is a selection of methods that we compiled, covering a variety of different variant classes. This includes animal models, knock-in or knock-down, and genome editing such as TALENs. For loss-of-function alleles, there's knock-down in an animal or in cell culture, or cDNA complementation. For splicing variants, there are a range of splicing assays that can be done in vitro, ex vivo, or in vivo. And, depending on the gene or protein that you're looking at, there are a number of biochemical or cellular assays that might be suited to that particular case. Then, for regulatory variants, it's possible, as with the eQTL discussion earlier, to correlate with gene expression, either in a directed test, with RNA from your patient perhaps, or by looking at reference databases, and it's also possible to test this using reporter constructs. Rather than go through each of these methods individually and in detail, we thought it would be more useful today to show some selected real examples of how this has been applied to genetic research, and some of the advantages and disadvantages. Okay. I'm going to give some perspective on regulatory variants.
The functional analysis of regulatory sequences began in the early 1980s and followed an interesting trajectory, which actually allows us to formulate a proposed framework of levels of evidence that accompany claims about regulation. This trajectory began historically with non-cellular assays, moved to early cellular assays in the 1980s, and then finally completed its march toward trying to encapsulate regulatory function in its native context in the genome in the 1990s and continuing on. What I have done here is arrange levels of evidence that recapitulate this: in vivo evidence from in situ models, meaning that you've got essentially the complete deck of cards, as complete as you can get it, for gene regulation in its local context. I'll give some examples of these. And then another level with evidence from artificial constructs, which of course allow higher throughput. So there are obviously trade-offs between these. But the basic idea is that there is some level of genetic data that feeds in, and then you can use these different techniques to assess whether the particular genetic variant is functional. So I'll give a couple of examples. For level 1a, there are actually very, very few examples out there. But probably the most famous regulatory variant, and also the very first one discovered, relates to a trait called hereditary persistence of fetal hemoglobin. Very simple trait: you normally have globin switching, where your fetal hemoglobin turns off around the time of birth and pretty much goes away, so that in adults, if you stain for fetal hemoglobin in your blood, you basically have none. But there are some individuals walking around who have a bunch of it, in whom this process didn't quite shut off. And we can follow the scientific trajectory of this. It started in 1985.
This is Francis Collins here as the first author, with the discovery, from genetic evidence, of a variant that was segregating in a family with the trait and that landed in the binding site of a transcription factor. There were two papers that appeared based purely on sequence evidence, reporting that this variant was correlated with the trait. Next up is the test, and I draw your attention to the words "is the cause." This was actually a specific test of whether this variant in an animal model could reproduce something close to the phenotype. And finally, this was tested by taking a huge piece of DNA, 200-something kilobases, engineering a single point variant into it, and testing it in a mouse model, which in the normal case completely recapitulated the wild-type human phenotype, and was also able to completely recapitulate the human trait. So that's a 1a. Now, we're not going to go into detail, but there are others that have appeared. This is a paper that appeared last year in Science, and it's an example of a level 1d. I draw your attention again to the word "cause," because there was causal evidence, but at a different level. This was a case where in situ examination of regulatory binding, coupled to other assays, was able to disclose very strong evidence for the cause of a particular phenotype. So I'll turn it over to Len. Okay, so I'm going to spend the next five minutes focusing on some mouse studies. John provided some nice examples of getting as close as you can to formally proving that human variants have an effect on human disease. I think everyone's aware of mouse knockouts; there are probably 15,000 mouse knockout lines that exist as ES cells, and many of them on the hoof. And clearly, if you have a null allele in humans and you can find a specific phenotype that's also in a null mouse, that provides functional support.
So I'm not going to talk about that. What I'm going to talk a little bit about is mouse knock-ins and site-specific integration. There haven't been as many mouse knock-in studies, where you take a human mutation and put it into the mouse. Clearly this is one line of evidence that can strongly support human mutations causing disease. I've cited three examples here of people putting in triplet expansions: there were two papers that put them into the Huntington locus and one into the SCA1 locus. In this particular paper, a short expansion of about 50 CAG repeats was put into the mouse version of the gene, in the position where the human expansion occurs. And they were able to show that there were these nuclear inclusion bodies that formed in neurons of relevance to Huntington's, though the animals didn't get an outward phenotype. A group later put in 150 repeats, so many more repeats, and those animals had many more of the overt phenotypes that you see in Huntington patients. And then finally, there was another triplet expansion that was put into the SCA1 locus, and those animals with the expansion had profound motor coordination defects. In these cases, all of the genetics behind these genes causing these phenotypes is ironclad, so the mouse model is more or less providing a substrate for molecular experiments to be performed. But nonetheless, if you didn't believe these particular studies, this kind of evidence could provide further support for function. There also have been a few papers where they introduced point mutations. This is a paper where a single amino acid change was introduced into presenilin. And while these mice also didn't get Alzheimer's disease, at least that anyone could detect, there were various types of neuronal sensitivities when challenged with various types of substrates.
So again, this isn't formal proof, but it does show that this point mutation has phenotypic consequences in vivo. I wanted to talk a little bit about mouse site-specific integration, not to highlight the work that we've done, but to make the important point that this is a tool that we can use, and I think it's been underused in the past. This is technology that was developed by Oliver Smithies, and I'm going to talk about some work where we used it. When I was a postdoc, we identified the gene ApoA5, and we showed that there were two minor haplotypes associated with much higher triglyceride levels in the human population than the common haplotype. We also showed that if you manipulate the levels of this gene in animals, you can have a profound effect on triglycerides. So there was a link from the mouse studies to the human phenotypes that we saw. But we were left with the question of which variants actually cause this effect. We had very, very strong genetics, we had good mouse data, but we didn't know, in fact, which nucleotides cause these changes. So one of the things that we wanted to do was try to put these haplotypes into the mouse system. I'm not going to go into great detail about how this works, but the technology that was developed allows you to put your favorite gene into the same specific region of the mouse genome, in the same copy number, in the same orientation. So what we were able to do was build three mouse lines that had different haplotypes of this roughly 10 kb ApoA5 gene. In one case, we had the wild-type version. In another case, we had seven nucleotide changes that defined the haplotype linked to human triglycerides. And in the third haplotype, there was a single nucleotide change, a putative signal peptide change, that we also introduced into the same site in the mouse genome.
We wanted to ask: what happens to the levels of RNA from this human gene, and what about the levels of protein that get into plasma? This gene is specifically expressed only in liver, and we know that it's an apolipoprotein that functions in plasma. We showed that if you look at the RNA from these different mouse lines, there were no differences between the three haplotypes. But when you looked at the amount of protein that got into plasma, one of the two haplotypes linked to plasma triglycerides in humans was secreted at a very, very reduced level. And we know that there was only one single nucleotide change in that particular variant. So it provided strong support, though not necessarily definitive, that this nucleotide change is likely the causal mechanism by which the association manifests itself. I wanted to highlight this to give people a sense that there are methodologies where you don't necessarily have to make a knock-in, but can put your favorite sequence into a predefined position in the mouse genome. It's much easier than doing traditional targeting, and then you can use these ES cells to make animals. The final example I wanted to talk about is also some work that we participated in, on the 9p21 interval in coronary artery disease. This was one of what I think many people consider the greatest successes of GWAS studies, where by comparing individuals with coronary artery disease to controls, a region on 9p21 was linked in which a very, very common allele, found at about 50% frequency in the population, confers a modest increased risk of about 30% if you're homozygous for the change. The interesting part was that this association manifests itself independent of plasma lipids, high blood pressure, or any of the known risk factors for coronary artery disease. And it's also interesting in that it falls completely within a non-coding interval.
There are no protein-coding genes in this interval. So we knocked this particular interval out and were able to show that the two neighboring genes, these cyclin-dependent kinase inhibitors, were dramatically reduced by getting rid of this 60 kb interval. To make a long story short, we also showed that cells from heart tissue had increased proliferation due to this mutation, again consistent with the loss of a cyclin-dependent kinase inhibitor. But at the end of the day, when we published this work, we could only say that there was this large interval that had an effect on neighboring gene expression; we really didn't know the molecular nature of that particular event. Following our work, Kelly Frazier had a paper come out that took advantage of ENCODE data from a few years ago. It's hard to see, but this red region is the coronary artery disease susceptibility region, and they were able to identify all these little red blocks, which were candidate regulatory elements based on all the information that existed in ENCODE. What she was able to observe is that in one of these putative regulatory elements there were a large number of variants, and a couple of them affected a STAT1 binding site. So she went on with this information and showed that this particular element does bind STAT1, based on ChIP-seq, and that the binding can mediate differences in the expression of these cyclin-dependent kinase inhibitors. And going and looking in lymphoblastoid cell lines, if you compare risk-haplotype individuals versus non-risk-haplotype individuals, the risk haplotype had a reduced level of STAT1 occupancy. So this formulated the idea, again, that there are these molecular changes in regulatory elements. It's by no means definitive; whether this is the exact mechanism or not is still unclear, but I guess it shows you the power of the different approaches that one could take to try to get at this information.
So I'm gonna shift gears and hand this over to Jay. Okay, so I just have a few slides on how the manner in which we collect experimental data might evolve in the future. These are some of the motivations that I think Wendy had on an earlier slide: going from GWAS peaks to causal variants, dealing with variants of unknown significance, and then, more generally, transitioning from identifying a genetic finding to actually understanding the biology. One thing that's common to all of these is that we probably have more variants that we would like to study experimentally than we can handle, relative to the expertise, capacity, and dollars to actually study them. So one thing that I think is an increasing trend in a number of places is thinking about high-throughput or massively parallel ways of assessing the consequences of variation. You can broadly divide the methods into regulatory versus coding, but I'm only gonna talk about regulatory methods, and then also into observed or naturally occurring variation versus potentially observed variation. So the example here, from some of John's work with ENCODE, is essentially using ENCODE data to functionally assess observed regulatory variation in a high-throughput way, in the sense that you're looking at lots of sites at the same time. The particular examples here involve variation in ChIP-seq signals that correlates with the genotype of the individuals in which the ChIP-seq data was acquired, in a way that makes sense when you look at the nature of the change relative to the motif binding logo that's expected for a given TF.
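As a toy illustration of the kind of correlation being described, assuming entirely invented numbers (the dosage values, read depths, and variable names here are hypothetical, not from the actual ENCODE analysis), the basic test might be sketched like this:

```python
# Toy sketch: does ChIP-seq peak height track genotype at a variant in a
# TF binding site? All data below are invented for illustration.

# Dosage of the non-reference allele (0, 1, or 2) in each individual, and the
# normalized ChIP-seq read depth over the binding-site peak in that individual.
dosage = [0, 0, 0, 1, 1, 1, 1, 2, 2, 2]
peak_height = [52.0, 48.5, 50.1, 36.2, 40.0, 38.7, 35.5, 22.1, 25.4, 24.0]

# Ordinary least-squares slope of signal on dosage; a clearly negative slope
# suggests each copy of the allele reduces binding of the factor.
n = len(dosage)
mean_x = sum(dosage) / n
mean_y = sum(peak_height) / n
cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(dosage, peak_height))
var_x = sum((x - mean_x) ** 2 for x in dosage)
slope = cov_xy / var_x

print(f"slope = {slope:.1f} reads per allele")  # negative: allele lowers signal
```

A real analysis would of course use many individuals, proper normalization, and a significance test, but the logic is just this regression of signal on genotype.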
One of the points here, to distinguish this from, let's say, classic eQTLs, is that here the implicit assumption is that you're not linked to the causal variant; you're actually measuring this precise variant as the most likely cause, actually impacting the peak height. And then another point from the same sort of analysis, which came up earlier but I think is an important point to make again, is that conservation is an imperfect guide to regulatory function. So here, even looking within motifs, you get a pretty good AUC, but it's certainly far from a perfect predictor of regulatory function as measured through experimental assays. Okay, go ahead. So that was observed variation. One of the limitations of those sorts of approaches is that they're constrained to variation that is common, at least currently. Another tack, and we're not the only ones going after approaches like this, but I'm using an example from us because I know it best, is to try to think about how we assess potential variants in massively parallel ways. Again, one could imagine doing this for coding sequences, like a clinically relevant gene of interest such as BRCA1, where there are lots of variants of unknown significance, but the example here will be a regulatory enhancer. The basic idea is that you create a very complex library of all possible mutants, and let's say all possible pairs of mutations, of a sequence of interest like an enhancer. You clone those in parallel into a vector, take this population of molecules and inject it into a mouse, and get RNA back from the liver. Each of these enhancer haplotypes, let's call them, is linked in cis to a barcode downstream of luciferase. So you know which barcode goes with which enhancer haplotype, and you can sequence the barcodes and then use that as a quantitative metric to estimate the relative impact of individual mutations on activity.
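The barcode-counting step being described can be sketched roughly as follows. The barcodes, haplotype names, and read counts are invented, and a real analysis would also normalize the RNA counts against the abundance of each barcode in the input DNA library:

```python
# Minimal sketch of barcode counting in a massively parallel reporter assay.
# Barcodes, haplotype names, and reads are all hypothetical.
from collections import Counter

# Which barcode is linked in cis to which enhancer haplotype
# (known from the library design / pairing step).
barcode_to_haplotype = {
    "ACGT": "wild_type", "TTAG": "wild_type",
    "GGCA": "mutant_hap", "CATC": "mutant_hap",
}

# Barcodes observed in RNA-seq reads recovered from the liver.
rna_reads = ["ACGT", "ACGT", "TTAG", "GGCA", "ACGT", "TTAG", "CATC", "ACGT"]

# Tally reads per haplotype via the barcode lookup.
counts = Counter(barcode_to_haplotype[bc] for bc in rna_reads)

# Relative activity of the mutant haplotype versus wild type.
activity = counts["mutant_hap"] / counts["wild_type"]
print(f"mutant activity relative to wild type: {activity:.2f}")
```

The quantitative readout is just this ratio of barcode counts, computed for every haplotype in the library at once.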
So here's an example with a liver enhancer, where we're basically measuring the relative consequences of all possible single-nucleotide substitutions in the enhancer in a single assay. The other nice thing about this is that it allows you to build distributions of effect sizes. I think David or someone mentioned this: say your mutation has a twofold impact on expression or something like that. So what does that mean? How significant is that? How unusual is it that you had an enhancer, you see a variant, and it causes a twofold effect? Having these sorts of distributions of effect sizes allows one to make a more statistical judgment about how unusual candidate variants are in terms of causing biology. And again, just to reiterate, these methods currently are being increasingly enabled for regulatory variation, but in principle could be applied to coding variation as well. So, key points just to wrap up for the group: experimental data can be very useful, and there are good examples of how it can be useful in a number of different contexts. It's inherently a subjective exercise; no experiment is absolutely perfect, and this is difficult to quantify. A positive result, as Wendy said, is not causation; a negative result is not non-causation, because of context problems. And of course, multiple lines of evidence are always better. And then, in general, there's a need for more high-throughput approaches. So I guess we'll stop there, and the last slide just has some discussion questions that we can try to get answered. Thank you. Thank you very much. So we've had a quartet now. This is a little different. Thank you guys for being creative. Mix it up. Absolutely. No, that's super. So, comments? What we may want to do is talk a bit about some of these discussion questions. That's useful.
So, could I ask a specific question, which is about the trade-off between using human cell lines in experiments, artificial systems, obviously, but at least human systems, versus using mouse or zebrafish models, and how you would consider the trade-offs in those two cases. They're both obviously artificial in some way, but how do we evaluate which is the best approach for which phenotypic question? Yeah, it's a good question. I mean, we talked about eQTLs, or looking at expression correlation with variants, as one correlative way of looking at things. And it depends on what the gene is doing, but you can do biochemical assays or gel shifts and so on if you have primary cells and primary tissues. So we had discussions about that. But this is an area that's so hard. It's not like sequencing, where you're counting beans; there's an infinite amount of biology, and every time we found a case that seemed good, there was also a counterargument. But definitely, in thinking about study designs, it's clear that if you can access the tissues of relevance from patients, it can add weight to the argument you're trying to make that this is a functional relationship. I was just gonna add the comment that I think it depends largely on the model that you have. So, for example, you can now do experiments in primary human immune cells, and for certain phenotypes that may be an ideal model, but in other cases the mouse may provide a really nice physiological model that you can actually follow, which is very, very powerful. So I guess my question was along those lines, on the animal models. For the eQTL question, effects of a certain size might be too common for that to really provide a lot of supporting information beyond the initial association, right?
But you can actually have the same kind of concerns about many animal models as well. And I noticed on one of the slides it said zebrafish evidence is strong, or something like that, but you can look at different traits. I study seizures, and I guess one of the very most common outcomes of a knockout in a mouse, after lethality, is seizures. So if I pick a gene at random and knock it out, there's a very reasonable chance that it's gonna have seizures. Exactly how much supporting evidence does that provide me? And if you think about manipulations like morpholino experiments in zebrafish, one of the most common toxicities is microcephaly. So if I'm studying a microcephaly gene and I find out that you can change the size of the fish's brain, how much does that tell you? So is it possible to provide some general guidance about how to think about which traits can really be informatively modeled, in terms of asking whether you've got a causal mutation or not in animal models? Did you guys get into that at all? I think that many of these choices have actually been made by communities out there. For example, let's say, looking at genes that affect development of the heart. Congenital anomalies of the heart in humans are extremely common, and the cardiology community has really focused on and made great use of zebrafish models. In fact, we have a paper with some collaborators coming out in Cell where there was a new gene identified that was previously not known to be involved in heart development. It was highly implicated by looking at ES cells and differentiation to cardiomyocytes, and they went right into zebrafish, knocked that thing out, and sure enough, it changed how the heart develops. So I think that, and there are other models out there, but some communities don't have very good models.
And obviously, neurological phenotypes are probably the most difficult thing to model, along with immunological phenotypes. I think one of the other points that we wanna drive home, because there's so much question about what the right test is, whether for a particular trait you're gonna get more information from an animal model or you're just gonna get a red herring, is that multiple lines of evidence is really gonna be one of the most powerful approaches. Both to help you, you know, you can start with some of the less expensive approaches to narrow down your hypotheses, and then move on to really understand causation. There is one thing, though, that should be considered, and that is that there is a hierarchy of logic that can be applied. In the case, for example, of regulatory variants, if a variant isn't doing something in one kind of assay, the likelihood that it does something in a higher-level assay goes down dramatically. So one can actually envision, for example, and Nancy raised this really interesting point earlier, which was: let's say you knew where all the potentially functional spots are in the human genome. We'd still have some issues to address, but there's a lot of momentum now toward creating that baseline map, and I think that's something that can feed into these higher-level choices; it definitely narrows your prior probabilities, for sure. A concern about creating that baseline is that everything is doing something somewhere at some time in human development, and I think we really do face the prospect that that's where we'll end up heading. And so then it becomes less informative. I mean, the issue is that you have another layer that you can put on that, which makes things much, much more useful. And that is that we're not looking at the genome as a simple one-dimensional entity where this thing is functional and that's functional.
You're already assaying that function in a cellular context. So we know that this base is functional in Th17 T cells, for example, and that suddenly changes very significantly how you may use that information. I'm sort of curious: for very small families with rare diseases, it seems like this is the gold standard. We talked this morning about the fact that it's difficult to make a genetic argument when you don't have enough numbers. So who decides what's sufficient when this is the primary method of proving that something is causative for a disease, or pathogenic? Should that be journal editors? Should that be the field in general? Should that be specialists? I was just wondering if you had any thoughts about that. I mean, I think it's not a black-and-white answer. I think that the more specific the phenotype and the rarer it is, the more power you can put behind it. But it's never gonna be black and white. You can always get burned, and people have been burned in many publications where they have a genetic finding and then follow it up with something that's either in human cells or in a mouse. It all fits together, and then the next thing you know, someone comes in and does a follow-up study, and the initial genetic study was weak; the functional data was consistent with it but wasn't causative. It's just the nature of how much confidence you can have when you publish a paper. And it is hard with one family and one functional assay. I think you have to always be cautious. But I don't think there's a black-and-white answer, because again, there's so much biology, and each system's gonna have its own caveats. I think there is a problem out there, though, in terms of claims of causality. So I highlighted, in a couple of those articles I put up, just the use of the word "cause."
In those cases it was more justified, but it caused me to actually do a little searching in the literature, which turned up a shocking number of papers claiming causation in their titles, and they were all over the map in terms of the type of evidence they were providing. So some standardization of that, which can certainly be enforced at the level of journals, would be very useful. I think there, I'm sorry. I think there is increasing pressure, whenever you publish a novel genetic finding, to have functional data to support it, but as we highlighted here, I think it would be very helpful to have some standards and guidelines for what that is. Now, it's not gonna be the same for every gene or every disease, but some guidelines for what's an appropriate level of evidence. Because I tend to think that you can show a biological effect for almost anything if you try enough tests under enough conditions. I just wanted to pick up on a comment that Len made, this idea that you publish, say, a genetic finding that doesn't quite meet some burden of causality, but then some other group goes ahead and does a functional follow-up which adds to that, and the combined evidence suggests that you have found a causal variant. It's just made me think: do we need some kind of framework, or form of words? Because you have this balance between wanting to put that not-quite-definitive genetic finding out there so people can follow it up, and wanting to avoid some of the things we talked about before, putting things out that are so speculative that postdocs waste their entire lives following up a finding that essentially is false. How do you find that balance? If you say, oh, we're only going to report things that are completely and totally known to be causal, then that's a set that's basically empty.
So you want to keep putting information out there, but you don't want to run the risk that people coming at it from a different angle, who maybe can't interpret that conditional statement as critically, end up pursuing it when it turns out to not really be true. And I think there are some mixed messages in the group as a whole about how we trade off functional information and statistical support. I mean, we've heard that functional data certainly shouldn't be used as a substitute for compelling statistical support. And we've heard, and this is clearly true, that weak statistical evidence and weak functional evidence don't combine together magically to create a strong case. But at the same time, it is clear that functional data does provide some additional evidence that will occasionally push things over the line. I was just wondering if we could maybe try to clarify that point a little bit more: how exactly are we thinking about combining these forms of data? So, I mean, just along the lines of the comments that were made already: the specificity of the functional assay, which is often not really tested or provided when people provide functional data. In other words, if you were to pick the 10 genes that are almost equally likely to be candidates, or some set of 10 random genes, or maybe 10 genes that are expressed in the right organ system or something like that, how many of them, if you tried equally vigorously on those (and of course people have to really try equally vigorously), would show a similar functional result? So you can in some sense almost get a p-value. Not for the assay itself, like the luciferase was up with a p less than 10 to the minus 6 because you did the assay 100 times, but a p based on picking genes at random out of the genome: how often would you expect to get that result?
And that's the sort of data that I don't think we have for most functional assays, because of course it's hard enough to do the one gene that you're interested in, let alone to then say, okay, we'll do it 99 more times so we can get an empirical p-value. But in some sense that's kind of what you need. Well, I mean, David made a very similar point, and the thing is, for some assays, in the case of knockout mice for instance, we do actually have some empirical data: there are hundreds of knockout mice that have been generated, and we do know the frequency of seeing epilepsy in those cases. There's ascertainment bias there, I guess, but even so, you could actually come up with some estimate based on those. But is that a recommendation that we'd like to make: to consider the outcome of a functional assay in the context of the world of experiments of that type that have been done on other random genes throughout the genome? Either admit that you don't know anything about it, or say that there's some data, but just be explicit about it. Absolutely, yeah. Yes, sort of. I'm actually going to take the converse point of view. You know, I'm a human geneticist, and I firmly believe that mice are not little people: they look kind of different and they certainly have very different biochemistry. There are several very well-described human disorders where you knock the gene out in the mouse and the mouse has no phenotype, or occasionally has a different phenotype, because mice have different biochemical pathways, they have a different immune system, they have a different gut, they're exposed to different bacteria, et cetera, et cetera. And so I think one of the converse conversations we need to have is: do we need proof in an animal system or a non-human system? It seems that to get into Nature or Science you need to have this kind of level of proof for a new gene, but if you don't, you kind of go down a couple of notches in the journal.
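The empirical p-value idea raised a moment ago (run the same assay on random genes and ask how often they score as well as your candidate) can be sketched in a few lines. This is a purely illustrative sketch, not anyone's published method: the assay function, the gene names, and the score distribution are invented placeholders standing in for a real functional readout such as a luciferase fold-change; only the resampling logic is the point.

```python
import random

random.seed(0)

def run_assay(gene):
    """Placeholder for a functional assay score (e.g. luciferase fold-change).
    Simulated here: random genes cluster around a fold-change of 1.0 ("no
    effect") with some spread. All numbers are illustrative, not real data."""
    return random.gauss(1.0, 0.3)

def empirical_p(candidate_score, n_random=99):
    """Fraction of randomly chosen genes whose assay score is at least as
    extreme as the candidate's: the gene-level empirical p-value.
    The +1 in numerator and denominator counts the candidate itself as one
    draw, so the smallest attainable p with 99 random genes is 1/100."""
    random_scores = [run_assay(f"random_gene_{i}") for i in range(n_random)]
    n_as_extreme = sum(s >= candidate_score for s in random_scores)
    return (n_as_extreme + 1) / (n_random + 1)

# Suppose our candidate variant doubled reporter expression:
candidate_score = 2.0
p = empirical_p(candidate_score)
print(f"empirical p = {p:.2f}")
```

With 99 random genes the smallest attainable p is 0.01, which is exactly why the speaker frames it as doing the assay "99 more times": that is the minimum world of comparison experiments needed for a conventional two-decimal empirical p-value.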
And I think there isn't the data out there to show how often a phenotype that really causes human disease is not recapitulated in the animal model, and that kind of negative data is sometimes almost as important, because I can think of several postdocs who have wasted their careers trying to get a mouse model to really look like the human when it doesn't. And so I think we have to think of the converse with all of the animal models we're talking about. There are legitimate biological reasons why they may not recapitulate the human phenotype, yet mutations in that gene may cause the bona fide human disease. And so I think we have to think about that negative side as well. So I had forgotten: Nancy, and then I have Suzanne. So Suzanne, you wanna speak to that? Okay. What I wanted to say, about publishing results that are not conclusive: I think we're also working kind of against human nature here, because, and I really don't have an answer for this, but I think we have to take it into consideration, people tend to oversell results. They like to ignore things that don't support their results, and they can do this knowing they're doing it or not knowing they're doing it. And I don't really see a way to get around this problem. I mean, one way would be almost just reporting against kind of a checklist. I think people feel that they've worked hard on something, they have a lot at stake, and they would rather publish it in a higher-impact journal than a lower-impact journal. And that's just human nature, but we really have to get around this problem of people not presenting everything and just presenting subsets of what they saw. The point I was gonna make, and David can appreciate this, is that seizures are a fairly obvious phenotype.
In the knockouts for many genes, seizures are observed, but of course many of the phenotypes that we are interested in studying are simply not measured when people are looking at their knockout model for this or that. And so when people have now looked at a number of genes implicated for type 2 diabetes from GWAS, knockouts and conditional knockouts, you're often seeing sort of super-insulin-sensitive animals, so they seem to be resistant to the development of diabetes even on high-fat diets. But they're a little bit smaller, they're a little bit leaner, and that's true for many kinds of knockouts. And so it may not be sufficient to just compare them to their wild-type littermates. We may need to look more carefully at, well, what do glucose traits look like in knockouts for epilepsy genes? We don't usually accumulate that kind of information, and so it's much harder to interpret the specificity of the apparent effects that we're seeing. It works in both directions, I'm afraid, or sometimes fails to work well in both directions. I think these are excellent points, but there are two different levels of the conversation here. There's the organismal level, which is largely geared towards the evaluation and testing of particular individual genes in some system. And then there's the low-level question of: is this variant actually doing something? Because before you get to the organismal level, can you actually see, using some more low-level functional assay, that the variant is doing something? In the case of regulatory variants, we have that assay. I mean, regulation all drains down effectively to proteins binding, and if you don't see that effect in the right spots, the likelihood that you're gonna go farther... it doesn't make sense. Now, what it also raises, though, is a different kind of question, which I suppose applies to the organismal level as well.
And that is that genes are not functioning in isolation; they're functioning in systems. And what if it's really the combination, this variant in X genetic background, that gives you the effect? That's something that's very, very hard to address right now, certainly. Along those lines: many, many assays that are being used nowadays really pay too little attention to things like kinetics. Regulatory assays involve affinities, protein-binding assays involve affinities, and you can get something to show a phenotype if you have the concentration of the reactants way off physiological. That doesn't mean that in the physiological circumstance it's actually gonna do anything. That's well shown in classic enzymology, for example. And I think it's probably more of an issue for complex traits, where the variants are likely to be more subtle. So those are all user-beware caveats. So I just wanted to make a very quick comment toward what David Demick said: most of the papers which I handle at Nature, and that are being published, don't have any mouse knockout associated with them. In fact, it's extremely rare that the genetics papers I see do that. So that's not necessarily required. But I wanted to come back to something that Daniel said, this kind of stereotypical example where you have weak statistical evidence and weak functional follow-up. Now, of course, that's a kind of cartoon example of a paper. But what's more interesting to consider is: what if you have weak statistical evidence and ostensibly strong functional data, or conversely strong statistical evidence and then weak functional follow-up? How do we feel about those? Are either of those useful? Okay, you obviously think they're not, but maybe. So I was just gonna make a comment. I think one of the goals today is to come up with some guidelines or something constructive for the community.
Genetic evidence at least has the advantage that you can put a number on it. Right. I think no matter what, with experimental evidence it's always gonna be a subjective decision, with lots of domain-specific knowledge, entirely dependent on reviewers to judge as strong, weak, or middling evidence, right? But I can imagine something that tried to connect combinations of experimental and statistical evidence, or genetic evidence and experimental evidence, to the use of certain terminology, right? So, you know, John's complaining about the overuse of 'causal', right? And there's other words like 'implicated' and 'associated', and 'weakly implicated', 'strongly implicated'. And I'm not sure what the appropriate terminology is, but it does seem rational to come to some sort of decision about what it means to say something is strongly implicated, whether strongly implicated could be weak genetic evidence plus overwhelmingly great functional evidence, or other combinations. I don't know if that answered your question. No, I think this is something that was left untouched in our presentation, but it was implicit in many of the examples that I presented: there was an input of very, very strong genetic data, in the sense that you had a trait that was segregating, and moreover, it wasn't just pure genetic data in one of those cases. So, for example, that second example I showed, oh, it's way back there now. I'm sorry? All right. This one. So this one had no knockouts, no mice, but what it did have was a molecular phenotype that you could measure in people.
So people were the model organism here: you could measure and see that different individuals in a family were producing these different transcripts, and then, using the in situ methods of finding protein binding, et cetera, one could resolve a compelling molecular mechanism to explain the thing that you're measuring. And so I think this is part of the combination that Magdalena was talking about: very strong genetic data, or phenotypes that are measured precisely in humans, can definitely lower the level of evidence that you need on the other side. So, the way I think about the answer to Magdalena's question is that if there's clarity in the paper about what the authors think is carrying the story, then you know exactly what it is that you have to have them explain. For example, it might be the case that you have very, very weak genetic evidence, you got interested in this variant because you saw it in one person, but you knocked it into a zebrafish and found that they didn't make T cells, and you were studying a primary immunodeficiency. Then the authors could say, it is this functional assessment that is making the case, and you can believe it because we know the sensitivity and specificity of this particular assay. We know, for example (I'm just sort of making this up, but it's probably not far from right), that that's not a really common outcome of knock-ins in a zebrafish. That's not, for example, a really well-known toxicity, and so on. That's a very specific outcome. And it's also the case that when you study mutations that are responsible for primary immunodeficiencies involving T-cell defects, when you study those in zebrafish, they have problems with their T cells. And so you can actually say, in this case, the functional assessment is very, very convincing for those reasons, and then you accept it.
And if you tried to make a story like that out of whatever the heck you were doing to a variant that was associated with schizophrenia, you wouldn't be able to. So at least you have required the authors to explain the basis for a relatively strong conclusion. Yeah, I think that's exactly right, and in some sense there are really two separate parts of a broader story. It goes back to Greg's nice characterization: the genetic evidence is in some sense about 'is this mutation pathogenic', and the functional evidence is about 'is this mutation damaging'. And the full picture, the beautiful picture, is that it's damaging and pathogenic, and maybe deleterious too, depending on your phenotype. But you can have a true and worthwhile piece of work that proves one part of that or the other and can only imply the second piece. Basically, exactly echoing what David said, it's a question of how much of the story the authors are claiming they've proven. And I think if one part of it is weak, then you can't let them use the part that is strong to steamroll through and say, 'and therefore it must be the disease-causing thing too', because that bit is still open to debate. Can we put up 26? I thought it might be useful just to remind everyone. Mike, can you put up number 26? I thought this might have come up by now, and it may be about to come up, so if I steal anyone's thunder, I'm sorry. But in microbiology there's a very well-recognised series of postulates from a guy called Koch. And I can read them if no one else will. But David Rellman wrote about them more recently, and they also got incorporated into epidemiology. So I think epidemiologists, and I guess microbiologists and others, have thought about what defines causality for quite some time.
And it's relevant, I think, to a number of different things that we see, because they've addressed as a community what makes a causal association. In that case it was a causal association between a bacterium and a disease, but I think it's relevant for our variant-versus-disease question. It may be familiar to everyone in the room, but this was... Yeah, I can do that. There we go. So these are standardly put together by epidemiologists whenever there's a suggestion that there may be a causal relationship, in order to define it. And a number of these obviously reflect things we've been talking about: plausibility, experimentation, analogy. But dose-response is also relevant, from a genetic-dose perspective. So this may be a useful framework. I mean, we've talked about many of the things here, but this may be a useful framework for us to start to think about genetic causality across this experimental-to-clinical spectrum. It's funny that it looks like that. The reference is originally from 1965. Yeah, yes. So these are the Bradford Hill criteria. They're classic in epidemiology, and they were actually used for demonstrating a causal association of smoking with lung cancer and death, because you can't do that experiment. But if you read through them and think about applying them to candidate gene studies in the 1990s: people obviously just didn't pay attention to the fact that there was a well-established, logical way of thinking through this. They just decided they didn't want to do it; it wasn't convenient. So I think in many regards we have addressed these, even though we don't always say that we do. So, strength of association: I think people look at odds ratios, and that's what Hill meant by that. If you have a five-fold effect, it's much stronger than a one-and-a-half-fold. And for dose-response, we have been looking at the number of alleles, or the number of variants that combine together, as it were.
Temporality has always been difficult in genetics, because you have the genotype from conception. So it would be very interesting to have some thoughts about how one looks at temporality. Obviously development is one way of considering it, but there may be other ways as well. So I think we are sort of getting at this, but there are other aspects of genetics that may not fit quite so well into that framework; it would be useful to know what those might be. Actually, there's a nuance: for strength of association, if you do a small study and you have a false-positive finding, it's quite likely to have a very large odds ratio. There's a very strong inverse correlation between reported odds ratio and sample size in these studies. No question. And it's unfortunate that they're listed in that order, because that's probably not appropriate; that's sort of the order of strikingness to people. They go, oh my goodness, it's 20-fold, it must be real. But you're absolutely right, and so one could weight these a little bit differently. So I would argue against a direct analogy with epidemiology, because I think our problem is much simpler in a sense. In epidemiology, if you have A and B and they correlate, it may be that A causes B, or B causes A, or there is a C which causes both A and B. In genetics, if you have genotype and phenotype, then unless your phenotype is willingness to work in a high-radiation environment or something like that, we believe that causality flows from genotype to phenotype. So a lot of the problems addressed by epidemiologists are not really relevant to our case, and I think we're in a much simpler environment here. But the framework of thinking about these multimodal approaches to causal evidence, needing to meet several criteria from several different boxes, maybe two from five or four from seven: I think that model is one that could be useful to us.
At least as it pertains to experimentation. Okay, David. So I'm gonna come back to the issue of experimentation. I think there are a bunch of laws and rules that kind of prohibit mutagenesis of humans; it's kind of discouraged. And so I think most of the time, when we're looking at animal experimentation, we're actually talking about plausibility. And I think we have to understand the difference between proving something happens in an animal versus proving it happens in a human. So I think animal experiments have huge value in plausibility arguments, but I don't think they are the same as true experimentation, controlled manipulation, in a human. I wonder, we have a couple of minutes, and Wendy had a wonderful slide at the beginning of her talk. Mike, maybe you could switch back to that; I love matrices. You had this nice matrix of whether something was strong, suggestive, or whatever: what particular experimental evidence was relevant to what particular kind of variant. Do you remember that? This one. So would we have a moment maybe to talk a bit about it? You had identified several types of evidence that were strong here and several that you feel are suggestive. Does everyone agree that the knock-ins, or the genome editing, et cetera, are very strong? Or should there be more than two classes of evidence? That was another thing we struggled with. Also, I don't know about small molecules; it's early days in small-molecule work, but where do the small molecules fit up here? It's potentially a cellular assay, but it's definitely a specific kind, it's true. I think you do have to take into account what David was saying earlier, when you start with a mouse or zebrafish knock-in, about the specificity of the phenotype. I would not agree categorically that the first kind of data are strong.
I think there are nuances to that, important nuances. I've seen lots of it; the term we use for the zebrafish is 'brain rot', and we see it all the time. When you're studying CNS phenotypes, if you see a zebrafish knock-down with brain rot, it really doesn't mean much of anything. But if they have fins coming out of the wrong part of the body, well, you never see that, and it's like, wow, that's significant. So it can be extremely strong, and it can be damn near meaningless too. So do you have a suggested term? We might use 'variable' maybe, or 'situational'. I think it'd be better to have just 'strong' with an asterisk, right? Because at some level you have to make distinctions: there is a difference between correlation of expression and mouse models, right? That's a real difference. And by the same token, I can think of some of the things where we have a suggestive, like biochemical or cellular, assay for a metabolic defect where it's actually really strong evidence, right? So I think there will always be exceptions to every category, and we should note that. So you also have a hierarchy. Yeah, I think this gets to an obvious point: we have assays that act on molecules, on cells, on tissues, and on organisms. And we're in this tight spot where, for the things that act at the organism level, if you can really reliably connect them to DNA, it's kind of a miracle, right? There's a lot of ifs, ands, or buts on the way up that hierarchy, and a lot of potential noise, and the reason it's a miracle is that the signal has continued despite all of those assaults on it. And so I think one way to talk about this is that sometimes we're lucky, in that, in some of the cases we heard earlier, it's an enzymatic thing where the molecular assay is extremely close to the organismal response; it almost jumps right up to the organism.
Whereas other things like schizophrenia and diabetes and depression are incredibly integrative, and therefore the molecular, cellular, even tissue level may not be as compelling. So I would advocate thinking about these kinds of assays with that kind of rubric, because if you have an integrative phenotype that you see in the mouse, that might be great, but it might not be great in the context of other phenotypes of interest. And so that's where this asterisk that Jay was suggesting comes in. For me, it's all about the asterisk, and that's what I was trying to say. Would that be solved by just saying 'in the context of a well-defined phenotype'? No, no, because it's not just whether the phenotype's well defined, it's the specificity question. So we had Heidi and Joel, but Dave, if you're gonna speak directly to that point. Yes. So I think we should be really careful about not saying 'strong' there, even with an asterisk. And the reason is, if we think a little bit about how people act when trying to publish papers: if we make a statement that a mouse or zebrafish knock-in is strong, even if we asterisk it, then people are gonna refer to that when they're trying to fight for their claim in their papers, exactly like we've already seen for the 10 to the negative 8 p-value threshold for GWAS. A lot of people, I think, have seen, like I've seen in review, that authors will say, no, it's a real finding, even if it's an association with political party affiliation or whatever, because the p-value is 10 to the negative 8. And look, in McCarthy et al., if you have 10 to the negative 8, it's real, okay? That's what it says in the paper. And then you say, well, look, but there have been thousands of GWAS done now, so in fact, even if we hold every study to 10 to the negative 8, you're gonna get some false positives. No, the authors will still refer back to that and say, that is what's said in the community.
And so if it's better than that, then it's significant. And so stuff actually gets through that way. So I think we have to be really careful not to say, look, if it's in a zebrafish, it's good evidence, and leave it at that with a few caveats. People are gonna refer to it; they're gonna say, I've done something to the zebrafish and therefore it's real. So I think we should be really careful about not doing that. Good point. So we ought to wind up; we had Heidi and then Joel. So I was just wondering if there's any utility in putting in another column that refers to what kind of conclusion you would make if you got a negative result with any of these approaches. Speaking to the question of: can you exclude a role because it didn't show a phenotype? Some of this speaks to the comments earlier that we never see the negative data; it never gets published. So when can you actually make inferences from that data, and when is it not safe to do so? I don't know if that's useful or not. Yeah, I think that's a very interesting point, and we'll have to deal with it. I mean, splicing is sort of one where I might conclude a lot if I didn't see any aberrant splicing in an assay, but for many of the others, for all the context reasons we discussed, you can always make a story for why a negative result doesn't count against you. I would argue that actually you would not reject a clear genetic finding for a failure to find function with any of these assays. You could always speculate that it acts in a different context. Also, just for interest: Mark yesterday had this binding example, right? Varying levels of genetic evidence were published at many different times by many different people. It turns out there's also a whole literature on the corresponding knockout mouse, saying that if you knock out this gene in the mouse, you get schizophrenia-like phenotypes.
Obviously, we could argue about whether a schizophrenia-like phenotype in the mouse is convincing or not, but you would have both sides: you'd have some genetic evidence, and you would have the mouse model claiming to support it. So I think Daniel wanted to make a comment. Do you want to let Joel go first? You yield the floor to the gentleman from... yeah. Actually, I think it's been pretty well covered by Gonzalez. Excellent, excellent, okay. Great, good discussion.