Okay. Here's where I try to convince you guys that we don't have to worry about really rare Mendelian diseases because they're 10,000 times more common than you think. So, what we've been doing is using GTEx as a way of integrating information from the transcriptome to use in understanding the relationship between gene and phenotype. So, if you think about just measured gene expression, some component of measured gene expression (sorry, I was looking, is there a pointer?) is completely genetically determined. Clearly other factors affect gene expression, but there's a component of gene expression that is completely genetically determined. And using GTEx as a reference panel, where we have the transcriptome measured and also the genome variation measured in GTEx subjects, we can create SNP-based predictors of gene expression for each gene in each measured GTEx tissue. Overall gene expression depends on both short-term environment, like what you ate for breakfast, and longer-term environment, like whether you exercise regularly and smoke. And of course, over a lifetime those things combine and you develop diseases, and that feeds back on what we measure as gene expression. So there's a trait-altered component that makes it very difficult to infer causality from a direct comparison of measured gene expression to disease. But if we focus on just the genetically determined part, we essentially have a one-way arrow to our trait. So we basically impute transcript levels for each gene in each tissue and test the association of that endophenotype with disease. And that's a gene-based test that we can do across many genes with high fidelity. So today, with the resource that GTEx is, we can assay more than 18,000 genes and show that with out-of-sample prediction there's a correlation between the genetically predicted and the directly measured transcript levels of at least 0.2 in at least one tissue.
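To make the prediction step above concrete, here is a minimal sketch, not the speaker's actual pipeline. It assumes the per-SNP weights for one gene in one tissue have already been trained on a separate reference panel; the weights, the simulated allele dosages, and the function names `predict_expression` and `pearson_r` are all hypothetical and chosen only for illustration. It computes genetically predicted expression as a weighted sum of allele dosages and then checks the correlation with (simulated) measured expression in individuals who were not used to pick the weights.

```python
import random
from statistics import mean


def predict_expression(dosages, weights):
    """Genetically predicted expression: weighted sum of SNP allele dosages (0, 1 or 2)."""
    return [sum(w * d for w, d in zip(weights, person)) for person in dosages]


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


random.seed(0)

# Hypothetical weights for three SNPs near one gene in one tissue (assumed pre-trained).
weights = [0.5, -0.3, 0.2]

# Simulated new individuals: allele dosages at the same three SNPs.
n_people = 500
dosages = [[random.choice([0, 1, 2]) for _ in weights] for _ in range(n_people)]

predicted = predict_expression(dosages, weights)

# Simulated measured expression: the genetic component plus non-genetic noise
# (standing in for environment and disease feedback).
measured = [g + random.gauss(0.0, 1.0) for g in predicted]

r = pearson_r(predicted, measured)
print(f"correlation between predicted and measured expression: {r:.2f}")
```

In this toy setup the noise term plays the role of everything that is not genetically determined, which is why the correlation stays well below 1 even though the weights are exactly right.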
So many genes are expressed in multiple tissues, and you tend to get the best readout, the best quality predictor, in the tissues where you have the largest numbers of individuals measured for building your predictors. But of course some genes are expressed in a single tissue, and unless you've got a quality predictor built in that tissue we can't really assay that gene. But today we can assay more than 18,000 genes with good predictive ability, and we're applying this broadly in BioVU. So you guys are all familiar with the fact that Vanderbilt developed their electronic health records in the 1990s. The Synthetic Derivative is the de-identified and continuously updated image of the EMR, in about 2.5 million subjects now. BioVU is the biobank associated with that, and we have about 215,000 subjects with DNA. Today there's dense GWAS-level genotyping in about 20,000 and exome chip data in about 36,000. By this time next year we'll have 3 million subjects in the Synthetic Derivative, 225,000 subjects with DNA, more than 120,000 subjects with dense genotyping, and thousands with whole genome or exome sequencing, maybe tens of thousands. There are a lot of applications in progress. And so, if you've ever heard Josh Denny talk about phenome-wide association studies, that's what we're doing, only at the gene level, trying to create this comprehensive gene-by-medical-phenome catalogue. And it really is a giant knockdown experiment, where we're basically looking at the medical phenome associated with the knockdown of each gene in each tissue, and simultaneously an up-regulation experiment, where we're looking at the increased genetically predicted expression of each gene in each tissue and reading out the consequences across the medical phenome. But it's natural variation that we're using.
So rather than manipulating the human organism, we're using natural variation that just leads us to express these bell-shaped curves of not just measured gene expression but genetically predicted gene expression also. And it's a really fun playground. And it makes BioVU a really great in-silico discovery engine. And it really works. If you count up the number of genes that each person has where they are at least three standard deviations from the mean in either direction, that metric is actually informative. So Howard asked before, what's a healthy genome? Well, part of a healthy genome is not being in the tails of any of these predicted expression distributions. It's a bad thing to be in the tails. In fact, the only distribution where you want to be in the tail is this one, because it turns out that the number of genes where you're in the tails is significantly correlated with the number of PheWAS codes that you accumulate across a lifetime. So the more genes you have in the tails of distributions, the more different phenome codes you're going to accumulate across your lifetime. And you never want to be in the tails of these distributions. Bad things happen at both ends. And you probably think you do, say, for intelligence. And you could get lucky there. And your kids could have regression to the mean. But really, do you want your kids to be as much smarter than you as they think they are? I mean, one way or another, you pay for every tail that you're in. So this works also, we can show, with CRISPR-Cas9. In just the first few thousand patients that we looked at, there was a gene where the reduced predicted expression of GRIK5 was associated with many different eye phenotypes. In less time than it took us to do the analyses in the next 15,000 subjects, they had it knocked out in zebrafish, where you could see little cyclops zebrafish.
And most of the zebrafish just had smaller and sometimes misshapen eyes. And if you then took antibodies to the protein product of GRIK5, you could see that the protein is indeed highly expressed in the parts of the eye that generate all of these different eye phenotypes. It's highly expressed in the lens, perhaps why you have the cataract. It's highly expressed in areas that could lead to retinal detachment. It's highly expressed in cells that form the sheath around the optic nerve. So the biology makes sense as you take it out into model systems. But what I want to talk about today is the fact that this is giving us real new insight into the continuum from Mendelian to common disease. And people have already been talking about this. So in the Undiagnosed Diseases Network and the Mendelian sequencing centers, they have already seen this. For example, Eric Boerwinkle and Jim Lupski talk about the fact that some familial neuropathies they solve, but some just have an accumulation of rare variants in multiple of the genes that have already been implicated in familial neuropathy. What I want to focus on here is more the continuum from loss-of-function mutations, to deleterious, to just reduced genetically predicted expression of genes. And what we see across BioVU is that the reduced predicted expression of Mendelian disease genes is associated with all of the phenotypes, the sub-phenotypes, the phenome packet as it were, that make up that Mendelian disease. So here's a transcription factor. Mutations in NFIX are associated actually with two different autosomal dominant diseases. Phenotypes associated with Marshall-Smith syndrome include accelerated bone formation in the hands and feet, fracture, diminished muscle tone, and breathing difficulties. So the larynx and trachea can be floppy. It makes getting the fluids appropriately through the system harder.
There are classic facial features, blue sclera, mental and motor delays; sometimes speech is absent or abnormal; intellectual disability and impairment. Sotos syndrome 2 is also associated with mutations here. So again you see some overgrowth in childhood, curvature and scoliosis, facial features, muscle weakness, but more congenital abnormalities of the kidney, heart, eyes, ears; there can be deafness, benign tumors, low-grade malignancies, seizures, intellectual disability, behavior problems, speech and language disorders. So ADHD, some insistence-on-sameness kinds of phenotypes; stuttering specifically was described, as well as other speech and language disorders. And the kids are living longer now because they watch for the breathing difficulties and try to anticipate and do better, but it was generally the breathing difficulties that were the most serious problem, that would often lead to death in childhood from the associated pneumonias and problems with the breathing. Across GTEx, the gene is highly expressed in some parts of the heart, very highly expressed in the brain, and expressed in muscle, not surprisingly given the weakness that's observed. It turns out to be highly expressed in uterus and cervix. And so what do we see across BioVU? With reduced predicted expression just in blood, you see highly significant associations with sort of pelvic inflammatory problems. So inflammatory disease in a number of ways, a number of different diagnoses of that. In red are some of the classic features associated with one or the other of the syndromes. This is just in blood, but you see there's a number of other phenotypes that are also associated.
So although cardiac congenital anomalies have been described, I didn't see anywhere that congenital anomalies of the esophagus have been described, but they're highly significantly associated, and it wouldn't be surprising in some newly diagnosed kids with this disease to see congenital anomalies of the esophagus. If they haven't already been described, that might make it harder to recognize that this could be one of the autosomal dominant conditions associated with this gene. Some of these other conditions that we see as strongly associated would fall under the heading of outcomes. Now that these kids are living longer, "what will happen to my child?" is one of the questions that gets asked, and it's always easier to watch for things that you know kids will be at risk for as they grow up. And I think pelvic inflammatory disease is a great example of what is likely to be a problem in women who've been diagnosed with this as they get older. In other tissues, we see many of the other phenotypes: facial weakness with high significance, pneumonia due to fungal infections, diseases of the larynx and vocal cords, symptoms of the respiratory system. So there were a number of breathing abnormalities and breathing issues associated. You see symbolic dysfunction and speech and language disorders also associated. Disorders of the tympanic membrane, neural tube defects, kidney anomalies, and kidney disease. This had a range of significance, depending on the tissue, of 10 to the minus 5 to 10 to the minus 7. Also fractures, in the 10 to the minus 5 to 10 to the minus 7 range. Seizures, convulsions, epilepsy, again in the 10 to the minus 5 to 10 to the minus 7 range. So essentially all of the key features of the Mendelian disease are seen in people with just genetically predicted reduced expression. But in addition, we get some insights into other congenital anomalies that might be observed.
I mean, by definition these are rare and have been fully characterized in only small numbers of individuals, right? And we get some potential insights into what can happen to the kids as they grow older. So we are developing a database of Mendelian disease genes and the associated phenotypes that we see in BioVU now. As I say, rare diseases may have been fully characterized phenotypically in just a few patients. And this comes back to something Stefan said yesterday about data-driven models for the range of clinical features that we can expect from Mendelian disease. This may help solve some Mendelian diseases even without the sequencing. And of course, the need for outcomes as patients live longer is something we can also get here. We're also creating a database of Mendelian genes in waiting, because, as I said, even with just a careful look across a couple of tissues, there are hundreds of genes not yet characterized as Mendelian disease genes but where we see multiple congenital anomalies and intellectual disability and other really bad phenotypes associated with altered expression of the gene. This is probably one of the ways to predict, in advance of seeing it, what de novo mutations in some of these mouse embryonic lethal genes would actually look like were someone to be born with one. It's also something where, particularly if you're looking before a patient comes in, you can get in advance some of the phenotypes that you might want to check for to help distinguish among different diseases, because this is a much deeper look at the possible phenotypes that will be seen with mutations of Mendelian disease genes. Just quickly, I want to talk a bit about an autosomal recessive one as a canonical example of another set of Mendelian disease genes. This is a zinc transporter that leads to an autosomal recessive disease, acrodermatitis enteropathica, which is also associated with other phen...
So you have the skin phenotype, also associated with chronic diarrhea, gastritis, serious behavioral problems, anemia. Until the gene was cloned, it was fatal in early childhood. And then the gene was cloned and found to be a zinc transporter. Five days after zinc supplementation, the rash clears, the diarrhea clears, and the behavioral problems are reported to clear as well. And you can see where I'm going with this. The gene's highly expressed in colon, in intestine, highly expressed in kidney, highly expressed in brain. And people with reduced predicted expression have bad things. So it's reported to have anemia, as I said. We see the mineral deficiency, some of the things associated with zinc deficiencies. In just blood, we see the gastritis and others. But across other tissues, we get a bunch of skin conditions that are probably misdiagnosed, because it's probably related to the same skin condition that you see in the Mendelian disease. But we see acute renal failure, kidney disease, chronic kidney failure, kidney transplantation. We see primary pulmonary hypertension. You see hypertrophic cardiomyopathy. So it's associated with other bad things. You see schizophrenia with a 10 to the minus 9 association, and suicidal ideation or attempt. So the behavioral problems that you saw in the kids with the Mendelian disease are diagnosed as schizophrenia, and you see even the suicidal ideation, and cerebral degeneration, which is what they have hypothesized is going on in the brain with the zinc deficiencies. So this is bad stuff. And how pissed off would you be if you knew that this is what you had your whole life, due to just reduced expression of this gene, and you could have taken zinc supplementation and not had it? We don't recognize this as Mendelian disease because they're in adults, and it's just reduced expression of the gene. It's not necessarily apparent at birth. It comes on over time.
It may wax and wane some, but it's a serious thing for these people. So there is this continuum. There are dozens of Mendelian diseases that can be treated reasonably effectively with innocuous therapies, like mineral supplementation or dietary intervention. And there will be more people with highly increased risk of disease (for some of this, it's five-fold, six-fold, eight-fold increased risk of bad things like kidney failure), just for these genes, the ones that have innocuous therapies, than there are people who have Mendelian diseases. And so there's another whole group of people we have to go after and think about, because there is a continuum of Mendelian to common disease. We don't have people with this diagnosis yet in BioVU because it's 1 in 500,000. There are none in BioVU now, but there are 5,000 patients in BioVU today at high risk for the worst subtypes of this condition. 300 patients have multiple of the worst subtypes that could maybe be ameliorated with zinc supplementation. So in the big picture, there's a continuum from mouse embryonic lethal to Mendelian to other genes, where the coefficient of variation on the transcript level is lower here and higher up here. Yeah, nature's basically saying, for Daniel's list of loss-of-function-tolerant genes, it doesn't matter how much or how little you have. The heritability is still high even though the coefficient of variation is low. Heritability is still higher here than down here. And these contribute disproportionately to phenome burden. And the Mendelian diseases fit on the major axes of disease risk just the way the other ones do, raising the possibility that at some point, rather than being gene-specific, we'd work on these axes (wound healing, innate immunity, TGF-beta signaling, apoptosis and growth) and be able to treat that way. So we definitely see this throughout with these predicted expression phenotypes.
And these were results on all genes in about 18,000 after QC from the 20,000. But we'll have much larger sample sizes as we go up and much better granularity to do this. So just then, my colleagues planning the bandwidth, Dan now, and Lisa and Eric did most of the compute, and Anwar keeps the computers running. The zebrafish group, and BioVU, which is such a fun playground. And our GTEx colleagues and the GTEx project, which is just a fantastic resource for doing this. So, happy to take questions.
So because of time we'll take one question, if there's a specific question for Gail, and then we'll go back to the general discussion. Sorry, I'm looking at Gail, sorry. Nancy. We should have called her Dan. Dan. And I don't even have, like, the flight jet lag thing to explain this. So we'll take one question specifically for Nancy and then we'll go into open discussion. Howard.
So Nancy, this is the second time I've heard the talk and I really like it. But as I listen to it, it makes me pause for a moment and say, so what's the spurious nature of those associated phenotypes, do you think, and how do we sort that out? Because it just makes me nervous that with that one gene there are so many phenotypes that are clustered around it. And you've got high p-values, so I see that, but I'm just worried: how much of it is true, how much is false positive, and I guess how many of these have you checked to know that?
Yeah, okay. So the one gene that we've gotten all the way through, zebrafish validated, that was a novel observation of phenome-to-gene association.
And it was one phenotype, though?
No, no, it was multiple eye phenotypes. Yeah, that was why we put it into zebrafish. And that gene is clearly important for normal eye development in zebrafish, and that protein product is expressed in all of the parts of the eye that affect the phenotypes in humans, the eye diseases in humans with which it was associated. The conservative Bonferroni correction, for the number of genes we can interrogate, in the number of tissues where we get quality interrogation, by the number of phenome codes that we look at, would be about 8.3 times 10 to the minus 8. If we use permutation, because a lot of these phenome codes are actually correlated, it's probably more like 7 times 10 to the minus 7. But even using the 8.3 times 10 to the minus 8, there are highly significant associations. The reason there were so many with these genes is that Mendelian disease genes and mouse embryonic lethals accumulate many more, and much more significant, phenome associations than genes do on average. Believe you me, I've got plenty of genes that have no phenome-wide significant associations with any phenome. And others of the genes that are not Mendelian, not mouse embryonic lethal, often have piddly, you know, if one crosses the threshold you could be lucky. So it's not that all genes look like this. Mendelian genes look like this. Mouse embryonic lethal genes look like this. And in particular, the zinc transporter looks to be on this axis of a number of other genes associated with many of these same phenotypes, but it's got a much stronger effect because it's a Mendelian; it's more central to sort of normal health and development than most of my polygenic genes on this same sort of axis. So there's a lot of things we're learning, and one of them is that Mendelian genes are really special. They are central to normal growth and development and health, and messing around with those in either direction is not a good thing for human beings. And every sub-phenotype that makes up a Mendelian disease syndrome phenome packet is a common disease. I mean, many Mendelian diseases are associated with kidney failure. Kidney failure is going to be associated with reduced expression of most of those genes, but with later onset, and not in every single person with the low expression, because it'll depend on the rest of it. But you will see a lot of the same phenotypes associated with reduced expression. It's going to account for much more of the medical phenome than we've ever appreciated.
Okay. So now I have lots of people here. Hang on one second. So I had Wendy and I have Lea, but I want to now move into the general discussion for this session. I thank Melissa and Nancy for really thought-provoking talks. But I want to back up to the people that were in queue, so Jose and then Peter and then Calum and then Wendy and then Lea.
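As a side note on the Bonferroni threshold quoted in the answer above, the arithmetic can be sketched as follows. This is only an illustration: the family-wise error rate of 0.05 is an assumption (the talk does not state it), and the function names are hypothetical. The talk gives the thresholds but not the exact counts of genes, tissues, and phenome codes, so the sketch works backwards from the quoted threshold to the number of tests it would imply under that assumption.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under a Bonferroni correction."""
    return alpha / n_tests


def implied_number_of_tests(alpha, threshold):
    """Number of independent tests that would yield the given Bonferroni threshold."""
    return alpha / threshold


ALPHA = 0.05               # assumed family-wise error rate, not stated in the talk
QUOTED_BONFERRONI = 8.3e-8  # threshold quoted in the answer
QUOTED_PERMUTATION = 7e-7   # permutation-based threshold quoted in the answer

n_tests = implied_number_of_tests(ALPHA, QUOTED_BONFERRONI)
print(f"implied number of tests under Bonferroni: {n_tests:,.0f}")

# The permutation threshold is less strict because correlated phenome codes
# behave like fewer independent tests.
n_effective = implied_number_of_tests(ALPHA, QUOTED_PERMUTATION)
print(f"implied effective number of tests under permutation: {n_effective:,.0f}")
```

Under the assumed 0.05 level, the gap between the two implied counts is one way to read the remark that many phenome codes are correlated.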