Hello everyone. I'm Rob Williams from the University of Tennessee Health Science Center in Memphis. I'm delighted to be here. I represent the complex trait community, and I've been assigned the task of discussing modifiers. I'll be doing that in the particular context of experimental precision medicine, which is our pivot of what used to be called complex trait analysis or QTL mapping. We're trying to make this a little more relevant to current efforts in precision health care and human clinical cohorts, and I think you'll see that we now have the resources that make this much more doable. The other theme of my talk is the fact that we actually have two animal model communities. We have those of you who use mainly reverse genetic methods and those of us who use mainly forward genetic methods. That dichotomy was valid perhaps 10 years ago, but I think it's really dissolving right now, so that anybody doing forward genetics ultimately uses KOMP resources or CRISPR-Cas9 to confirm the mechanism and confirm the allele. And I think it's also true that those of you who are looking at knockout effects are more and more acutely aware of the role of genetic background. So those worlds are coming together. I think they need to come together because, as is probably obvious to everybody in the room, our community is under a lot of stress, in large part because of the wonderful success of genome-wide association studies. It's been marvelous, but it has made some of what we do a little less germane than it might have been 10 or 15 years ago, when it was not clear that GWAS was going to be so incredibly effective at, admittedly, getting small-effect genes out. But still, it's big progress. This is the sort of theme. I can remember Eric Lander gave this talk in Memphis in 1994 with these slides from the American Express ad campaign. It's still very germane.
The joke in this slide is that there's only a 1.44-fold difference between Wilt and Willie. This is the kind of pervasive genetic variation that drives most clinical health care concerns. I have added my version of Wilt and Willie there, the two little mice on the bottom. Those are two members of a family that differ at about six million sequence variants, so I'd argue they're about as genetically variable as Wilt and Willie are. I have a version of this where somebody sketched my face in on that. But this is something that hasn't been broached today, and that is precision health care. We touched on it a little in the context of genomic medicine and where that's going. It's somewhat pretentiously called precision medicine. It probably should be called probabilistic health care. We don't really have a good idea yet of how accurate our probabilities will be, but I think we'll get there, and what I want to propose to you today is some solutions that will give us those ROC curves that we really need to understand prediction for this very politically incorrect group of human beings. I want you to notice there's only one old white guy on this. It's terrible. So the problem with humans, obviously, is that they're Ns of one, and that makes prediction extremely difficult. So the question is how we can bootstrap from the resources we have for rodent models to actually enable that kind of bodacious prediction from an N of 1. I think these are a couple of studies that are leading the way in human genetics. Mike Snyder has been torturing himself by generating his own phenome.
He is probably the most deeply phenotyped individual human right now, from the 2012 paper in Cell. The group at Vanderbilt has been doing phenome-wide association studies using the Vanderbilt BioVU database. We're obviously now doing this with Geisinger and many other databases. UK Biobank can now be flipped on its head and used for phenome-wide association studies. This is where Mike is after 10 years: we're now up to a total of about 109 individuals who have been repeatedly profiled over three years for prediabetic risk using serial transcriptomes, biochemistry, etc. It's very common for all of us in the room to complain about the enormous amount of data we have for phenotypes and genomes. We actually have pathetically little phenome data. This is it. This is the best paper I can find on deep human phenotyping, and it's frankly just two or three years' worth for 109 individuals. So when it comes to building these sophisticated models that we'd like to strap together with AI systems, we're not doing very well, frankly, and we have to do a lot better. There are hard-core limitations to what you can do with N-of-1 human beings. Fortunately for us, there aren't with mice, rats, C. elegans, Drosophila, and yeast, so we can prove out this longitudinal big-data approach to precision medicine using animal models of various sorts. And that's what we've been doing. I don't think I have to belabor GWAS. Everybody has their own opinion. I myself think it's incredibly valuable. I think it has maybe not been as actionable as we want, but that's because we're at the very start here. We're about 10 years into GWAS. So the real question is how we build animal models that reflect the complexity of humans so that we can test-drive predictive models. And again, this gets pivoted every once in a while to make sure that the community, frankly, gets adequate support from NIH, NSF, European agencies, etc.
At one point it was, and it still is, known as systems genetics: basically many-to-many-to-many. We have many gene variants, many interactions among gene variants, and many environmental and social factors. It's just the realization that life is complex and we can't dodge that. Clinicians can't dodge the bullet. Those of us studying animal models can often dodge that bullet. We'd like to get into the phenotypes deeply. We'd like to get the endophenotypes, and again, that's going to be difficult with humans. I want to introduce, or really reintroduce, the idea of replicable isogenic panels. I know Gary Churchill was here last year talking about the Diversity Outbred. I am not a great proponent of that except as an expedient today. The reason is that it's not a replicable resource. Every one of those DO animals, like every one of us, is genetically unique, so we can't accumulate the vast phenome like you guys have done in KOMP for one strain of mouse. That's an incredible resource to have: it's all basically integrable data because you've used a common genome. So the replicable part is really critical, and the isogenic part is really critical. One of the problems is that these panels have always been inbred, and I'll show you a solution to that. They needn't be inbred. There's a great solution, which is an old quantitative genetic cross type called the diallel cross. It has nothing to do with alleles, by the way. Think parallel and diallel: it's just the opposite of parallel, so that's why it's missing that E. We now have wonderful resources to do an absolutely massive virtual diallel cross. You'll see more; I'll show you. But again, the motherhood-and-apple-pie integration that everybody and their mother has been talking about since about 2000: how do you actually deliver that? What do those little arrows on that slide represent to me?
They represent mutual information, correlation, and that requires a sample size that's reasonably large, preferably fifty, a hundred, a thousand, ten thousand. The beauty of replicable isogenic populations is that you can study G by E, G by G, G by E by drug by developmental stage, because again every individual, or I should say every genome type, is replicable. And I'll show you some tools that we have. So I think these are the key substrates that we need for experimental precision medicine over the next 10 to 200 years, whatever it takes. These are the cute little devils that I've been using. I inherited them from Ben Taylor at the Jackson Laboratory when he retired in about 2000. When he retired there were about 35 of them. We now have 150 BXD strains. The isogenicity is kind of obvious to this community. We have a hundred and fifty strains that are derived from the mother that you use, C57BL/6J; the father is dilute brown non-agouti, DBA/2J. And this is a family that's segregating for about as much variation as probably almost all of European ancestry. So again, there are six million variants banging around in this population with minor allele frequencies very close to 0.5, so it's very well powered and reasonably precise for forward genetics studies. But the key thing is that we can build a massive phenome of the type we need to actually do prediction over the next 20 years. Ever since Zerhouni started nagging the community about being translationally relevant, we've tried our damnedest to be translationally relevant by not obsessing over the mouse allele. So if we get a candidate gene that we think is relevant, we hop over to human GWAS as quickly as we can to see whether we can confirm, refute, or refine the locus using GWAS. So the first one, over there on the far left side, is a study of blood pressure control in mice, then an analysis of a Finnish military cohort, just to show which gene is likely to be the candidate in there. So we're
actually using humans to fine-map our mouse loci. We collaborated with Josh Denny to do the first joint mouse-to-human phenome-wide association study, again to bring those two worlds together. In this case, the study on the bottom is one in which we are now actually getting to clinical care in humans with glaucoma, with a candidate gene that was initially mapped, and this all happened, well, you can see the date there, 2017, and this is now in early-stage clinical trials for glaucoma. So, the simple family pedigree: mom and dad. You know mom; dad is DBA/2J. We have the same resources that we're building for rat, because we don't think N-of-1 mouse is going to be good enough. Ideally you'd have mouse, rat, Drosophila, or whatever you can afford. We've finally gotten away from "everything shall be male at 60 days," and now everything shall be male and female. Maybe over the next couple of years we'll be able to increase the Ns further. So from mom and dad we make the F1s. The only thing I want to say about the F1s is just notice that they are isogenic but not inbred, and that's going to be critical in the next couple of slides, because I'm going to show you that we can make a massive diallel cross that's a virtual cross and doesn't cost us anything to make. All of the members are sequenced. So right now we're up to a hundred and fifty of these BXD strains. It's not ideal in the sense that it doesn't incorporate the complexity of the Collaborative Cross. The Collaborative Cross is ten times more complicated. It's more complicated than all of humanity crammed together: 50 million alleles here with minor allele frequencies above 0.1. So it's a mouthful. And it's particularly a mouthful because only 50 of them really survived. The idea initially was to generate 1,600 of these, but we reached too far and took essentially the mouse equivalent of a gorilla and an orangutan and mushed them all together, and bad things happened.
We should have known. We should have listened to evolutionary biologists at that point, but this goes back to 2001, when we got the Collaborative Cross started. Anyway, what we have are resources that are sort of like Finland, maybe a little more diverse. What we'd like to do is reach down and actually cover all of humanity. This is a problem I already alluded to: the two animal model worlds that we have to bring together. There's a nice review from Johan Auwerx's group showing how you can do that: the convergence of reductionist approaches and systems approaches. The conclusion for me is that our community has to work with you guys. Right now it's sort of, again, parallel play. We have to come together, if you want to be venal about it, just to defend our turf, to make sure we're relevant to NHGRI's mission and all of the IC missions. I think we're losing that battle. I see it for sure in some of the institutes. Basically, Josh Gordon didn't want to know much about mouse models for psychiatric disease anymore at all. So I think we have to really come together and collaborate and make sure that the translational relevance of our models is not in question. And it is painfully in question now. Work with human geneticists: that's kind of obvious, and a lot of us don't do that. There are not many people who straddle multiple species, and we need to work on that a little harder. Okay, rodent sequences. Titrating complexity is a big problem. Some communities go for mechanism, and that's really what they live for. So keeping it simple makes a lot of sense, because you get nice crisp results. But like Carolyn said, sometimes they don't generalize well. They're just not robust. If they're on Black 6, yes, maybe it's very robust on Black 6, but as soon as you put it on a 129, will it still be robust? If you do it in a mouse, will it be robust enough to translate to a rat or Drosophila? So titrating the complexity is a big deal. And what I'm going to show you is down there at the bottom.
You'll see something that says diallel cross. It's on the complex side, but it's complex with replication, and that's the important distinction between a diallel cross and the Diversity Outbred. You'll see the diagonal there. The diagonal is: if I breed Black 6 to Black 6, one to one, I end up with a litter of Black 6, and if I breed DBA/2J to DBA/2J, I end up with the DBA/2J strain. So that's the inbred diagonal. That's where almost all of our biology has been accumulated over the last 20 years. The phenome project is all inbred strains. Everything I've done, frankly, with one or two exceptions, has been inbred strains. But inbred is not what is good. Frankly, what is good is isogenic: the ability to replicate. And if you go off the diagonal there, you can make any one of 62,250 F1s in reciprocal pairs, so you can swap the parent of origin. You can make as many of them as you want. They're virtual. They don't cost NIH a penny, because all you need to do is keep 250 strains of mice happy at JAX. They're already there. They have way more than 250. So from 250 strains, given the fact that you've sequenced all of these parents, I know exactly what the sequence is of every one of these F1 progeny, down to the long variants and the mobile element polymorphisms. We've now just finished doing linked-read sequencing of 150 of the BXDs. We're doing the same thing with the rats. So any one of those off-diagonal animals has been deeply sequenced. The parents have been deeply phenotyped, and I'll show you that. Can you predict the outcome for that mouse? What's its phenotype?
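[Editorial sketch] The count of 62,250 follows from taking 250 inbred strains and forming every ordered dam-by-sire pair of two different strains: 250 × 249 = 62,250. The short Python sketch below illustrates that arithmetic and the idea that an F1 genotype can be written down directly from two sequenced, fully homozygous parents. The strain names, variant IDs, and function names are hypothetical placeholders, not part of any tool mentioned in the talk.

```python
from itertools import permutations


def diallel_f1_crosses(strains):
    """Return every ordered (dam, sire) pair of two different strains.

    Ordered pairs are used so that each reciprocal cross (A x B and B x A)
    is counted separately; the inbred diagonal (A x A) is excluded.
    """
    return list(permutations(strains, 2))


def predicted_f1_genotype(dam_alleles, sire_alleles):
    """Predict an F1 genotype from two fully inbred (homozygous) parents.

    Each parent is a dict mapping a variant ID to its single fixed allele.
    The F1 carries one allele from each parent at every shared variant.
    """
    shared = dam_alleles.keys() & sire_alleles.keys()
    return {v: (dam_alleles[v], sire_alleles[v]) for v in sorted(shared)}


# Hypothetical placeholder names for 250 inbred strains.
strains = [f"strain_{i:03d}" for i in range(250)]
crosses = diallel_f1_crosses(strains)
print(len(crosses))  # 62250

# Toy example with two made-up variants.
dam = {"var1": "A", "var2": "G"}
sire = {"var1": "A", "var2": "T"}
print(predicted_f1_genotype(dam, sire))
# {'var1': ('A', 'A'), 'var2': ('G', 'T')}
```

This only covers simple homozygous variants; structural variants and the parent-of-origin effects mentioned above would need a richer representation.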
That's where we need to be. We can't just play around with mechanism for the next hundred years. We have to deliver clinical care efficiently. I would argue that your group is doing a really good job because you're working with relatively strong-effect alleles, so I don't begrudge you your successes. But when it comes to things like type 2 diabetes, cancer susceptibility modulated by non-somatic mutations, heart disease, it's a big ugly world out there. Neurodegeneration: big ugly world. Those are the diseases that actually run up the bills, and we need to be able to predict susceptibility earlier so that we can reduce the bills and focus on health care rather than medical care. So that's, in a nutshell, what the diallel cross, or these replicable isogenic panels, are about. There's a problem here. Even when we dreamed up the Collaborative Cross in 2001, this idea of doing the diallel cross was on the radar. It didn't happen for many reasons. 62,000 is a big number. We obviously can't afford to do all of that, but we need a training data set. So we have to acquire data along the diagonal, then acquire data off the diagonal for training, and then we can do our testing with this very large ocean of off-diagonal space to work with. What we can't afford is to have everybody pick their own subset of the diallel cross, my study, your study, because if they do that then there's no conjugacy in the data. There's no way to mate the data sets and compute correlations among them. This is an issue where we need IMPC-style rigor. We need somebody to basically beat the community with a stick and say "do it this way," or maybe there's a big carrot that says "do it this way." Big carrots work better, by the way. The diallel cross gives you beautiful genetic architectures. So these are some of the things that are on the roadmap for 2020: G by G by G by G, which we know we have to contend with; G by E, which we know we have to contend with. It's
beautiful because you've got this isogenic population that's huge, with a completely defined genome, and you can ask: if I put that animal on a high-fat diet, what will be the impact on longevity, on heart disease, etc.? This is an old method. Diallels have been around since about 1951. I think the first paper was by Jinks. So the statistics are all nicely worked out. Here's what they're great for. I already mentioned some of the problems. As sociology, collaboration is a huge problem. You guys don't have it because you've been molded together, and that's brilliant and it's critical. Our community is all over the damn place. We do this F2 and that F2, this DO and that. So we really need to have the same kind of organization, top-down frankly, because I think we'll get a lot farther. Okay, so assume we have the data, which we don't, but let's assume we have data. I tend to do a lot of neuroscience, so here's my version of a slide we saw earlier today, with lots of brains. We need to take data from those brains and weave them together and understand what the risk of neurodegeneration is when the animal gets to be 18 months of age. So we've built a tool called GeneNetwork. I'm tempted to take you through this live just to show it to you, so let me do that. I think we still have a few more minutes. Oops, I hope that's big enough. This is the home for all of our data. We have not just mouse data; we have all of GTEx in here, as version 5, so we're down a couple of versions. A lot on Alzheimer's. These are data sets from what are called classic eQTL, expression quantitative trait locus, studies that we've been able to fish out of the literature or been given by colleagues, including Eric Schadt. For mouse we do the best. We have lots of mouse families.
I've mentioned the BXD family. This is the family that would be part of a large diallel cross, but not the only part; the Collaborative Cross would go in here. You can see I was having fun during the last talk and actually put in OXR1. Let's see if my karma is good. No, my karma is not good. Let me make sure that I'm online. I'll try one more time here, and if I'm not online, I'll just show you the screenshots. Anyway, some of you can probably get on. In this case I did a global search for OXR1, and what that will do is generate a list, if I didn't throw it away. This is one particular vector out of about 40 million vectors of data on traits in GeneNetwork. In this case there are about 1,600 traits on OXR1 in various populations: mouse, human, rat, Drosophila, potato perhaps. And the vector of data is down here. These are expression values. We have data for proteins, for metabolites, etc. You can then ask questions about the distribution. So there's the distribution of phenotypes. What is the range of variation? In this case the range of variation is relatively modest, about 1.4. You can get probability plots, box plots, violin plots. So it's just a good web service. It's more than just a database; it's the analytic framework on top of that, and it also includes real-time mapping using methods that adjust for the kinship differences that we also have to contend with in human GWAS. Here are all of the sets that were looked at for OXR1. I mentioned that there are about 40 million traits that were examined, and in this case we found 1,880, I'm sorry, 1,818 traits that have something to do with OXR1. So we can then ask questions about its genetic modulation. This column that's labeled Max LRS is the likelihood ratio statistic. Divide by about 5 and you get the LOD score.
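[Editorial sketch] The "divide by about 5" rule of thumb comes from the standard relationship between a likelihood ratio statistic and a LOD score: LRS = 2 ln(10) × LOD, so the exact divisor is about 4.61. A minimal Python sketch of the conversion follows; the function names and the example LRS value are illustrative assumptions, not values read from the demo.

```python
import math

# One LOD unit corresponds to 2 * ln(10) LRS units (about 4.605).
LRS_PER_LOD = 2 * math.log(10)


def lrs_to_lod(lrs):
    """Convert a likelihood ratio statistic to a LOD score."""
    return lrs / LRS_PER_LOD


def lod_to_lrs(lod):
    """Convert a LOD score to a likelihood ratio statistic."""
    return lod * LRS_PER_LOD


example_lrs = 46.0  # hypothetical value, chosen only for illustration
print(round(lrs_to_lod(example_lrs), 2))  # exact conversion, just under 10
print(example_lrs / 5)                    # the spoken shortcut, 9.2
```

The shortcut slightly understates the LOD score, which is harmless for eyeballing strong linkages like the ones on this slide.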
These are very strong linkages. For these genes, usually variants in the gene control the expression of the gene itself, so it's a self-controlled cis-QTL, and then you can look for downstream effects. It'd be fun to integrate it into MARRVEL at some level. So let me just go back and play the last few slides. I didn't show you all the phenome data, but I mentioned Mike Snyder as the best-phenotyped human being. The BXDs right now are the best-phenotyped population in the known universe. It's still a pathetically small amount of data given the amount of data you could get. We have only a handful of proteomic data sets, for liver and fat. We are getting beautiful proteomics. So that's one thing, Carolyn, you didn't mention: we need proteomic data. We need to get at the cutting edge of the genome, where it hits the biology more effectively. Transcriptomes are great, but frankly, we've been waiting for proteomics to mature, and it finally has. So we have terrific phenome data. Cloning genes using this method: in the same way that Francis Collins said that it was going from perditional to traditional, back in 1990 or whenever he said that, this is true now for QTL mapping. It's relatively trivial now to clone genes using forward genetics. Our group has probably done 20 to 30 at this point. We need the same thing for rats, and we're doing this with a lot of support from NIDA. NIDA spends about $200 million a year on rat analyses of drug abuse, and it basically evaporates after a couple of years because there's no way to integrate that data very effectively other than publications. So we're trying to provide the NIDA community with rat resources that allow integration using the same diallel cross. So, just some concluding remarks here. Really, we have to bring the communities together to make a case for the relevance of animal model work. It's not a no-brainer anymore.
I think we lived under the impression that it was. I certainly did in 2000 and 2005. In 2010 I was kind of going, "Hmm, I wonder." Now, in 2019, I'm definitely wondering, and what we've seen happen in Europe is a dangerous precedent. We certainly don't want that to spread across the Atlantic. We have to make the case that this is really relevant to humans. And I think if you were to ask clinicians, "How are you going to predict the outcome for you, you, and you at the age of four or five or six, given deep omics data, as to your ultimate risk for disease?", it's a little better than a crapshoot. But I think we can get there with animal models. I haven't mentioned it, but the diallel cross works great with KOMP resources, because you can make an F1 between 250 genetically diverse animals and your favorite allele. There's a very nice example from Catherine Kaczorowski's group at the Jackson Laboratory, who crossed the 5XFAD human alleles that were on Black 6 into the BXDs, did 20, 28 F1s between 5XFAD and the BXDs, and got just a wonderful spread of disease susceptibility to those five alleles in presenilin 1 and APP. And just thanks to a lot of people, obviously, who helped, and a lot of funding over the years from many agencies. NIGMS has been very kind to support GeneNetwork, and NIDA has been great in supporting the biology there. Thanks.

Rob, have you reached out to either
clinical people that have a lot of phenotyping, like, I don't know, Kaiser? Or there's a bunch of other clinics that are now trying to really sort through some of that.

Yeah, we have. I mentioned very briefly, I kind of whizzed through it, that we worked with the Vanderbilt group, Josh Denny. Josh is one of the leaders of the All of Us program. They have as good an EHR data system as you can have for a large medical center. But having said that, it's a sorry business. It's basically ICD-9 and ICD-10 codes in a good structure, but it's not quantitative data, with rare exceptions. When we tried to reconcile it with the mouse work, it was like, every one of our phenotypes is by definition quantitative, other than coat color. So we have nothing but quantitative data and they have nothing but ordinal or dichotomous data, so it's really hard to bring those worlds together. We did it, kind of, sort of. The beauty of the human data is that it has great resolution for the PheWAS. You're down to a hundred kb, whereas us guys working with mouse populations are plus or minus one megabase. So it does work. But it would be great to have rich human quantitative phenome data, and it'd be great to have richer mouse and rat phenome data. We basically have very little, despite all of our chest-beating about how great our data sets are.

So I might have a lead for you, potentially: Human Longevity, Craig Venter's thing, where he's trying to sequence a million people. He does very deep phenotyping, at least of rich people. Yeah, look, so the rich people's phenome is actually extant, so we have that, potentially. We should talk about that.

Rob, great talk. I think a lot of folks in this room appreciate our approach using a single genetic background.
We're also very aware of the limitations of that, which is the lack of context. You kind of alluded at the end to how we can use diverse resources to model, perhaps, large-effect-size conditions. I was thinking about the types of recommendations you might have, and where one would start. So imagine the scenario where you're faced with a recessive, likely loss-of-function disease gene. Would you start with Black 6, hope for the best, and then maybe add in diversity?

So Huntington's: piece of cake, right? And the 5XFAD, it's not formally a dominant, but effectively... I mean, it's not dominant at all on Black 6. It's lousy, as you probably know from Catherine's work. I don't really know how to address the issue of recessives. The inelegant solution would be to make the F1 and then make a small F2, but I don't like that because then you don't have the advantage of isogenicity. So my hope would be that a large number of, quote, recessive effects really have an additive effect. That may not be true for the sorts of variants that you guys are looking at. But typically, if you're a GWAS person or a quantitative geneticist, everything basically boils down to an additive effect. That's why the GWAS people never even talk about dominance, right? They don't even know how to compute it. So my hope is that there would be a sufficient additive signal that you could grab on to. You might have to tweak your phenotype, and you might have to dive into an endophenotype. So if you're studying schizophrenia, okay, maybe you have to study something like prepulse inhibition, where you get an endophenotype