So, let's move into the discussion phase. I'm going to start off by taking off on something that Callum had mentioned; there was a bit of discussion around this this morning, and it's something I hadn't stopped to think about carefully until your slide about the fact that we've got only 10 to the fourth clinical phenotypes. In your final slide, you said, okay, we need to come up with a minimal data set. So I'd like to have a bit of a frank conversation around that, because I'm not sure how we solve that problem, since disease is defined around these phenotypes; for better or for worse, that's what we have. So maybe you can talk a little bit more about how you see that going.

Well, I think some of it is using the information we already have better, and it's worth pointing out that obviously we're constrained by having to map everything back to existing data sets, but I think that itself is a constraint that we've allowed to occur. To give you some simple examples: one of the things I started doing was asking our MRI guys, on the scans of my patients, because I can't make it uniform yet, to include the jaw and the pharynx in the MRI. It's about six extra slices, so we can actually begin to use this to capture phenotypes that might be part of outflow tract abnormalities that are subclinical, which are seen in every one of the animal models for all the diseases that affect the right ventricle. There are really penetrant phenotypes, for example, even in the case of KCNQ1 loss-of-function mice, in the pharynx and in the jaw, as well as in the stomach. So there are things that we could begin to do just by changing the way that we think about standardizing the traits that we routinely measure, and beginning to analyze existing data sets using non-subjective strategies. So electrocardiography, as I mentioned: all of the fundamental signal and noise characteristics are incorporated in every single EKG, yet we're still completely dependent on subjective terminology that was introduced in 1947 by a committee. So actually saying, well, why don't we use machine learning to try and understand what we could actually pull out of the electrocardiogram would be a tiny first step in that direction. But to get back to this morning's discussion, I think thinking about it from the standpoint of what are the extant data that we have for all of the pathways that we know and love from model organism data, how could we then begin to characterize cellular phenotypes or physiologic phenotypes, whatever resolution we can get from those data sets, and then say, okay, can we actually map, let's say, the entire ciliome onto a set of peripheral phenotypes that were recorded in everybody? That's a real possibility. I mean, if you look at the gene list, you can see that it would be very tough to imagine that somebody who had a really powerful perturbation of that entire network has completely normal peripheral lymphocyte function as a starting point. Now, they might look normal at baseline, but if we find the right perturbation, maybe we would then be able to say this is somebody whose entire ciliome network is perturbed in a fundamental way, which gets back to some of the things we were talking about with Cricket's congenital heart disease sequencing effort.
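To make that EKG point concrete, here is a minimal sketch of the kind of unsupervised analysis being proposed: reduce raw waveforms to data-driven components and cluster them, with none of the 1947-era labels involved. The signals below are simulated, and the feature choices are illustrative assumptions, not a validated pipeline.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)

# Pretend these are 500 single-lead EKG beats resampled to 250 samples each.
# Here they are synthetic: two latent waveform morphologies plus noise.
t = np.linspace(0, 1, 250)
beats = np.vstack([
    np.sin(2 * np.pi * (3 + (i % 2)) * t) + 0.1 * rng.standard_normal(250)
    for i in range(500)
])

# Unsupervised dimensionality reduction: no diagnostic labels are used.
components = PCA(n_components=10).fit_transform(beats)

# Cluster in the learned feature space; the clusters are candidate
# machine-defined EKG phenotypes, to be interpreted only afterwards.
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(components)
print(np.bincount(labels))  # recovers the two latent morphologies
```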
So I don't argue that it's a straightforward thing to do, but I think it requires an active effort as well as an unbiased strategy, which I think will emerge, as was mentioned earlier, from patients. I think there's a huge movement in our patients and beyond to begin to collect data in a rigorous and quantifiable way that would actually impact this. So I think, as I said earlier, the scale of what we need to do here is so substantial that we need to think about all of those approaches and use them in a coordinated fashion.

We'll do Mark, Rex, Dan, Liz, Bob.

I know that Callum was being a bit hyperbolic in his presentation in saying, well, there's not a single datum in the electronic health record that I would use, because what you're talking about there is not only the standardization of the element but the standardization of how we measure the element. And while that's true, and certainly if we look at blood pressure, where there's an accepted way that you're supposed to take a blood pressure that's at least in some sense reproducible, the number of times that actually happens in clinical practice is minimal. However, if you have 70 blood pressure measurements, across a variety of different ways of measurement, that consistently show, you know, somebody with a blood pressure of 150 over 90, you can probably be relatively confident of at least getting some sort of a gross phenotype. So part of the thing is to say, out of those things that are represented in the electronic health record, is it a number of measurements? Is it measurements over time that would allow us a certain level of confidence to mine those elements to be able to do the type of work that we're doing?

I agree. I think you need to standardize measurement and stimulus. And as you point out, a lot of this is just thinking carefully about what the dynamics are that you're going to capture with time series instead of single cross-sectional data points. Those types of things, as Les mentioned earlier, mean thinking about proximate phenotypes that are actually fundamentally definable in a way that's linear over five log orders. And, by the way, you know what a panel of 100 drugs does to each one of them, so that if you did that, you could imagine you'd probably record half of what you need from a single skin sensor. That's the type of thing that I think is emerging in clinical phenotyping but has not been connected very directly with genomics. And ultimately, I think it needs to be.

So when you said there were 10 to the fourth clinical phenotypes, is that because you're just looking at the number of ICD-9 or 10 codes? Or how did you come up with that number?

So we looked at every single billable component of a trait, that is, every single billable test, and then looked at every reported component of that test in clinical practice. But certainly, as the eMERGE Network, for example, has learned, you gather a large amount by data mining of problem lists and patient and physician notes and all those sorts of things.

And I actually wanted to contrast that 10 to the fourth number. So the HPO, how many terms are there in the HPO?

1.1 times 10 to the fourth.

They're the same ones.
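Mark's point about repeated measurements can also be made concrete. The toy sketch below aggregates many noisy blood pressure readings into a coarse but defensible phenotype; the simulated data, the 140/90 cutoff, and the persistence threshold are all illustrative assumptions, not a clinical rule.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)

# 70 systolic/diastolic readings for one patient, taken under varying
# conditions, hence the measurement noise.
bp = pd.DataFrame({
    "systolic": 150 + rng.normal(0, 12, 70).round(),
    "diastolic": 90 + rng.normal(0, 8, 70).round(),
})

# Robust central tendency (median, not mean) dampens one-off errors.
medians = bp.median()
# Persistence: what fraction of all readings are elevated?
frac_elevated = ((bp["systolic"] >= 140) | (bp["diastolic"] >= 90)).mean()

# Call the gross phenotype only when central tendency and persistence agree.
hypertensive = medians["systolic"] >= 140 and frac_elevated >= 0.5
print(medians.to_dict(), round(frac_elevated, 2), hypertensive)
```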
So Callum has touched on what I think is a really interesting and important part of this challenge that faces us today and tomorrow. We're obsessed by the genome and the transcriptome and the ciliome, and we haven't talked much about the phenome yet. And I'm a big believer in the idea that if we curate the phenome and organize it right, it'll be as powerful, if not more powerful, an input to these sorts of multi-omic analyses that we all think will yield useful information. So the problem is that I, like Rex, think that the electronic medical record is the place to do this. And the more granular the phenotype, the less likely it is that it's going to be captured by your average clinician using the EMR in the course of clinical practice. So you have this paradox: you want them to record more information along the way, or you want some mechanism to record more information, and at the same time they're under pressure from the chief of their clinical division, for example, to see more patients and not waste their time doing all the things that we would want them to do. So the phenome is hard.

I can't resist following up also on Mark Williams' example, my friends in bioinformatics. Oh, and by the way, my sign is wrong, because I'm now a professor of bioinformatics as well as all the other titles I have. So now I'm an expert, Mark. If you want to diagnose hypertension, the worst metric to use is blood pressure, because the well-controlled hypertensive has normal blood pressures, and the 25-year-old with a broken leg who doesn't have hypertension has a high blood pressure. So there's this problem that we think the electronic medical record is a giant resource waiting to be mined, and we think we need more, and more creative, ways of mining it. I really like this idea of unsupervised machine learning to find new phenotypes: let the data talk to us instead of us telling the data what we want it to say. And hopefully we'll get, for example, clusters of hypertensives or clusters of type 2 diabetics or clusters of whatever that will then enable the kind of things that you want to do in model organisms.

Same thing? Yeah.

So one thing I really liked that I heard from Callum was the idea that certain aspects of the data that we collect have a richness to them that is not captured in reports. The EKG data is one, and certainly imaging data. And there are plenty of studies that have gone back and re-looked at these sorts of things. But again, I'm not aware of projects that have looked at this in an unsupervised machine learning way to try and do discovery. That would be something that would be really interesting to do, particularly if we tied it to some particular measure of interest, whether in the cardiac realm or neuroimaging or whatever.

So I think there are projects, many of them funded by NIH, to try and do things like this. I think they are not necessarily at scale, and they've not been implemented in parallel with genomic efforts. But obviously that's the direction; that's the reason for meetings like this, how do these things come together in a way that is actually integrated around shared goals?
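As a minimal illustration of that "let the data talk" idea, the sketch below clusters a simulated diabetic cohort on routine measurements with no prior subtype labels. The variables, values, and hidden two-subgroup structure are invented for illustration.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(2)

# Synthetic cohort: 300 patients x 4 routine measurements, with a hidden
# two-subgroup structure (e.g. a lean early-onset form vs. a classic form).
group = rng.integers(0, 2, 300)
X = np.column_stack([
    rng.normal(np.where(group, 9.5, 7.5), 0.8),    # HbA1c (%)
    rng.normal(np.where(group, 24.0, 33.0), 3.0),  # BMI
    rng.normal(np.where(group, 30.0, 55.0), 6.0),  # age at onset (years)
    rng.normal(np.where(group, 4.0, 15.0), 3.0),   # fasting insulin (uU/mL)
])

# Model-based clustering on standardized features; in practice the number
# of clusters would itself be chosen from the data (e.g. by BIC).
Xs = StandardScaler().fit_transform(X)
subtype = GaussianMixture(n_components=2, random_state=0).fit_predict(Xs)

# Agreement with the hidden structure (cluster labels may be flipped).
agreement = max(np.mean(subtype == group), np.mean(subtype != group))
print(round(float(agreement), 2))
```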
Bob, are you on the same topic or different? Okay, so Peter, did you want to jump in?

Yeah, I mean, another thing to say is, yes, there are now maybe 10 to the fourth phenotypes; I think the HPO will probably land at about 20 or 30,000. But what hasn't been discussed yet are clusters of phenotypes. And just as an example of how powerful this is: three groups have used the Human Phenotype Ontology and large cohorts of 10 to 40,000 patients, I think, and they first tried binary approaches to find new genes. Comparing ICD-9 codes failed; comparing individual HPO terms failed; and then they used machine learning to find clusters of phenotypes that correlated with genotypes, and each of these groups has now discovered between, I think, four and six genes. And if we start to apply this to electronic health records, you can also imagine temporal developments of phenotypes that in combination would give us tens to hundreds of possible phenotype clusters. So I think the phenotype landscape, even with the things that we're measuring now, could be a lot richer than it is, and it's going to require software engineering to actually get this data out of electronic medical records and put it into a data structure that bioinformaticians can compute over. That's not really happening enough now.

Liz, then Bob, then Cricket.

And just to follow up on that point, one of the things that's going to be important is to have the phenotype data reside within the EHR. The minute you move that data out of the EHR and start gathering it somewhere else, you lose that connection to the patient. You lose the ability to update it the next time you want to mine it. So we need to work out, of course, how to get that data collected better in the EHR, along with the genomic data. And that's what we have to mine. We don't want to put it all somewhere else, because then you lose the ability to update it.

Bob.

So with respect to long QT, just as an example, I think of the EKG abnormalities as a biomarker of the disorder. And even when you have that biomarker, you're not absolutely going to have a catastrophic event. So I guess my question is, does the deep phenotyping of the model organisms permit you to explore the possibility that there are other biomarkers or other phenotypes, which you could score in vivo in a human, that might replace the classical biomarker and have greater predictive value?

Cricket.

So I really like the electronic medical record, but I'm very worried about two aspects of it. The first is that increasingly we are using platforms that allow, if I may be so blunt, a dumbing down of the detail, and smart phrases that allow a busy clinician to insert phenotypic information without the level of traditional thoughtful consideration and nuance. And I think that if we're going to really have these phenotypes matter, it would be wonderful to have some input into these prefab and expensive electronic medical record platforms. Alternatively, and I think one of our speakers said this this morning, or perhaps last night on the plane, no one is more interested in the person, the patient, than the patient him or herself. And the other way to do this would be to buttress that with an opportunity for patients to provide that data in a relatively well-defined platform that really could allow more searchability than natural language processing.
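To ground Peter's cluster point from a moment ago, here is a toy sketch of one common building block: comparing patients by their sets of HPO terms rather than by single codes, before grouping them and testing clusters against genotype. The patients and term IDs are placeholders, and real analyses typically use semantic similarity over the ontology graph rather than raw set overlap.

```python
from itertools import combinations

# Placeholder patients, each annotated with a set of HPO-style term IDs.
patients = {
    "P1": {"HP:0000001", "HP:0000002", "HP:0000003"},
    "P2": {"HP:0000001", "HP:0000002", "HP:0000004"},  # overlaps P1 heavily
    "P3": {"HP:0000005", "HP:0000006"},                # unrelated profile
}

def jaccard(a: set, b: set) -> float:
    """Set overlap: 1.0 for identical term sets, 0.0 for disjoint ones."""
    return len(a & b) / len(a | b)

# Pairwise phenotype similarity; clustering this matrix, and then testing
# the clusters against genotype, is the step that turned up new genes.
for p, q in combinations(patients, 2):
    print(p, q, round(jaccard(patients[p], patients[q]), 2))
```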
So I want to touch back on something that Bob said and tie it into Cecilia's discussion. If your saturation mutagenesis for congenital heart disease is right, that there are 272 potential genes that are going to play a role, that does kind of put some boundaries around that. So I'm just curious, Cricket, would you buy into the mouse as a potential saturation mutagenesis platform that we could screen to see if we can put some boundaries around this? Because I think, to your point, Bob, if we're going to have alternatives, we have to put some definition around what that means in addition to these other biomarkers. And that's a potential way to do it.

So I would only comment that Cecilia's work is recessive, and when we do the same kind of analysis for dominant, let alone polygenic, disease, we get 400 dominants. So we've already upped the ante to 600. And I think that mice are wonderful for being able to get homozygous variants, we all know that, and the phenotypes are expected to be more profound. But unfortunately, it's probably the lower estimate rather than the upper estimate. I don't know how well ENU can really hit every gene; I don't know that you've even considered whether there are some genes that are ENU-resistant. But in the course of human experience, I suspect anything could be hit. So it's a minimum of 600: maybe 250 to 300 recessives, and at least 400 dominants.

Liz?

Just a quick question on that. Do you think the maximum is also close to 600, and the multigenic will be milder hits in those same genes?

No one's looked at it.

What's your guess?

You know, in the multigenic, I mean, I actually believe there are polygenic contributors to congenital heart disease. And we know this from when you look at the offspring of people in whom we think there is a mutation. And it's either incomplete, dare I say the word, penetrance, or exposures or something, or there are more modifying genes than we have even begun to think about. And how penetrant those modifiers are, and at what point they become a component of a polygenic model, I don't have an answer. I'm sorry.

Yeah, I was just gonna say that it's possible that among the 400, some of our genes are represented as well. I don't know; I've never actually looked, to be honest. I think it would be worth looking to see whether there's any overlap. But suffice it to say it's a larger number. But I think it's not an impossible number of genes, right? If we put resources in, it's definitely something that's doable. I just wanna put that out there. I don't know if there's the will to do that, but I think that it is potentially feasible. We're not talking about 10,000 genes or even 1,000 genes; if we're in the range of 500, I think it's potentially a feasible project that one can pursue. So I just wanna put that out there as something NIH may have to think about at a higher level.

Scott.

I'm glad that we've finally gotten around to this topic of genomics and systems biology, because I think for complex traits it's very relevant. And what I haven't heard yet, and I think it's worth discussing because I think it changes your perspective somewhat, is that if it really is, whatever, 200, 400, 600 genes, whether it's even 100 for asthma or 200 for hypertension or whatever the disease is, we're talking about a different way of presenting this data clinically to clinicians. And if really the ultimate answer here is systems biology, it's not the question you raised in your talk, am I gonna put this variant in the chart? It's, where does this variant sit relative to the other 200 variants? And what does the clinician really need to know to make an informed clinical decision about treatment with a patient?
So I think, when you talk about the translational aspect of this, it changes the game, because genetics is deterministic for sure in the Mendelian disorders and in some percentage of complex traits; how deterministic it is depends on the penetrance, the pleiotropy, et cetera, et cetera. But that having been said, there are going to be a bunch of people for whom it's not going to be deterministic. And so it still comes back to what information you're going to give people and how you're going to give it to them, and if you believe in systems biology, I'm not sure we've answered that yet.

So I was actually going to comment that congenital heart disease is one of the few areas where the genetic architecture is quite well studied. There, the familial recurrence risks are on the order of 70- to 90-fold. And in addition to that, there are also some reasonable population studies looking at clustering, around dioxin plants, for example, for hypoplastic left heart. So both of those are actually consistent with a very unusual model, which is a very small number of genes in the individual patient. Once you get above 10, you're asymptotically approaching a single-gene disorder, but with perhaps stochastic and environmental contributors. And so I suppose the question is, if we still aren't able to reduce to simplicity a field like that, how are we ever going to do it for fields where we just have no idea what the fundamental architecture of the trait really is?

Carol?

So, Cecilia, you started to raise something in your talk, and I wanna build on that and ask you and Callum to both address it, getting back to the basic theme about bridging between basic science and clinical science as it relates to phenotype. So: describing phenotype in model systems, zebrafish, mouse, whatever, and translating those phenotypes into the human clinical phenotypes, whether they're in electronic health records or not. I would like your perspectives or your experience in trying to do that. How easy, how difficult is that? Is that one of the addressable barriers in connecting basic science and clinical science and genomic medicine? Is that an opportunity here? I'm just interested in your experience and your thoughts on that.

So we tried to do that by cross-referencing terminologies. With the MPO, the Mammalian Phenotype Ontology, terms, we really use clinical terms to describe these mouse phenotypes. At the same time, we use the Fyler codes, that's the Boston Children's Hospital congenital heart disease code, so that every animal we coded both ways: we used MPO terms as well as the Fyler codes to describe the phenotype. And our hope is that clinicians who may come searching, cross-referencing to find the disease model, would be able to actually find what they're looking for. I know that for congenital heart disease that clearly can be done, and I would assume that across different phenotypes there would be some parallels. But I don't think that's being done systematically; I think that's one of the problems. And I wonder whether MGI, when you curate information from lines, whether it's possible to have clinicians as part of that process who could then provide curation with the terminologies that are used clinically, so it becomes more clinically useful. I mean, I'm very mouse-centric, so obviously this is broader than that, but that's my own experience.
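A minimal sketch of the dual-coding scheme just described, with placeholder IDs standing in for real MPO terms and Fyler codes; the point is simply that a record indexed under both vocabularies is searchable from either side.

```python
from dataclasses import dataclass, field

@dataclass
class PhenotypeRecord:
    line_id: str                                          # mutant mouse line
    mpo_terms: list[str] = field(default_factory=list)    # Mammalian Phenotype Ontology terms
    fyler_codes: list[str] = field(default_factory=list)  # clinical CHD codes

# Every animal is coded both ways (IDs below are placeholders).
records = [
    PhenotypeRecord("line_0001", mpo_terms=["MP:0000001"], fyler_codes=["2170"]),
    PhenotypeRecord("line_0002", mpo_terms=["MP:0000002"], fyler_codes=["1450"]),
]

def models_for_clinical_code(code: str) -> list[str]:
    """Lets a clinician find mouse models using the clinical code they know."""
    return [r.line_id for r in records if code in r.fyler_codes]

def codes_for_mpo_term(term: str) -> list[str]:
    """And the reverse: map a model-organism term to its clinical codes."""
    return [c for r in records if term in r.mpo_terms for c in r.fyler_codes]

print(models_for_clinical_code("2170"))  # -> ['line_0001']
print(codes_for_mpo_term("MP:0000002"))  # -> ['1450']
```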
So the short answer is yes, we have clinical geneticists on our advisory board, Gail being a good example, who are helping us do that. But again, I think it's larger than that; there's a lot of work left to do, right? And probably some better ways to do it, so that it is more systematic and scalable. And I should just mention that even amongst cardiologists, right, there's disagreement about terminologies. So there was this big international effort to come up with a standardized terminology between the European consortia and the US and Canada, and even now that's not been resolved. So I think the problem is much deeper, because there's not even uniform agreement on the clinical side, and I think that's probably true in other fields as well.

I think that's the subjective component of the EHR. It's not that the EHR is a bad place to put stuff; it's just a function of what stuff you put there. It's worth pointing out that EHRs, as Cricket said, are designed around billing. There's one billing code for every single congenital heart disease: all of the above. There's no subset; it doesn't matter whether you have a coarctation or a hypoplastic left heart, it's the same billing code. But when we've tried to do this systematically in fish, we only did it in a couple of phenotypic areas. It's actually interesting: you can very quickly map what goes from a 48-hour-old fish larva onto an adult human or a pediatric human or an adolescent human. And in fact, that type of thing, again, is something that we would need to do systematically as a community across all of the model organisms. Some of the work of organizing it has already been done in some manner. I think one of the nice things about functional genomics phenotypes is that they're immediately translatable. But there are other phenotypes for which we will have to actually build assays in humans or build assays in model organisms. You can imagine that there are some things that are so well conserved that it doesn't matter which organism you look at. On the other hand, we'd be foolish to try and model male pattern baldness in the zebrafish. So there are some obvious things, but there are some other things that are not quite so obvious that I don't think we would ever get around to without having a systematic mapping effort along the lines of the ontologies that Peter and Melissa and others are driving for.

Peter or Jupe?

Yeah, I just wanted to, oh, sorry. We've heard a lot about prediction algorithms, machine learning. I don't think that's gonna work that well in the clinic. I think there's a new area in translational bioinformatics, which is explanatory algorithms: why actually did the computer come to a certain prediction? It all comes down to the difference between whether we're making expert systems, maybe like Watson, which I don't think are going to work in our lifetime, or systems for experts, and that's something which I don't think has been considered enough yet.

Oh, Jupe, go ahead.

We've been talking for a long time in this community about how there must be unrecognized structure within the EMR, there must be unrecognized patterns. I certainly have been saying that for a long time, but at the end of the day, there are very few examples of that.
So that may speak to your point, or it may be that we're just not going about it the right way, or it may be that we need to stick more structured information into the electronic medical record. So there's this whole push, and we'll be seeing more of a push, to include sociocultural determinants of health in a systematic way. You go to the doctor and you fill out a questionnaire, and you tell them what your diet's been for the last week and how far from the grocery store you live, those kinds of things that are as important for health as the things that we've been talking about today. So it sounds like it should work, but it hasn't yet come to fruition. And maybe the data sets aren't big enough, or maybe the computers aren't programmed right, or maybe it's all just a hallucination. I hope not the latter.

But I think it's also the way the data are entered into the EHR. As I said, I mean, I was not being hyperbolic; there is almost no single standardized element. Because, well, you're the division director.

So I wanted to come back to another one of your slides, Callum, which stays in line with the phenotyping that we're talking about, and it comes down to longitudinal data. One of the things I've seen in some of these complex cases is that even if you think it's a single gene and you have all of this longitudinal data, sometimes the noise in that longitudinal data across all the different phenotypes makes it quite difficult to get at the phenotype itself. So what is the actual causal phenotype? For some of these syndromes, you see a lot of manifestations. So I want more data, because I think that's important, but how do we put some structure around this in a way that we can get to what is the important data? Because I think that's a huge problem, and as we get into the EHR information, that's where you start diluting out, in many cases, what is the most important information. So what are your ideas around that?

I mean, I think it's exactly what we talked about this morning: it's proximate phenotypes that are at the resolution that allows us to connect them mechanistically to whatever the causes are. One of the problems is that most of the phenotypes we measure are things that we decided to measure as a community based on historical happenstance. Blood pressure is there because you can actually measure the height of a column of blood when you cut the carotid of a horse or whatever; I mean, that's literally why we picked it. There are so many different factors that go into it that it beggars description to be able to rationalize why we would then, 300, 400 years later, still be completely focused on it and have it be the only residual phenotype through which we are able to manage risk in the syndromes that present with hypertension as a manifestation. So I think what we need to do is begin to start in the areas that we understand well and progressively standardize what we do and move out. This is partly why I think it can only happen in the context of existing healthcare. We're gonna have to get everybody to change over time. I don't think we're gonna have a parallel project of a scale that is sufficient to identify the major determinants of common disease if we don't have it built into the electronic health record and the whole healthcare system, the whole healthcare acquisition and delivery system. And, obviously, one of the traits that I think is very important, which we have really never systematically studied, is drug response.
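Since drug response is the missing trait being pointed at here, a toy sketch of the "delta on drug" phenotype elaborated just below: the change in a measured trait before versus after a drug start date, computed per patient from longitudinal records. The schema, dates, and numbers are invented for illustration.

```python
import pandas as pd

# Long-format observations: one row per (patient, date, systolic BP).
obs = pd.DataFrame({
    "patient": ["A"] * 4 + ["B"] * 4,
    "date": pd.to_datetime(
        ["2020-01-01", "2020-02-01", "2020-04-01", "2020-05-01"] * 2
    ),
    "sbp": [158, 162, 138, 135, 155, 151, 149, 152],
})

# First fill date of the drug of interest, per patient (assumed known).
starts = {"A": pd.Timestamp("2020-03-01"), "B": pd.Timestamp("2020-03-01")}

def drug_delta(df: pd.DataFrame, start: pd.Timestamp) -> float:
    """Mean trait value after the drug start minus the mean before it."""
    pre = df.loc[df["date"] < start, "sbp"].mean()
    post = df.loc[df["date"] >= start, "sbp"].mean()
    return float(post - pre)

deltas = {p: drug_delta(g, starts[p]) for p, g in obs.groupby("patient")}
print(deltas)  # A responds (~ -23 mmHg); B barely changes (~ -2 mmHg)
```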
So why is it that, decades after some of the early work in pharmacogenomics, there's still not even an effort to record drug response, or what the delta is in a particular trait when you add a drug, on a large scale across the population? We never do any of those things. There are no epidemiologic studies, for example, that incorporate a dynamic perturbation using a small molecule. And that's something that you could immediately take to yeast the following day. So I think it's that type of understanding of what the anchors are. Genomics is a powerful anchor because the language goes all the way from top to bottom. We need phenotypes to do the same, and many of those are by definition going to be at the cellular level, and ideally the best way of thinking about them is to organize them around stimuli that can also be applied across every level. I'm sure it's not the only way to do it; the difficulty is getting from here to there, as you point out.

So, Callum, you also mentioned next-generation phenotyping, I think it was you. What were you thinking, in terms of what in your average patient you would like to have measured, as sort of the next-generation phenotyping in an EMR?

So we've begun to do this in very low throughput in our clinic, thinking two things, and sort of aggregating some of the conversations that you've heard this morning: that we need perturbations, and that the traditional clinical data input is relatively random and poorly organized. So we're beginning to have patients come and actually spend some time in the clinic where they're undergoing systematic perturbation, and measuring things that we would not traditionally measure. For example, as Dan pointed out, some symptom scales; but we also started measuring, partly thanks to Cricket's insights from her work, neurological phenotypes in patients who came in with cardiac disorders. There are lots of clinically available nerve conduction, chloride transit times that you can measure in literally a minute and a half as people are checking in, and we have begun to use those to stratify clinical disorders. That's on a small scale, in already genotyped populations. I don't know, and I think you'd have to have some sort of unbiased strategy to begin to imagine how this would be feasible across the entire population. But think about existing data sets that inform risk of subsequent disease: there's everything from grocery purchases, which are being recorded by every advertising agency on the planet at unbelievably granular resolution, and which are a whole set of exposures that we've never really measured, all the way through to ambient phenotyping using remote sensors. For example, one of the things we're about to implement is a near-infrared camera that picks fiducial points on the skeleton and so can map gait abnormalities in a clinic waiting room. That type of thing, because gait is associated with cardiomyopathy, makes a huge amount of sense to implement in a cardiomyopathy clinic. It's obviously not necessarily of such relevance in an OB-GYN clinic, but if you imagine how the entire healthcare system could be turned towards this type of thing if it were properly incentivized, you might actually have a very powerful set of tools. I mean, one of the examples I always quote, because it's trite and simplistic, is that if you look at diabetes, it looks like a very complex trait.
If you add one or two variables like height and shoe size, you can pull out several of the major Mendelian forms, because they're dwarfisms or gigantisms. Those are actually standardized measurements for which there are even international scales of cross-reference. So you can begin to imagine that if we'd recorded that type of information, we might actually have at least one additional standardized phenotype in the electronic health record.

So shoe size may be the first next-generation phenotype.

Yeah. Yeah. Gail?

Yeah, I think the other thing, and we've heard it mentioned some, is that when we see a child, we would love to get measurements on the parents. Now, some things I can do; but if they're gonna see a pediatric ophthalmologist, he ought to be able to examine the parents, and getting echoes on parents can be incredibly problematic. Part of it, a lot of it, is financial, but it's also the burden on the family. If you could do it right when they were there, because they're too busy to come back and do it on their own and all, I think that would help us sort out again some of the genetics, because if you can't measure the trait in other members of the family, it's hard to know what to do if they have the variant or not.

Just one other facet to add, not to make it more depressing, but most of our clinical research, and much of our care and clinical decision making, is biased because of this cognitive error: we are currently wedded to a phenotypic taxonomy of disease and phenotypic ascertainment of the people that we study, which means that we are biasing our understanding of the relationship between genomic variation and phenotype based on our presuppositions of what we think these diseases are. And the only way we will ever be able to do prediction from genomic variation is if we understand that full spectrum and complement our phenotype-based research with genotypic-ascertainment research, that is, ascertain people by their genomic attributes and then ask the question, what is the spectrum of phenotype that is associated with these variants, and make the information go both ways; then we can begin to put the pieces of the puzzle together.

And that gives us natural history too.

Yes, absolutely. Bob, and then Cricket.

One of the types of information that you can get from the EHR that's fairly binary is what was actually done, rather than what the result of what you did was, which often is not binary. Which is to say: was an echocardiogram done? Was a referral to an endocrinologist done? Was a CBC done? Those are sort of binary questions that are basically surrogates for what the clinician is thinking and what diagnostic pathway they're going down. It's not perfect, but it's something that is more computable than trying to read what the clinician wrote.
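A tiny sketch of that "what was ordered" idea: turning an order log into binary per-patient features. The schema and order names are assumptions for illustration.

```python
import pandas as pd

# An order log: one row per order placed, regardless of its result.
orders = pd.DataFrame({
    "patient": ["A", "A", "B", "C", "C", "C"],
    "order": ["echocardiogram", "CBC", "CBC",
              "echocardiogram", "endocrinology referral", "CBC"],
})

# One row per patient, one 0/1 column per order type: a crude but fully
# computable trace of the diagnostic pathway the clinician chose.
features = pd.crosstab(orders["patient"], orders["order"]).clip(upper=1)
print(features)
```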
So I just wanted to add to Les's thoughts, because I think this idea of longitudinal phenotyping with genotyping is absolutely essential. And I would only add that this, to me, is where academic health care centers belong: we have longitudinal records; we have perhaps been the earliest and most invested adopters of an electronic medical record that works, for better or for worse; we have experts in those disease phenotypes; and we have a cadre of sick people who come to us because we have experts in those phenotypes. So, integrating the genotype with those longitudinal medical records. And I think around the room, many of the institutions represented here have created biobanks that allow for broad-based genotyping. And I think that's something that should be endorsed, and frankly supported, by NIH, because it's so powerful to have all of the information about a disease in one place, with a focused expert cadre of investigators, PhD scientists, clinical scientists, and the like to gather around it. The only other thing I would add is that sometimes we forget that as geneticists, we like extremes, okay? And we've heard a lot about Cecilia's and other early extremes, but the flip side of that should be that we think about the really healthy older cohort: those individuals who get through all of this stuff and are alive and well and can tell us about it. And I think that it would be very worthwhile to have genotyping and phenotyping of those individuals. Now, I frankly don't believe that anybody is completely healthy, but I'm struck, and I think all of you have relatives who have, as I do, maybe bad diseases below the neck, but above the neck they are as sharp as can be. And what does that teach us about the genetic drivers of diseases, as well as phenotype and compensatory responses? So I just think that normal controls are something we never see supported at the NIH level. And increasingly, I think we need normal, or if you will, healthy people whose genotypes and phenotypes we really understand.

Mark? Then Callum?

Yeah, a couple of responses to that. First of all, and maybe the argument is whether or not Geisinger is an academic medical center; we don't consider ourselves to be an academic medical center, but I would take issue with academic medical centers being the best source of longitudinal data, since the experience we've had is that most of them have episodic data associated with more acute illness, and the majority of care is being provided outside of academic centers. And related to that last comment, we have individuals in our MyCode biorepository who are between 90 and 100 years old for whom we have all their data from the time they were born until now. Now, granted, 80% of it is not in a computable form, but it's there. So I think that's the power of having a more embedded approach to associating genotypic information with a range of things. And to Les's point, I think we're beginning to see how a PheWAS type of approach can yield some data, but it makes Callum's point even more important, which is that if we're going to do genome-first ascertainment, to be able to understand what the range of phenotypes is, we have to have way better phenotypes to assess that. Because the advantage you have with phenotyping is that at least you kind of know what you're looking for, and so you can torture the data to give you something that is a reasonable proxy for what it is that you're actually looking for. Whereas if you're doing a genome-first approach, you don't necessarily have the luxury of being able to understand what the electronic health record is actually telling you. So I think in some ways what we're saying is that these two aspects are inextricably linked if we want to move this forward. We have to be able to do both.
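A toy genome-first scan in the spirit of that PheWAS comment: take carriers of a variant and test many EHR-derived phenotype codes for enrichment. Cohort sizes, code names, and rates below are simulated, and a real analysis would adjust for covariates and multiple testing.

```python
import numpy as np
from scipy.stats import fisher_exact

rng = np.random.default_rng(3)
n_carriers, n_controls = 200, 2000

# Simulated per-code case rates in carriers vs. controls; code "P3" is
# built to be genuinely enriched in carriers.
pheno_rates = {"P1": (0.05, 0.05), "P2": (0.02, 0.02), "P3": (0.15, 0.04)}

for code, (rate_car, rate_ctl) in pheno_rates.items():
    a = rng.binomial(n_carriers, rate_car)   # affected carriers
    b = rng.binomial(n_controls, rate_ctl)   # affected controls
    table = [[a, n_carriers - a], [b, n_controls - b]]
    odds_ratio, p_value = fisher_exact(table)
    print(code, f"OR={odds_ratio:.1f}", f"p={p_value:.2g}")
```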
Callum.

And I was just going to emphasize, in response to Cricket's point, that exactly: if you have quantitative traits, you capture both ends of the spectrum, as well as all the information in the middle. This is fundamentally an information-content problem. And one other thing I would say: I think this will be solved outside traditional biomedicine long before we solve it. There are huge movements, the quantified-self movement, commercial health movements, and commercial activities of other sorts, where there's actually a vested interest in improving the information content on individuals for a variety of different ends. So I think it's a mistake to assume this is only going to happen in academic medicine. I mean, I agree with Cricket that there's good in academic medicine, but I think fundamentally, even from the standpoint of scale, this has to operate in a much different fashion.

Steven, last comment?

Oh, wow, last comment. I'll pay you back later. That's the Chair's prerogative, I'm just saying. So I have to disagree somewhat with the tone of this conversation. In the studies that have been performed to date, with really crude phenotypes coming out of unstructured EMRs, with disparate, not necessarily academic clinicians, we're finding very high rates of diagnosis today, and things that are clearly scalable. And so I agree that from a basic research standpoint, you want infinitely beautiful, precise phenotypes. But from the standpoint of implementing genomic medicine at scale, we have to understand that we're aiming to use conventional medical records and conventional, imprecise measures, and that we don't understand the phenotypic diversity of most genetic diseases. And so precise phenotypes are sometimes our enemy and lead us down the wrong path. So I think that even with several thousand phenotypic features, not even 10,000, and all of the imprecision in our medical records today, the field is still incredibly ripe for us to do the pivotal studies to start to measure phenotypic heterogeneity, variability, dynamics, topography, and I'll stop there.

Well, I think that's a great close. I will add one other thing, which is that maybe Genome 10, although we haven't gotten there yet, should be a meeting on what a healthy genome is.