Thank you, Tira, and good morning, everyone. It's a pleasure to see you all here, and it's fun to be here to talk about a topic that's near and dear to my heart. I'm going to cover a fair amount of ground. I'm glad to answer questions; most of them usually come at the end, but if I'm not making myself clear and you just can't wait, go ahead.

First of all, my disclosures. I have no relevant financial relationships. I do disclose that I'm a real enthusiast about genetics, so I hope that comes through to you. It's a wonderful field if you're thinking about going into it. It underlies all of biology. It is basically a hunting license to do whatever you want in biomedicine. So I urge you to think about that career if you're at that stage of your career.

I'm going to talk about some features of Mendelian disease and then review the rapidly evolving field of clinical DNA sequencing. Then I'm going to talk about disease gene discovery, results and tools, and I'll focus particularly on the Baylor-Hopkins Center for Mendelian Genomics, which is one of now four centers around the country that are charged with finding the genes responsible for as many Mendelian phenotypes as possible.

We customarily think of Mendelian disease as being quite rare, and yet it is becoming increasingly prominent. I see that this slide says "this month." This is actually, I think, from January 2015, so I apologize for that error. But the point is that in any month, if you look at all four issues of the New England Journal, you will see a lot about Mendelian disease. In this particular month, let's see here. Can you hear me? Okay?
You can see that there were typical Mendelian disorders with onset in childhood, but look here: here's one that's adult coronary artery disease. And if you looked in the editorial section, there was even an article about ethical issues in screening for monogenic genetic disease. So there's a lot of interest in Mendelian disease throughout the biomedical community right now, and I'm going to talk right now about why that might be so.

First of all, the genome project obviously provided a reference sequence, and that made finding the relevant disease genes much easier. The availability of new sequencing technology that dramatically decreases cost and increases throughput also gave us many new avenues for finding genes responsible for Mendelian disease. And things like the HapMap project and the 1000 Genomes Project gave us an appreciation of the extent of, quote, "normal" human genetic variation, not only in North America and Northern Europe, but in populations around the world.
So that turns out to be a tremendous resource. And lastly, there has been the development of genomic and genetic strategies to identify responsible variants and genes.

So the first thing you might ask is: when should I think of a Mendelian disorder if I'm a physician seeing a patient, or somewhere in the healthcare profession? For many Mendelian disorders, not all but many, the phenotype includes multiple systems that are not easily related one to another. Many Mendelian disorders, but not all, have a relatively early age at onset, often in the first decade of life. The recessive ones are, of course, increased with consanguineous unions. And if you find multiple affected sibs and/or generations, then obviously that's a pretty key clue. If you think about it, there are inescapable rules of biology about how genes are transmitted from one generation to the next, and we use those so-called Mendelian rules to help us evaluate candidates for Mendelian disorders. It's one fundamental bedrock of genetics that whatever you find pretty much has to be put in this context.

So although we do think about Mendelian disorders as having their onset in childhood, I would submit that there are many Mendelian disorders that present in adult age, and that our colleagues in internal medicine (I confess I'm a pediatrician) have to be more alert to the possibility of Mendelian disorders. I just want to make that point by presenting two families that we've seen in the last few years.

The first is a man who was 34 years old, and he presented to Johns Hopkins Hospital about two and a half years ago. He had a ten-day history of pretty high fever and really bad pharyngitis, and he'd been treated by his personal physician with two different antibiotics. He was still febrile, and so the physician, for reasons not known to me, treated him with a large dose of steroids as
well. Following that intervention, now eight days into his illness, the man began to develop confusion, and that led to him being taken to a local emergency room, where the doctors were smart enough to think about hyperammonemia. They measured his ammonia and it was 10 times normal, 280 micromolar. And he had a mild respiratory alkalosis, which in the presence of hyperammonemia suggests a urea cycle disorder, because there's no accumulation of organic acids and ammonia is a stimulant for the central respiratory centers.

So he was rapidly transferred to the Johns Hopkins Hospital medical intensive care unit. By the time he arrived two hours later, he was in the early stages of coma, a CT scan showed mild cerebral edema, and his ammonia had already risen to 420 micromolar. For those of you who are not physicians, he had about one foot and maybe three toes in the grave at this point. The MICU docs did their thing, and one of the things they did is they called Genetics. I happened to be the attending, and I went with one of our residents, Hans Bjornsson, to see this man. We saw him about 20 minutes after he hit the MICU. And like any good geneticists, one of the first questions we asked was: well, what is the family history?
We asked this of a fourth-year medical student who was involved in the case, and he gave what is, I think, the most common response to that question, which is "negative." So I would submit that one important take-home lesson from this lecture is: unless the person is adopted and knows nothing about their family, the family history is never negative. You may have some pertinent negative results that help you eliminate certain things, but the family history always gives you information. When you take the family history, you have to take it and think at the same time, and sometimes as you think, you'll come up with new questions. So you have to be willing to go back and forth with the family as new ideas, new hypotheses for the diagnosis, enter your mind.

So this is the information we got from the medical student. I said, "What do you mean, negative? Go out and ask the family for more detail." The family was assembling in the MICU waiting room, so he went out, and he came back, and it turned out the family was a little bit more extensive. Here's the second version of the family history. What you can see is that the proband, indicated by the red arrow, had a brother who died, and there were also two male twins, fraternal twins, who died. Now, it turns out the twins died in childbirth and almost certainly had something else. As for the brother, the medical student said, "Well, not to worry, the brother died of drowning when he was 14 years old." So what did I say?
So I said, "Well, why did a 14-year-old boy drown? Go back out there and find out." So he went back out there, and the story was that the 14-year-old brother was on an Outward Bound-like experience, and he developed an upper respiratory illness and was sick, and then his campmates reported that he began to be confused. By confused, they noted a time when he couldn't find his hiking boots and they were right in front of him. The night before he died, they all went into their tents to go to bed. They were camped by a lakeside, and in the morning when they woke up, they found him floating off the end of the dock, drowned. So they theorized that he got up in the middle of the night in his confused state, walked out on the end of the dock, fell off, and drowned.

We actually got the autopsy. Because it was an acute death, he had to be autopsied, out in West Virginia someplace, and the local coroner said, "You know, the strange thing about this drowning is that, I mean, the boy clearly drowned, but he had cerebral edema." And that's a phenotype that you never see with drowning, because drowning takes place very quickly.
To cut to the chase, we got a baby tooth of this boy, and he also had the same urea cycle disorder that his brother presented with. So it turns out that this is late-onset ornithine transcarbamylase deficiency. We've studied this molecularly. One of our graduate students, Ted Han, found a promoter variant never before seen, in a four-base, highly conserved element that's important for binding of a particular hepatic, liver-specific transcription factor. Ted actually showed that it reduces the activity of OTC in reporter assays and so forth. So we theorize that this is a promoter mutation, a regulatory mutation, that reduced the function of OTC; that both of these boys had enough OTC activity to get through the early years of their life, but under conditions of severe stress this genetic vulnerability was brought out, and in both cases led to their death.

Now, of course, the geneticists in the room will say, well, it looks like the mother must be a carrier; she's had two affected sons. We tested her and she was a carrier, and then we tested her sister and she was also a carrier. And we wanted to test these two boys, each of whom is at a one-in-two risk of having this phenotype. Both of them are young adults; both of them are underachieving in comparison to their family. This is quite a sophisticated family. They live out in the Midwest, and they both refused to be tested. So I don't know what they have, but I'm suspicious that they might have the same thing, just based on their performance.

So here's a Mendelian disease lurking in an adult patient. The patient is just more vulnerable to particularly severe environmental stress, namely this bad infection, with a dose of steroids perhaps contributing to it.

Now, in case you think that's just a one-off example, a few months later Hilary Vernon, one of my colleagues, was asked to see this man, a 54-year-old man who presented to the cardiology
clinic with severe dilated cardiomyopathy, and they noticed that he seemed to have some features of early-onset dementia. The going theory was that, because his congestive heart failure was so bad, this might just be some low-level chronic CNS insult. But they were worried about his B12 status, and they sent a homocysteine level and a methylmalonic acid level, and both of them were elevated. It turns out that this man had the cobalamin C form of combined methylmalonic acidemia and homocystinuria. He died shortly thereafter of his congestive heart failure. But it turns out his sister also has this phenotype. She's also middle-aged. She's also intellectually not doing as well as you might expect for the family. So here's another late-onset Mendelian disorder. They're out there; you just have to look for them and think about them.

So that's all I'm going to say about the prominence of Mendelian disorders. Now I want to talk briefly about finding the responsible variants and genes. I think probably everybody in the audience knows that human geneticists, since there have been human geneticists, have been interested in finding the genes and variants responsible for Mendelian phenotypes. Archibald Garrod in 1902 reported patients with alkaptonuria and noticed that the distribution of affected individuals within families was entirely consistent with what Gregor Mendel had described in 1865. He hypothesized at that time that maybe alkaptonuria, a disorder of tyrosine degradation, was in fact one of these Mendelian disorders that this monk had described more than 30 years earlier in pea plants.

Then the number of recognized human disorders began to grow, and once we understood that the factors responsible were actually encoded in the DNA, we geneticists began to look for the genes and variants responsible. Typically we used really tedious strategies: collecting large families and doing linkage analysis, or
searching for a chromosomal aberration that pointed to a particular region of the genome where we might find that gene. But things changed with the genome project, as I said, and with the development of next-generation sequencing. If you're interested in this, I refer you to two papers. The one on top in particular is really a seminal paper in this field. This was a paper from colleagues at the University of Washington, most notably Mike Bamshad, Debbie Nickerson, and Jay Shendure. They were working on the development of so-called next-generation sequencing and studying the human genome, and they did a simple experiment, really, but it's very elegant. They said: we're able now to sequence the genome, and particularly the exome, which is about 1.5% of the total genome, the exome being the protein-coding sequences. We're able to sequence that pretty well, and we have this reference. So what would be the chance that, if we get a patient with a particular Mendelian disorder, we could simply do a whole exome sequence and recognize the variant or variants that were responsible for the phenotype?

That sounds like a straightforward hypothesis, but the problem is fundamentally that each of us differs by about three million single nucleotide variants from the reference genome. So you have to find which of those three million variants, usually a single one, is really responsible for the phenotype. Now, if you focus on the exome, you cut that number way down, to maybe 25,000 variants from the reference sequence in someone's exome. But you're still a long way from figuring out what the responsible gene is. So what they did, and I'm not going to dwell on it, is they took a set of patients with a very well characterized Mendelian phenotype, namely Freeman-Sheldon syndrome. The disease gene was already known, MYH3. And they said, let's sequence one patient with Freeman-Sheldon syndrome and see if we can find the variants. They looked only for severe loss-of-function variants: indels, splice site changes, and nonsense mutations. Sequencing that one patient, they had several hundred variants that might be candidates for this particular disease.

So then they said, okay, let's get another, unrelated patient, and we'll do the same thing, and we'll look for genes that are affected in both of these two unrelated individuals. The hypothesis being: since they both have Freeman-Sheldon syndrome, they should have a variant in the same gene. Notice they left out the problem of locus heterogeneity, which would have killed this experiment, but they did very careful clinical phenotyping. So they sequenced the second one, and they looked only for genes that had a loss-of-function variant in both individuals. I forget the exact numbers, but they were down to about a hundred genes at that point. Then they said, well, that looks good, let's do another one. They did another one, and they were down to, I think, something like seven or eight genes. And they did a fourth, and there was only one gene that had loss-of-function variants in all four individuals, and that was MYH3, the gene that they already knew was responsible for this phenotype.

So that said unambiguously that you could use genomic technology, based on next-generation sequencing and what we know about the reference human genome, to find the variants responsible for human disease. You don't have to do big, time-consuming linkage studies or anything like that. You just have to find some well-characterized patients and sequence those patients, either as singletons in their family or, depending on the inheritance mode, you might want to take a few other people from the family and use those Mendelian segregation rules to help you sort through the variants, as well as comparing patients from one family to the next. So that said: okay, guys.
This is a new age. Let's go get them. We did a paper shortly thereafter, which I like to think contributed a little bit to this effort, and that's the reference below. For those of you who are students in the room, I think this is a very illustrative example. We had a speaker at Hopkins, David Goldstein, a great human geneticist, and he was having lunch with the students, as often happens. He said he was working on whole genome sequencing, in this case, and he was looking to see if he could solve an unsolved Mendelian disorder using whole genome sequencing. Now, many of us, myself included, if we were sitting around the lunch table and heard that, would say "great," and then we would forget it a couple of hours later, and that would be the end of it. Fortunately, Nara Sobreira, an author on this paper, who was at that time a human genetics graduate student, is quite persistent, and two days later she called up David Goldstein. She said she had a family, and she sent him the DNA. The family was provided by Julie Hoover-Fong, one of my clinical colleagues, and the family had something called metachondromatosis. David did the whole genome sequence in about two and a half weeks, actually.

Now, if you do a whole genome sequence, as I said, you're going to find three million single nucleotide variants compared to the reference sequence, and you'll find some structural variants as well. So we said, wow, this is really a difficult problem. What can we do to help us? The only reason I mention this paper is because we then went back to genetics. The first paper is all genomics; we used genetics. This family was not big enough to do convincing linkage analysis, that is, to find a region of the genome that unambiguously harbored the responsible variant. But recall that linkage is actually very powerful at eliminating regions of the genome that can't possibly contain the causative gene.
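For readers who think in code, the intersection strategy from the Freeman-Sheldon experiment described above can be sketched in a few lines. This is only a toy illustration under stated assumptions, not the authors' actual pipeline: the function name `shared_candidate_genes` and the placeholder genes `GENE_A` through `GENE_F` are invented for the example, and only MYH3 comes from the talk. Each patient is reduced to the set of genes in which a qualifying loss-of-function variant was seen, and the candidates are the genes common to every patient.

```python
def shared_candidate_genes(patients):
    """Return the genes that carry a qualifying variant in every patient.

    `patients` is a sequence of iterables of gene symbols, one per
    unrelated affected individual (hypothetical input format).
    """
    gene_sets = [set(genes) for genes in patients]
    if not gene_sets:
        return set()
    return set.intersection(*gene_sets)


# Made-up gene lists; real exomes yield hundreds of candidates per patient.
patients = [
    {"MYH3", "GENE_A", "GENE_B", "GENE_C"},
    {"MYH3", "GENE_A", "GENE_B", "GENE_D"},
    {"MYH3", "GENE_A", "GENE_E"},
    {"MYH3", "GENE_F"},
]

# Add one patient at a time and watch the candidate list shrink.
for n in range(1, len(patients) + 1):
    print(n, sorted(shared_candidate_genes(patients[:n])))
```

The sketch also makes the caveat about locus heterogeneity concrete: if one of the four patients had a causal variant in a different gene, the intersection would come back empty.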
So we did some quick SNP linkage panels on a few other family members, very cheap compared to whole genome sequencing, and we quickly found six regions of the genome that could potentially harbor the responsible gene. So it might seem we hadn't narrowed it down dramatically, but actually those six regions comprised only two percent of the whole genome. So we had eliminated 98% of the genome using that simple genetic trick. I think of this as combining genomics with classical genetics. And sure enough, under the second linkage peak that we looked at, there was the responsible gene with an unambiguous loss-of-function mutation, and we were able to find another family with the same phenotype that had a nonsense mutation in the same gene, PTPN11. So, QED. That whole exercise took about six weeks, and at that time that was going pretty fast. So genomics, and particularly genomics combined with genetics, offers powerful tools to get at these disorders and these genes.

Okay, so that was a few years ago. With that sort of stimulus, and because of all the other reasons that I've already enumerated, one of the things that's been going on in the last few years is what I call the rise of clinical DNA sequencing. Those of you who see patients know that increasingly it's possible to use molecular diagnostic tools to search for a precise molecular diagnosis in your patient. I just want to review that, because I find that people have not really thought through all of the approaches and what they mean. I organize clinical sequencing by target. The first is a very focused search, and that's a single disease gene: think BRCA1. So you have a patient who, let's say, has breast cancer, maybe a positive family history, and you want to find out if that patient has breast cancer because they have a pathogenic variant in BRCA1.
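Before moving on, the linkage trick from the metachondromatosis story above amounts to an interval filter: keep only the variants that fall inside regions linkage has not excluded. The sketch below is a hedged illustration, not the method used in the paper. The `Region` and `Variant` types, the function names, the toy genome length, and every coordinate are assumptions made up for the example; only the gene name PTPN11 and the roughly 2% figure come from the talk.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    chrom: str
    start: int  # inclusive
    end: int    # inclusive


@dataclass(frozen=True)
class Variant:
    chrom: str
    pos: int
    gene: str


def in_any_region(variant, regions):
    """True if the variant lies inside at least one retained region."""
    return any(
        r.chrom == variant.chrom and r.start <= variant.pos <= r.end
        for r in regions
    )


def filter_by_linkage(variants, regions):
    """Drop variants in parts of the genome that linkage has excluded."""
    return [v for v in variants if in_any_region(v, regions)]


def fraction_retained(regions, genome_length):
    """Fraction of the genome still under consideration."""
    return sum(r.end - r.start + 1 for r in regions) / genome_length


# Toy numbers only: a 1,000,000-base "genome" with 20,000 bases retained.
TOY_GENOME_LENGTH = 1_000_000
regions = [
    Region("chr12", 100_001, 110_000),
    Region("chr3", 500_001, 510_000),
]
variants = [
    Variant("chr12", 105_000, "PTPN11"),  # inside a retained region
    Variant("chr12", 200_000, "GENE_X"),  # right chromosome, excluded region
    Variant("chr7", 105_000, "GENE_Y"),   # chromosome fully excluded
]

print(fraction_retained(regions, TOY_GENOME_LENGTH))
print([v.gene for v in filter_by_linkage(variants, regions)])
```

The design point is the one made in the talk: a small family cannot pinpoint the gene by linkage alone, but even a coarse exclusion map removes most of the variants before any of them need to be interpreted.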
So you look at that single gene, one of 20,000 genes. Now, a second strategy is what's come to be called a disease gene panel. I mentioned the cardiomyopathy patient. We know of on the order of 25 to 30 genes such that, when certain variants occur in those genes, the patient will present, at different age ranges, with dilated cardiomyopathy. So there are several panels to which one can send such a patient's DNA sample and get it tested for all of those 25 or 30 genes. It's a collection of genes, each known to be responsible for a particular disease, and you're asking which of these genes, if any, is responsible for my patient's problem.

Then there's whole exome sequencing. I've already referred to this: sequencing the entire exome together with the splice sites flanking each exon. By back-of-the-envelope calculations, which I believe have withstood the test of time, we estimated early on that about 85 percent of Mendelian variants would be found in the exome and in the flanking intronic splice sites. I won't go into how that comes about, but there's actually fairly good evidence for it. So you essentially only have to sequence one and a half percent of the genome, but you have a very high chance of finding the genes that are responsible for your patient's problem.

And then there's whole genome sequencing, which I also mentioned: sequencing the entire genome, exons, introns, regulatory sequences. You know, 1.5 percent of the genome is exome. If you look at what fraction of the genome is highly conserved evolutionarily, it's about five to ten percent, maybe seven percent. That means that evolution seems to really care about seven percent of the genome. So you're still sequencing a lot of the genome that perhaps is not really very important when you do a whole genome sequence, and obviously we're much, much less sophisticated at interpreting the results of variants that we discover in the non-coding part of the genome as compared to the coding
part of the genome.

So let me digress briefly to make sure everyone's clear on the difference between clinical and research whole exome sequencing. With research whole exome sequencing, typically you have a clinical diagnosis but you don't know what gene is responsible, and you want to find the gene that's responsible for this particular phenotype. So you typically sequence multiple members of a family, maybe two affected and one unaffected, or maybe the proband and the two parents, depending on the inheritance model and what samples are available. Speed is typically not that critical, so it may take months. It surveys all 20,000 or 21,000 protein-coding genes. It requires validation once you find some candidate genes and variants, and we do that validation by segregation within the family, to the extent that we have family members and depending on what we think the inheritance pattern is, and by functional studies of the candidate genes to make sure that the variants do what we think they do.

Then there's clinical whole exome sequencing, and there are now a number of commercial companies that provide this service. Typically a physician sees a patient in a clinic and doesn't know the diagnosis and says, "Rather than spend a lot of time doing various classical workups, I'm just going to send the clinical whole exome test and see what it tells me." You typically send the patient's sample, and for some of the commercial operations you also send the parents or some other family member. They may do the proband's clinical whole exome, and then if they find something that looks interesting, they may look at that particular variant in other members of the family, or they may not. And this is the key point: for most clinical whole exome services, the genes they look at are the known disease genes.
So in other words, right now, as I'll show you later, there are about 3,500 known disease genes out of the 20,000 in your genome. Although they sequence the whole exome, they focus on those known disease genes. So in that sense, the efforts of the Mendelian centers and other investigators doing research, finding and validating disease genes through research whole exomes, provide the knowledge for the commercial clinical services to offer their services. I didn't make a slide of it, but I was looking at a company in Europe last night on their website, and you open it up and it says "clinical whole exome sequencing." So I read that and I think, okay, they're surveying 20,000 genes. But then they say, "We give you results on 2,800 genes." So they're really only giving you results on maybe 10 or 15 percent of the genes in the genome. Eventually, as the research progresses, they'll give a higher and higher fraction, but that's the relationship between research whole exome sequencing and clinical whole exome sequencing.

Now, two other things that you have to be clear on when you order these kinds of tests. There are some unanticipated or, if you think about it,
actually anticipated consequences of large-scale DNA sequencing. The first is that our state of knowledge right now is imperfect. So if you cast a broad enough net, you will absolutely find variants that you're not sure how to interpret. They've come to be called variants of unknown significance, or VUSs, and I'll tell you how many you find in a minute. Then you also may find incidental findings of great medical consequence. So you may be, let's say, doing a clinical whole exome on a child who has some developmental defect, and you want to find the gene that is responsible for the developmental defect. You do that whole exome sequencing and you find a well-known pathogenic variant in BRCA1. Now, when the family gave permission to do that test, they were not thinking about BRCA1, and that variant almost certainly had nothing to do with why the exome was ordered. But now you have a piece of information that may be very relevant to that individual's long-term medical care, and that variant may also be in other members of the family. So you've found out some information not only about your patient but also about family members. The best way to deal with this possibility is to discuss it with the family before you send the test, so that everybody has their eyes wide open about what you're doing.

Medicine has always picked up incidental findings. You think the patient is anemic, you send a CBC, and you discover that they have leukemia or something like that. But what's different about these findings is that they may predict illness way down the road that has absolutely nothing to do with why you ordered the test, and they also may provide information that's relevant to other family members who are not even your patients.

Okay, so let's look back at those four classes of sequencing approaches. Starting on the first row up here: single gene testing,
BRCA1, already mentioned. The cost is several hundred dollars to a few thousand, actually a few thousand. It's relatively inexpensive if you're correct; that is, maybe you spend $2,500, but you get the answer. You have fewer variants of unknown significance, because you're only looking at the variants in a particular gene, and very often those genes have been pretty well studied. So you'll find occasional but relatively small numbers of variants of unknown significance, and no incidental findings, because you're only thinking about this particular gene.

The second category is some sort of disease gene panel. I mentioned cardiomyopathy: maybe 25 genes, depending on when you did the test; the number's going up. Cost is quite similar, actually, several hundred to a few thousand dollars. It's a broader net and it's less expensive on a per-gene basis, but you will find more variants of unknown significance. You won't find incidental findings, because you're really just looking at the cardiomyopathy genes. You're not looking beyond that.

Now what about a whole exome sequence, a so-called clinical whole exome sequence? Currently you can get them for around $5,000. It's a much broader net, a bargain on a per-gene basis, right? It's great. But you will absolutely find many variants of unknown significance, so you'll need to counsel the family about those variants, or you will have to build in some approach, agreed beforehand, to set those aside. And you'll find incidental findings. I think most groups are now reporting on the so-called 56 American College of Medical Genetics genes, where a panel of experts decided that there were reportable and actionable incidental findings. And you say, how often do you find variants in those 56 genes that seem to be significant?
Most people who are doing a lot of whole exome sequencing are finding that on the order of one to three percent of the people they sequence will have incidental findings in that small number of 56 very solidly known disease genes.

And then a whole genome sequence: largely a research tool at this time, but several companies are beginning to suggest it. It's more expensive. It's a broader net still; it's the broadest net we can currently cast, although RNA-seq will be coming down the pike. And it's much, much harder to interpret. You will find variants of unknown significance and incidental findings galore. So one take-home message is that if you're going to use this outside of the research setting, you should, we think, build in a good bit of genetic counseling time to explain all this to the subjects who have it.

As I've indicated, clinical sequencing, particularly gene panels and clinical whole exomes, is a growing field. So what has been the outcome? We're beginning to see publications now that look at what the consequence has been. The first publication of any size, I think, was from Baylor College of Medicine, which very quickly opened a commercial lab associated with their genetics group to provide clinical whole exome sequencing. They reported in this reference on the first 2,000 samples they did. 88% were in the pediatric age range. They made a molecular diagnosis in 25% of these patients.
So that's a pretty good return rate on a diagnostic test, 25%. And interestingly, 58% of the diagnostic mutations had not previously been reported. That is to say, they found, for example, a loss-of-function allele in a gene that was known to cause a phenotype when it had loss of function, and so this is just a new loss-of-function variant in a known disease gene. The frequencies of the various inheritance patterns are shown there for the solved cases. A key thing is that 30% of the diagnoses involved a disease gene that was identified in the last three years. So this gets back to the research community, particularly research whole exomes, pumping in new disease genes, and those new disease genes then add to the list of genes that the clinical WES can interpret accurately. So it's really going up like a rocket right now.

One interesting feature, which has been found over and over again now, is that 23 of the patients for whom they got an answer, or 4.6%, actually had what they called a blended phenotype from two different Mendelian disorders. In medicine we're taught a sort of Occam's razor approach: you're always trying to find a diagnosis that will explain everything about your patient. So one of the reasons that these patients were difficult to diagnose is that they actually had two rare diseases in one, and the phenotype had features of both of these disorders, and so clinical geneticists were not able to recognize what it was. So, very interesting.

Now GeneDx, another private laboratory service, here in Rockville, Maryland, does excellent work. Very shortly thereafter they reported 3,040 consecutive probands, nearly all in the pediatric age range.
They made a molecular diagnosis in 851, or 28.8%, roughly the same as the Baylor lab had found. And again, 28 of the patients, or 3.3%, had two or three Mendelian disorders. This graph, which I won't say much about, shows the test yield, in terms of percentage of positive results, by the particular systems that were involved. The highest-yield system is hearing loss; isolated hearing loss is already known to have a huge contribution of genetic causation. So those two studies are largely pediatric. Baylor recently reported on 486 consecutive adult patients, 18 or older, and they made a molecular diagnosis a little bit less often in this older group, 17.5%, and they found 6 or 7% with two disorders. This graph shows the diagnostic rate against the age of the patient in years: the older the patient, the less chance they had of finding a straightforward Mendelian disorder. And this is a plot much like the GeneDx plot; it shows the success rate by indication and the overall diagnostic rate of 17.5%. So even in an adult population, at least in young and middle-aged adults suspected of having a Mendelian disease, this turns out to be a very high-yield diagnostic service. Now, for those of you in the room who are not physicians, let me just emphasize some of the value of having a precise diagnosis. Physicians are trying to explain the phenotype, the clinical problem, of their patients, so they will continue a diagnostic workup until they get the answer. A diagnosis stops that workup; it shortcuts it. It ends the uncertainty of the diagnostic odyssey. That is the term that's been given to families or patients who keep coming back to medical attention, trying over and over again to find out what in the world their problem is. It turns out that, whether you have a child with a problem or you yourself have a problem, this is true for most affected individuals:
There's a strong urge to find out exactly what you have, and to know that when you go to your doctor and say, I've got this problem and that problem, you're not crazy; you actually have some problem. And it provides a biological explanation for the problem. Over and over again, as those of you who have been to a genetics clinic know, if you talk to parents who have a child with some genetic disorder, the parents will say things like: well, you know, three months into this pregnancy I fell on the ice, I took a bad fall, and I always thought that the reason my baby had this problem was because I fell down. And you say, no, actually, this is a straightforward genetic disease; the fact that you fell down, or that you had a glass of wine, or that you had a cold or something like that, is irrelevant to this problem. It focuses the patient management. Now you know what you're dealing with, so you can draw from experience with other people with that problem. And it informs the family of the recurrence risk. In other words, if it's a recessive disorder, they have a one in four, 25%, chance of having another affected child. And I've had the unfortunate experience; I remember a case of Hurler syndrome, which is a very high-burden lysosomal storage disease. The patient was referred relatively late, at about 18 months old. The family came in with this 18-month-old boy who, from down the hall, you could tell had Hurler syndrome. But they had a three-month-old child sitting on the mother's knee, and I could tell that the three-month-old probably also had the disorder. Both of those kids have now died. But if the diagnosis had been made quickly and the family informed, then they would not have had to go through six or eight years of very high-burden chronic illness with those two kids.
So that's a big benefit. I'll give you this one example. This is a patient that I've been following for 36 years. He's 39 right now; in fact, I'm scheduled to see him two weeks from now. He had recurrent episodes of lactic acidosis from early childhood. He had diminished intellectual function for his family, with an IQ of 65, and cortical atrophy on his CNS imaging studies. He had mild to moderate cardiomyopathy, and he had prominent dysfunction of his autonomic nervous system: constipation, postural hypotension, and other such things. He would come in with these recurrent episodes of lactic acidosis, and we would say over and over again: something is wrong with the function of your mitochondria; this is some sort of mitochondrial dysfunction, but we're not sure what it is. Several years ago we were finally able to get money together to sequence his mitochondrial genome. I told the mother that if his problem was mitochondrial, as I suspected, it could be either in the mitochondrial genome or in the nuclear genome; at least we could check out the mitochondrial genome. And it turned out to be normal. So I had to go back to her and say: well, the mitochondrial genome is normal, so I'm thinking it's probably a gene in the nuclear genome that encodes a mitochondrial protein. At that point it was out of the question, not only for that family but just in general, to do the sequencing.
Let's say, a whole exome. But eventually, about two years ago, she got the financial resources and insurance to actually pay for a clinical whole exome sequence, and he has a homozygous nonsense mutation in a gene called FBXL4. Never heard of it until the test was done. But it causes a mitochondrial DNA depletion syndrome previously described in three or four other patients. The encoded protein is necessary for proper replication of mitochondrial DNA, and so if you lack that protein, your mitochondria don't have as many mitochondrial genomes as they should. The end result is that your mitochondria don't work well. So I actually had the pleasure of telling the mother that after 36 years I finally had a diagnosis. The mother was incredibly relieved to know exactly what this is. I said, you know, there's really nothing I can do about this; it's not that it's going to lead to a better treatment. Maybe down the road it will, but not right now. But at least we know, and the relief of just having the knowledge of exactly what the etiology of this boy's problem was was palpable for this woman. Really amazing. Okay. So then I want to turn to one other publication about clinical whole exome sequencing, which just came out. It's a prospective evaluation of whole exome sequencing as a first-tier molecular test in infants with suspected monogenic disorders. It's from the Murdoch Institute in Australia; that's the first author in the reference. They made some thoughtful modifications rather than doing the sort of shotgun clinical whole exome diagnostic testing. They considered using this test in 119 unrelated infants who met a set of criteria.
They had a well-defined phenotype, some of them had a positive family history, and so forth. Of those 119 families, 80 agreed to participate. They did a single clinical whole exome sequence, that is, they didn't sequence any other family members, and in that clinical exome they examined 2,830 of the 20,000 genes. And to get rid of the problem of late-onset incidental findings, they said: we're not going to look at certain genes that have those incidental findings. So they excluded 122 genes; they didn't analyze those genes. Of the 80 infants that were sequenced, 46, or 57 percent, yielded a precise molecular diagnosis. And of the 46, 32 percent had a significant management change based on this new diagnostic information. So it turned out to be of quite important medical significance to about a third of the patients, at this point followed for a few months or a year. Additionally, 28 of the 80 couples that participated received a high recurrence risk, either 25 or 50 percent, so they could use that information to avoid the scenario that I discussed earlier. So it will be interesting to follow these studies now and to ask what this means for the medical economic issues. With this initial investment in a rather expensive test, does it not only improve the medical care, but does it reduce medical cost as the families go forward? I suspect strongly that it will, but some medical economics experts need to look at this in detail so that we can get these data.
We desperately need those data. Okay, so that's all I'm going to say about clinical sequencing, its value, and the aspects that need to be managed carefully if you're going to use it in your clinic or with your patients. But the growth of the ability to detect Mendelian disease genes, and the value of detecting them, led the Genome Institute to issue an RFA to develop Centers for Mendelian Genomics that would use the technologies I've talked about, genomics and genetics, to try to find as many genes responsible for Mendelian disorders as possible. In the initial four-year funding period, three centers were funded: U-Dub, the University of Washington in Seattle, with Debbie Nickerson and Mike Bamshad as the PIs; Yale, with the PI Rick Lifton; and we partnered with Baylor College of Medicine to form what we call the Baylor Hopkins Center for Mendelian Genomics. I'm a PI along with Jim Lupski down at Baylor, so it's a real team effort. And there's our website, mendeliangenomics.org. You'll see me refer to it as BHCMG, Baylor Hopkins Center for Mendelian Genomics. We just started our second four-year funding period, and I was just at a meeting Monday and Tuesday of this week where we were tooling up again for the next four-year run at this. So it's interesting to ask, well, what is the current state of the art? We keep track of how many Mendelian disease genes have been identified by using the data in Online Mendelian Inheritance in Man, or OMIM, which was started by my colleague, now deceased, Victor McKusick, and is currently managed by my colleague Ada Hamosh at Hopkins. Currently OMIM, as of late last night, lists about 7,500 Mendelian phenotypes. It lists 3,543 disease genes; that's about 18% of the total. You'll notice that the number of phenotypes is greater than the number of disease genes.
That's because, well, the next column is explained phenotypes, 5,722. That number is bigger than 3,543, and that's because some disease genes cause two, or sometimes more, discrete phenotypes that clinically we would never have imagined were caused by mutations in the same gene. There are some genes, lamin A for example, that account for 13 or 14 discrete clinical phenotypes. So the average is about 1.8 phenotypes per disease gene right now. And there are still 1,800 unexplained phenotypes in OMIM, and you have to realize that there are new phenotypes coming into OMIM all the time; they come in at a rate of about 300 new phenotypes per year. So there's lots of Mendelian disease out there that we have not yet recognized as being Mendelian disease, or that we've not yet given a name or an OMIM number. So 18% of the total number of genes in the genome have been tagged as Mendelian disease genes. So we have a long way to go, depending on your view of how many genes in the genome can cause a Mendelian phenotype. So let me talk about that for a minute. How many Mendelian disease genes are there in our genome, and how close is that 3,500 to saturation? First of all, how would I define a Mendelian disease gene? I would define it as a gene in which some fraction of the variants produce highly penetrant phenotypes. That's sort of genetics-speak. Penetrance means that you manifest a phenotype when you have the genetic variant. And if I ask my colleagues how high the penetrance has to be to call something a Mendelian disease, there's no unanimity. So I arbitrarily set the penetrance level at 0.7. That means that if you have the genetic variant, your chance of getting the phenotype is 70 percent or better.
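As a rough illustration of the penetrance cutoff just described, here is a minimal Python sketch. The function names and the carrier counts are hypothetical, made up for the example; only the 0.7 threshold comes from the talk, and the speaker describes that threshold as arbitrary.

```python
from fractions import Fraction

# The speaker's self-described arbitrary cutoff for "highly penetrant".
MENDELIAN_THRESHOLD = Fraction(7, 10)


def penetrance(affected_carriers: int, total_carriers: int) -> Fraction:
    """Fraction of variant carriers who actually show the phenotype."""
    if total_carriers <= 0:
        raise ValueError("need at least one carrier")
    if not 0 <= affected_carriers <= total_carriers:
        raise ValueError("affected carriers must be between 0 and total")
    return Fraction(affected_carriers, total_carriers)


def meets_cutoff(affected_carriers: int, total_carriers: int,
                 threshold: Fraction = MENDELIAN_THRESHOLD) -> bool:
    """True if the observed penetrance is at or above the cutoff."""
    return penetrance(affected_carriers, total_carriers) >= threshold


# Hypothetical counts, for illustration only.
print(meets_cutoff(70, 100))  # 70 of 100 carriers affected
print(meets_cutoff(30, 100))  # 30 of 100 carriers affected
```

Carriers who do not show the phenotype (30 of 100 in the first hypothetical case) are the non-penetrant individuals.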
For example, the standard disease variants in BRCA1: many of them have penetrances in the range of 70%. So that means you're highly likely to get the phenotype, but it also means that some people won't, and the ones that don't get the phenotype geneticists refer to as non-penetrant. We'll talk more about that in a minute. So with that background, you say, well, one way I might be able to get at how many Mendelian disease genes there are is to count the phenotypes. Well, it turns out it's a lot harder to count phenotypes than it is to count genes. As I said, OMIM currently lists 7,500, with about 1.8 phenotypes per disease gene and 1,800 unexplained phenotypes, so that predicts maybe 900 more disease genes, a pretty small number actually. But we know that many phenotypes are conditional and dependent on environmental variables. Think of G6PD deficiency: people with G6PD deficiency are typically entirely asymptomatic unless they happen to chow down on a plate of fava beans, in which case they will have massive hemolysis and become jaundiced and perhaps severely anemic. We all think of G6PD deficiency as a Mendelian disease, but if you avoid all of the environmental triggers that cause that hemolysis, you'll never know that you have that Mendelian phenotype. There are many other phenotypes of this nature. So the point is that to define all Mendelian disease genes, and all the variants in those genes that cause Mendelian disease, you have to sort of challenge the population with a variety of environmental triggers to see what brings out the clinical phenotype. Easy to do in a mouse, a little bit harder to do in a person. And the other thing is that there are a vast number of unrecognized phenotypes. Remember I said that 300 new phenotypes come into OMIM each year. Obviously they're not new phenotypes.
They've been there all along. We're just recognizing them and getting them into medical attention. And there are vast swaths of the population of Homo sapiens around the world that don't even get access to this kind of service. I recently visited the Middle East, and my host showed me just one family after another that had genetic conditions I had never seen before but that, based on the segregation in the family, were clearly Mendelian. So they're just waiting to be explained. So there are sort of two schools of thought about how many Mendelian disease genes are in the genome. Here's one that says that the number of genes in the genome that, when they have a certain variant, could cause a highly penetrant phenotype is substantial but limited; let's say, arbitrarily, I've put it here at 30 percent. Now there's another school of thought that says that actually, if you look carefully enough and across the entire population of Homo sapiens, you'll find that a large fraction, 90 percent or more, of genes can produce a Mendelian phenotype when they have a particular class of variants. And the answer to this question is not known. Okay. Obviously, I guess you would predict this.
I'm greatly in favor of the red curve. I think if we look carefully enough and long enough, we'll find Mendelian phenotypes for almost every gene in the genome. Now let me just give you a couple of reasons why I think that's true. The biggest one is evolutionary thinking: if those genes were not important for something, evolution would get rid of them. Right? There's a constant mutation rate. All DNA segments in the genome accumulate mutations, and if a mutation occurs in a gene with an important function, then selection eliminates it. So the set of genes that we have right now has stood the test of time by evolutionary guidelines. It's true that some of them may have been more valuable in earlier socioeconomic and cultural conditions of our species, but nevertheless the vast majority of them are there because evolution cares about them. So then you could just ask: well, okay, Valle, if you think the fraction of Mendelian genes is so large, why are they so difficult to identify? The first answer that most people give is: well, maybe a substantial fraction of the genes in our genome are so important that when there is significant variation in those genes, it leads to early-onset developmental lethals, and so those fetuses are only known in terms of spontaneous abortions. And it is true that there is a fraction of the genes in our genome that are very, very highly conserved. By very highly conserved I mean that the nucleotide sequence and the amino acid sequence of the encoded protein are highly conserved, and that suggests that they are intolerant to variation. And we know that our species has a high frequency of spontaneous first-trimester abortions. A large fraction of those are chromosomal abnormalities, but there are other spontaneous first-trimester abortions in which the karyotype appears normal. So what's going on there?
So how many of those might be Mendelian disorders that affect some gene that is absolutely important for early embryonic development? That remains to be seen. We also know, and this statistic is often used, that 30 percent of the genes in the mouse genome, when made homozygous for a null allele, a true knockout, lead to fetal losses, either perinatal or earlier, in embryos. So that says that indeed a fraction of the mouse genes, 30 percent, are absolutely necessary for normal development. So then the logic is: well, you won't see those genes causing medical problems in later life. I would argue that's not the case, because we know that for every gene we've looked at there's a spectrum of mutational events, from those mutations that cause a complete loss of function, to those that moderately decrease function, to those that only mildly decrease function. So somewhere in that spectrum of functional consequence there will be some alleles for these very genes that only reduce the function of the protein product by some fraction. That allows for successful, or at least viable, in utero development, and the condition then will make itself known.
Either in infancy or later in life, depending on the biology of that gene and the severity of that mutation. So, you know, we used to say, for example, that Rett syndrome is only seen in females, that it's an X-linked gene, and that it must be a developmental lethal for males. But once the gene was cloned, we did find a small number of males who survive embryonic development and have mutations in MECP2. So those are variants that are hypomorphs that make it to extrauterine life. So I think that this mouse knockout experience, and the human knockout experience, will show us genes that are really critical, but it doesn't mean that they won't present with Mendelian disease, depending on the allele. Now, another reason that Mendelian disease genes are difficult to identify is that our phenotyping is incomplete and/or insensitive. Phenotyping in humans is largely a standard medical exam, and then if it's a research project we may do some other kind of fancier testing. But often that phenotyping is basically routine phenotyping, or what I would call uninformed phenotyping: you're not thinking about a particular system when you do it. It would be better to have directed phenotyping, where we're thinking about particular biological systems that might account for this patient's problem, or, even better or in addition, iterative phenotyping, where we go back to the patient over and over again as we learn more about their condition and look for more subtle abnormalities. Another problem with phenotyping is that we have technological limitations.
We only measure certain things, and there are big, whole biological systems of unequivocal importance that we don't really measure. The one I like to think of is protein turnover by ubiquitination. If you're in the clinic, you can't send a ubiquitin level. You don't look at ubiquitinated proteins. We don't really assess the protein turnover pathways. We know now from whole genome or whole exome sequencing or other genomic approaches that mutations in those pathways do cause disease, for example certain forms of Parkinsonism. So in contrast to serum sodium or liver enzymes, we just don't measure that biological system very well. And if we're not measuring it, we're not going to find those phenotypes, except when we come at it through the genomic approach. And then there's the thing I've already mentioned, the conditional nature of some phenotypes. A person can be apparently completely normal and then, when exposed to a particular environmental stress, like the man I described at the beginning of this talk, the phenotype becomes apparent. And the last explanation is that the protein products of genes don't work in isolation. They work in complex biological systems, and those systems have evolved to have buffering, that is, the ability to maintain homeostasis when perturbed by either environmental variables or genetic variables. And much of that buffering and robustness, or some of it at least, comes from redundancy of biological systems.
So you have two biological systems, and they do much the same thing. Now, I would submit that if you look carefully, most cases of redundancy are what I would call incomplete redundancy. They sort of cross-cover one another, but you can find conditions where only one of the two systems really handles it, and other conditions where the other system handles it. So that means that there will be times when you find the phenotypes there. I already alluded to this mouse experience: about 30% are lethals. But a large fraction of, or nearly all, viable mice do have phenotypic features. And the point is that these mouse knockouts are essentially almost all, 100%, null alleles. We don't know much about other model systems. The spectrum of phenotypic consequences depending on the allele and the genotype is well exemplified by mutations in a gene called LBR. With homozygous loss of function you get a developmental lethal, basically. If you're homozygous for only moderate loss-of-function alleles, you may have a skeletal dysplasia but live a full lifespan. And for heterozygotes for certain alleles, all you have is an abnormality in the morphology of the nucleus of polymorphonuclear leukocytes, called the Pelger-Huët anomaly. So a whole span of phenotypic severity, all due to different mutations at that particular locus. So if we want to find all the Mendelian genes, we have to figure out ways of casting a wide net: lots of people, lots of phenotyping, and looking carefully and rigorously. Now, another system that is undoubtedly important but that we don't phenotype at all is the olfactory system. We have about a thousand olfactory receptor genes in our genome. We all know that some people have an exquisitely sensitive ability to smell and other people can't smell anything. The men in the room have probably been told by their wife:
"Can't you smell that?" And you say, "I can't smell that." It turns out that the olfactory receptor collection is highly polymorphic, and if you study people's olfactory capabilities, you find wide, wide, wide variation. We just don't phenotype it. We sort of think, well, it's really not that important, right? But it's actually been shown to influence mate selection. It influences certain things that you do in your life. And if you look at species other than us, it's critically important. For example, blind mice function perfectly well, but mice can't function if they have no olfactory ability. So they live in an olfactory world rather than a visual world. There are a few Mendelian disorders of olfaction that have been discovered, but largely it's a whole swath of the genome that we don't pay any attention to. And I'll just emphasize this point about conditional phenotypes with one disorder; I already mentioned G6PD deficiency. Here's a boy who presented with seizures, hypoglycemia, and hyperammonemia 36 hours into an episode of viral gastroenteritis when he was 18 months old. He ultimately turned out to have something called medium-chain acyl-CoA dehydrogenase deficiency. We actually screen for it now in neonates.
He was born before the screening, in a state where the screening program was not in place. The point of importance here, though, is that this is an inborn error in the beta-oxidation of fats, and it only comes to medical attention when you put stress on the beta-oxidation pathway. Typically, for children, that happens when they get their first bout of viral gastroenteritis. What happens? The baby doesn't feel good and doesn't eat well, and the parents put the baby to bed without much supper. About four o'clock in the morning, now having fasted for 14 hours, the longest the kid has fasted in their entire life, they wake up seizing, hyperammonemic, and in metabolic crisis. Before the screening program, for 25% of the children with this disorder that first episode was fatal. If you simply make the diagnosis and avoid fasting, in other words you avoid stressing the beta-oxidation system, these people do fine. He's not had any difficulty since the diagnosis was made, and he's now a young man with two children of his own. Of course, once we made the diagnosis we checked his siblings, and it turns out his older brother also has MCAD deficiency and had had one unexplained, nearly fatal illness in childhood. It was called anicteric hepatitis, but it was an episode of his MCAD problem. So you only see it when the environment leads to stress on this system. We're going to learn a lot about this, I think, from the Undiagnosed Diseases Network project that really got started here through the work of Bill Gahl, and in mice from the Knockout Mouse Project, so-called KOMP. And it shows the tremendous value of education: it's difficult to treat or correct the genetic disorder, but simple education of the patient, the family, and the primary care physician can really make the difference between life and death in this disorder. And there are other disorders like it. And then the buffering and
robustness really goes back to this man we heard about at the beginning of the lecture. He was able to get by through the robustness of the biological systems of waste nitrogen excretion. It turns out that the urea cycle has tremendous buffering capacity, so that an 80% reduction in OTC function probably leaves you, under most conditions, just fine. It's only when you have tremendous periods of protein breakdown, as he did, stimulated by his illness, that you overwhelm that system. So, how many Mendelian disease genes? My hypothesis is that if you look carefully and across a large population, nearly all the genes in our genome will have a Mendelian phenotype. Not all of my colleagues agree with this, so we'll see. Now let's come back to the Centers for Mendelian Genomics, which are tasked with finding all of these Mendelian disease genes. The overall strategy is to find well-phenotyped cases and families; perform whole exome sequencing on relevant family members; and use family relationships, allele frequency data, functional predictions, model organism results, and functional studies to identify the responsible genes and variants. It's really a lot of fun, very interesting, very exciting when you get a hit. Some things you solve right away and everyone jumps for joy, and other things you plug away at for years and don't solve. In the case of the Centers for Mendelian Genomics, we have an online web tool through which anybody in the world can submit their cases, and once we analyze the sequence and get an answer, if we do get an answer, we give the information back to the submitter and ask them to write the paper. So they get all that for free. You can get that service at that website. So if I'm looking for all the Mendelian disease genes, I liken it to this: I think of the world, or the entire population of Homo sapiens, as our sort of petri dish, if you will. And we're looking around the world to try to find those families and those individuals who have
very rare disorders that represent the phenotype of a particular genetic mutation. So this is a sort of ultimate genotype-to-phenotype connection. And we're doing pretty well on that. These are the Baylor Hopkins CMG data as of April 2016. We've got 9,000 consented samples, and we've got samples from 29 countries around the world. There are still big swaths of the population of Homo sapiens, namely India and China, that we're really not getting much access to. We've developed, as I said, a web-based tool to make it easier for a healthcare professional anywhere in the world to submit a candidate case or family or cohort. These papers describe that tool. It's called PhenoDB, and you can look at PhenoDB; it's free. You just go in, you register with your name and your email and so forth, and then if you have a family or a cohort or whatever, you can enter them in there, as long as they're consented appropriately and so forth. And then we have a committee that meets every two weeks and looks at the submissions. PhenoDB takes all the clinical data in a very ordered fashion, so we can quickly review the cases and ask: is this a good family to carry through to the sequencing and analysis part of the effort? This is the home page for PhenoDB. We have users from many, many different countries around the world. The Baylor Hopkins instance of PhenoDB:
We have data on more than 4,000 projects in there, including 53 cohorts ranging from 5 to 295 subjects, and phenotypic data on more than 10,000 individuals. We have whole exome sequence data on more than 6,000 samples. And there's an analysis tool in PhenoDB for analyzing the results of your whole exome sequencing. So it's very convenient, because you can go from the analysis back to the phenotype, back and forth. And we're continually improving it. And the "we" here is largely Nada Sobreira, the same woman who took David Goldstein up on his offer to do a whole genome sequence; Ada Hamosh, who's the director of OMIM; and a really accomplished programmer named François Schiettecatte. When you're analyzing your data, this is the starting page. You'll see here, in this case, you have a proband, and you're going to look at the sequence of his or her parents. So you start with three ANNOVAR files. You've selected those. Now at this stage you're putting down the inheritance model under which you want to analyze the data. And you pick filters: the frequency of the alleles that you expect, since you want to eliminate common variants that are present in these databases.
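To make the filtering step concrete, here is a minimal Python sketch of how trio genotypes plus a population allele-frequency cutoff can narrow a variant list under different inheritance models. This is not PhenoDB's actual code: the Variant class, the gene and variant names, the 1% frequency cutoff, and the assumption that both parents are unaffected are all illustrative choices made for this example.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    gene: str
    name: str
    proband: int    # number of alternate alleles: 0, 1, or 2
    mother: int
    father: int
    pop_af: float   # allele frequency in a reference population


def rare(variants, max_af):
    """Drop common variants: keep those at or below the frequency cutoff."""
    return [v for v in variants if v.pop_af <= max_af]


def de_novo_dominant(variants):
    """Heterozygous in the proband, absent from both (unaffected) parents."""
    return [v for v in variants
            if v.proband == 1 and v.mother == 0 and v.father == 0]


def homozygous_recessive(variants):
    """Homozygous in the proband, with both parents heterozygous carriers."""
    return [v for v in variants
            if v.proband == 2 and v.mother == 1 and v.father == 1]


def compound_heterozygous(variants):
    """Genes with one het variant from the mother and a different one from the father."""
    by_gene = defaultdict(list)
    for v in variants:
        if v.proband == 1:
            by_gene[v.gene].append(v)
    hits = {}
    for gene, vs in by_gene.items():
        maternal = [v for v in vs if v.mother == 1 and v.father == 0]
        paternal = [v for v in vs if v.father == 1 and v.mother == 0]
        if maternal and paternal:
            hits[gene] = maternal + paternal
    return hits


# Hypothetical trio data, for illustration only.
variants = [
    Variant("GENE_A", "a1", 1, 0, 0, 0.0),
    Variant("GENE_B", "b1", 2, 1, 1, 0.0001),
    Variant("GENE_C", "c1", 1, 1, 0, 0.0005),
    Variant("GENE_C", "c2", 1, 0, 1, 0.0002),
    Variant("GENE_D", "d1", 2, 1, 1, 0.20),   # too common; filtered out
]

candidates = rare(variants, max_af=0.01)
print([v.gene for v in de_novo_dominant(candidates)])
print([v.gene for v in homozygous_recessive(candidates)])
print(sorted(compound_heterozygous(candidates)))
```

Applying the frequency filter first keeps the common homozygous variant in GENE_D out of the recessive candidate list, which is the point of eliminating variants that are frequent in reference databases.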
You may want to fine-tune the size of indels that you're looking for, and a variety of other variables that you can dial in, and then you just push the button and out comes a list of candidate genes and variants. These ANNOVAR files are created as you upload the VCF, so that's done automatically, and three standard analyses (autosomal dominant, autosomal recessive homozygous, and compound heterozygous) are generated automatically. And it automatically creates a file for pathogenic or likely pathogenic incidental findings in the ACMG 56 reportable gene list. It automatically goes to ClinVar and asks: has this variant been seen, and how is it classified? And then it comes back and gives us a date and time for that answer, for the issue of whether or not there are incidental findings. In the consent form, they check whether they want incidental findings or not. It utilizes the phenotypic info in OMIM and the OMIM algorithm to suggest possible diagnoses when the phenotypes are entered. And once you get to candidate genes, if those candidate genes have been connected to one of the phenotypes that it suggested, it will flag that; it will make that connection for you. And there's an API, what the computer people call the back end, that transfers the final results (gene names, genomic coordinates, and features) to GeneMatcher, and I'll say a word about that in a minute. It's completely searchable on phenotypic features and genotype features. And one of the additional tools goes to the issue of how do you declare victory? What's the evidence that the variant you have found in a particular gene is responsible for this phenotype? And it turns out one of the most potent ways to do this is to find other patients, or model organisms, that have variation in the same gene and a similar phenotype.
So Nara and François and Ada developed another tool called GeneMatcher, also free to anyone who wants to use it. And there's the website. It's designed to connect investigators. So anybody can go in and put in their favorite gene, and if someone else has entered that gene into GeneMatcher, then both of you will get an email. And then it's up to you what you want to do with that connection. All the data is de-identified, so IRB approval is not required. It's automated and continuous matching: once you put it in there, it's there until you take it out. And so if someone matches you six months later, you'll just get an email that says you got a match, and it gives you the contact information, and you can choose to collaborate or not. And although initially it was just matching on genes, we've added a box you can click to decide to match on phenotypic features as well. That was put in place in October 2015, and it's connected to this Matchmaker Exchange program I'll describe in a second. But here, this is the page in GeneMatcher for your matching options: by gene match, which is required; by disease match, which you can ignore or you can say I want it; by the location in the genome, that's optional; or by a phenotypic features match, that's optional. Here's the data from a couple of days ago. We had 4,247 genes in GeneMatcher, and there have been more than 2,000 matches now. We don't know which genes are being matched and we don't know who matches, so I can't tell you how many of these matches turned out to lead to productive interactions.
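The matching behavior just described (enter a gene, stay in the pool until you withdraw, and both sides get an email when someone else enters the same gene) can be sketched roughly as follows. This is a toy model for illustration only, not GeneMatcher's code or API: the `GeneRegistry` class, its method names, the `outbox` list standing in for email, and the example addresses are all assumptions.

```python
from collections import defaultdict


class GeneRegistry:
    """Toy registry mapping a gene symbol to the contacts who entered it."""

    def __init__(self):
        self._by_gene = defaultdict(list)
        self.outbox = []  # (recipient, message) pairs standing in for emails

    def submit(self, contact, gene):
        """Store a submission and notify both sides of every new match."""
        gene = gene.upper()
        matches = [c for c in self._by_gene[gene] if c != contact]
        for other in matches:
            self.outbox.append((other, f"Match on {gene}: contact {contact}"))
            self.outbox.append((contact, f"Match on {gene}: contact {other}"))
        if contact not in self._by_gene[gene]:
            self._by_gene[gene].append(contact)  # stays until withdrawn
        return matches

    def withdraw(self, contact, gene):
        """Remove a submission so it no longer matches future entries."""
        entries = self._by_gene[gene.upper()]
        if contact in entries:
            entries.remove(contact)


registry = GeneRegistry()
registry.submit("lab_a@example.org", "PCYT1A")  # no match yet; the entry waits
registry.submit("lab_b@example.org", "pcyt1a")  # later entry: both labs notified
for recipient, message in registry.outbox:
    print(recipient, "<-", message)
```

Only gene symbols and contact details are stored in this sketch, which mirrors the point that the matching itself needs no patient-identifying data; what the two investigators do after the email is up to them.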
I do know that in Baylor Hopkins we certainly have solved a huge number of cases this way. We have a strong candidate, but we only have one case, and we find other cases with the same or a similar phenotype and similar kinds of variants in the same gene. So currently more than 1,500 people are using this, from 51 countries. Now, this is the Matchmaker Exchange diagram of all these groups interested in rare phenotypes around the world. And so we built an API that connects GeneMatcher to DECIPHER and connects GeneMatcher to PhenomeCentral. PhenomeCentral is the Care4Rare program in Canada, which does rare diseases in Canada; DECIPHER does rare diseases in the UK and throughout Europe and the world. And so if you click a button in GeneMatcher, you will not only look in GeneMatcher, but you look in PhenomeCentral and DECIPHER, so you get that added bang for your buck if you just click it there. Some of these others are planning to come in; they just haven't made it happen yet. And as of April, through GeneMatcher, we've made 81 matches into PhenomeCentral and 74 matches into DECIPHER. So those pipelines are working well. Now, I just thought you might be interested, and I'm near the end here: what's the Baylor Hopkins summary data at four years? We've had 9,000 and change consented samples. We've studied 776 phenotypes; 56 of those were judged to be novel. We've done 6,769 exomes. We found a total of 468 disease genes. Of those, 222 were novel disease genes.
That is, they had not previously been connected to a phenotype. And 246 were known disease genes. Now, in our evaluation of candidate samples we try not to take on things that look like a disease that's already well explained. But you have to realize that for many of these Mendelian phenotypes there are only two or three people in the literature, and so the breadth of the phenotype has not been really fleshed out. So the clinicians may look and say, well, this doesn't look like the same thing. Once we find that it's the same gene, then we often see the overlap and realize it's what we call phenotypic expansion, and we're just fleshing out the full breadth of the phenotype. So for 55 percent of these known disease genes, the patients that we studied had additional phenotypic features that had not been described in the entity so far. And it's led to 124 publications currently. Now, finding disease genes has some immediate consequences. It connects the gene to a phenotype, something geneticists have been interested in since there have been geneticists. It connects the phenotype to a biological system, and it tells you something about how that system works, both under normal circumstances and under perturbed circumstances. So it's quite powerful. It unravels locus heterogeneity, which turns out to be extensive. It enables precise diagnosis and counseling, all the stuff we talked about earlier. It's the first step in the path towards informed treatment: at least you know precisely what the problem is, so now you can begin to use rational approaches to try to find a way around it. And it's a tremendous research stimulus from bench to bedside. So it's very powerful. Now, some long-term consequences, and this gets to our long-term goals. Suppose we had phenotypes for more than 50 percent of the genes in our genome. Remember, I said right now it's about 17 or 18 percent. What questions could we ask? I sort of think of this as a classic forest and trees analogy, right?
So right now we're getting very good at finding a particular tree in the forest, and we're going all around it, seeing how many branches it has and how tall it is and all its individual variations. And it's very exciting; every tree is interesting. But what I'd like to see us do as we get farther along is to be able to stand back and say: each of these trees is in a forest that is in a human being, and what are the principles we're learning, not from this individual genetic disorder, but from looking at large numbers of well-explained genetic disorders? Can we see new principles about how disease works? What genes are important? What variants are tolerated, and so forth? So that's sort of a long-term goal. It's a broader goal. And actually we've been interested in this for some time. This is a paper we published in collaboration with László Barabási nearly 10 years ago, and we simply tried to look for interactions between all the diseases and all the genes that were known to be responsible for Mendelian disorders, looking for interactions, for patterns, and so forth. It makes a nice diagram, but it didn't really get us too far along. And we wanted to know: are there unappreciated principles of disease? And if so, what are they, and what do they mean for how we think about disease? At the time we did this study, we had 1,700 disease genes, so now we're a little over two times that number, and I think studies such as this will be redone over and over again until we begin to really get a sense of how it all works. Here's an example of the kinds of questions you would like to answer, just about networks. We talked about biological systems and networks as buffering disease. You could ask: are all networks equally vulnerable to mutation? And if not, what are the rules? We don't really have any rules for this question as far as I know. Or we could ask: are all components of a system equally vulnerable?
Or if not, what are the rules that make some components more vulnerable than others? Can we predict the consequences of variation in one component on the behavior of the biological system? In other words, if there are 30 proteins in a biological system, what happens if this particular one is reduced in function by 50 percent? Does the system still work pretty well, or is it completely crippled by that change? So here are two systems: one is the RAS/MAP kinase pathway, and here's the peroxisome biogenesis pathway. Each of these systems involves about 30 genes, so in terms of that parameter they're roughly the same size. This pathway has more than 15 discrete phenotypes. This pathway really has one to two phenotypes. No gene predominates in this pathway; in other words, every gene in the pathway has been pegged with mutations causing particular Mendelian phenotypes. In this pathway, actually about 60 to 65 percent of the patients with defects in this pathway have them in one gene, a gene called PEX1. So those are two biological systems of roughly the same genetic size, and yet they are quite different in terms of the size of the mutational target, the target that yields a phenotype, and the kinds of phenotypes that they produce. And we don't really know enough now to look at any given biological system and be able to make meaningful predictions of that type. We should be able to, and I would predict we will be able to, as we go forward with this project. So I have no idea where I am here time-wise. So we're done? There are some quick examples here, and I want to just say one word about this. I hope you get the sense of the incredible power of Mendelian disease in predicting biological things that we just didn't notice. So this is a disorder, spondylometaphyseal dysplasia, and the patients have two features. They have a cone-rod dystrophy, that is, a severe visual impairment, and they're short-statured with a skeletal dysplasia that looks sort of like achondroplasia. Now
I'm sure all of you sitting in the audience will immediately say, well, that's obvious what would cause that phenotype, right? Connect those two different biological systems. I mean, when I looked at this I said, I don't have a clue what gene would possibly bring these two systems together. It's a rare autosomal recessive trait. There's the macular degeneration. The gene turns out to be a gene called PCYT1A; I had never heard about it until we did the study. But it encodes an enzyme called phosphocholine cytidylyltransferase, which is the rate-limiting step in the major pathway for phosphatidylcholine biosynthesis. That's a major component of plasma membranes. In some cells it makes up about 50 to 60 percent of the structural lipid in the plasma membrane. So it's clearly an important molecule. So I still don't know the answer to why those two systems are affected. But I do know that there are cells in both those systems that have a tremendous demand for membrane biogenesis. The photoreceptors in the retina make a lot of membrane every day, and actually the osteoblasts make a lot of membrane: they enlarge from their sort of resting size by about 30x, and that means they need a 10x increment in membrane. And so both of those cell types have a huge demand for membrane biogenesis. So maybe that's why they're the ones that show the phenotype. There's an alternative pathway, but apparently that alternative pathway is not adequate, at least for these two cell types with a big demand. So those are things that we didn't think about until this sort of predictive thing came along. Here's another disorder.
This is in press right now: TELO2. It was worked on by a student in my lab, Jing You, and it tags a complex called the TTT complex. I had never heard of it before, but it's involved in interacting with HSP90 and the R2TP complex, and does maturation of six enzymes that are very important in the central metabolism of all cells. In fact, those so-called PIKK genes have already been tagged with Mendelian disorders. So again, we're getting more information about a central biological pathway that's going to be really important in terms of understanding brain function and other functions. The Baylor part of Baylor Hopkins has published this paper. They looked at 128 consanguineous families and found roughly, I can't remember the exact numbers, but it's about 48 known disease genes in a subset of these families and another 40 or so high-level candidates in the others. They put this all together and they asked, when are these genes expressed? Some are expressed in early embryonic life, some are expressed in fetal life, and so forth. So you're again beginning to put these biological systems together, and these biological systems are very important for the normal morphology and functioning of the brain. So we're beginning to move from an individual tree in the forest to standing back and beginning to understand the size and the shape and the behavior of the forest itself, at least in this case in the development of the brain. The last example, also from Jim, is looking at peripheral neuropathies, the Charcot-Marie-Tooth neuropathies. We now know about 65 genes, so tremendous locus heterogeneity, and monogenic disorders in each of those 65 genes can cause a Charcot-Marie-Tooth-like phenotype. It turns out, however, that there's a sort of genetic burden principle. So we think of these as monogenic disorders, but what Jim did is score the genotype of all 65 genes in each of these patients. So this shows, in red, the distribution of loss-of-function alleles in
these patients versus the normal population, and they looked at two different populations; these data hold up. And what they say is that you have your genotype at the disease gene locus, and that causes the Mendelian disease, but it's also affected by the genotype at all these other 65 genes. And if you have additional variants at some of the other genes, that will make this phenotype more severe or less severe. So you get the sense of genetic burden, the architecture of genetic disease, from these focused studies. So I'll finish with unexpected and emerging ideas coming out of this project so far. The first shouldn't have been unexpected, but the extent and distribution of genetic variation is just really enormous. We still haven't enumerated all of it. We're finding tremendous locus heterogeneity. We all knew about locus heterogeneity for certain phenotypes, RP and hearing loss, stuff like that, but we're finding it everywhere we look. There are many examples of phenotypic expansion; that is, we're fleshing out or expanding our understanding of the phenotype for particular disorders. And this turns out to be very powerful. Medicine over and over again forgets that you describe something in three patients, and then we start thinking that the aggregate phenotype of those three patients is that disease. And if we describe another hundred patients, we'll find out that it's actually quite different from the phenotype of just three. We're finding an unexpectedly large role for copy number variants and de novo mutations in a lot of Mendelian disease, and a relatively high frequency of two diseases occurring in the same difficult-to-diagnose individual. And we're learning, as I showed you on the last slide, a lot about the genetic architecture and genetic burden for disease. If you want to read about it, the project, sort of the first three and a half years, was described in this paper published in late 2015. And thanks for
your attention. This is the Baylor Hopkins team, and I'm glad to answer questions. We're at the end of the time, so if you want to, just come up and see me. I'm glad to talk.