So, I have been given the task to represent the Undiagnosed Diseases Network, which includes lots of different sites and different missions, and I'll review that. I'm not going to talk about the RDCRN, which Chris Austin alluded to, but I'm happy to over lunch if anyone's interested. Just a quick disclosure: our department gets support from Baylor Genetics Laboratory, a joint venture with Miraca Holdings. This is really, I think, the major question that we, certainly as clinicians and researchers, are increasingly focused on: how do we functionalize the genome in the face of increasing discovery of rare and/or potentially unique variants in individual cases or families with a clinical diagnosis? And I think one of the contexts for trying to solve this problem has been the Undiagnosed Diseases Network, another program of the Common Fund that's been run by NHGRI. This is a geographic description of Phase 1 of the program, which went from 2014 to 2018. It's composed of seven clinical sites, two national sequencing centers, and a Metabolomics Core, as well as the Model Organisms Screening Center. At Baylor we had a clinical site, the Model Organisms Screening Center, and one of the two sequencing centers. I'll describe some of the experiences and lessons learned from this first phase, which hopefully will inform how the collaboration with KOMP can go forward, especially in the second phase. This slide is probably a little bit outdated; it's from February of 2018, provided by Anastasia Wise. But basically, there have been about 2,000 patients who have applied. This is a patient-driven program, so it's perhaps a little different from what we've heard about today. Through the gateway, the applications are distributed to the different clinical sites, where they are then either accepted or rejected.
And from there they go on to in-depth clinical and research-based phenotyping, which includes, but is not based just on, sequencing. Roughly an equal distribution of exomes and genomes has been done, mostly focused on a trio-based approach. There is then a flow from there to either the Metabolomics Core or the Model Organisms Screening Center, which in Phase 1 has been focused primarily on flies and on zebrafish. That output has led to approximately 177 diagnoses in this cohort, and there are more now. The experience has been described in a paper that's in press, and this summarizes it: 40% of applicants have been accepted for evaluation, and 32% had actually had clinical exome sequencing when they came to us. Not surprisingly, the largest share of the phenotypes were neurological, at 40%, with the other organ systems making up the rest; musculoskeletal ran second at close to 10%, followed by a roughly equal distribution across other systems. The overall diagnostic yield is 35%, and you can almost view this as a research reanalysis of an individual who may have already had a clinical odyssey type of evaluation. A decent number of diagnoses, 11%, were based primarily on clinical review, and 74% were based on sequencing. A fifth led to a change in therapy, widely defined, 37% led to changes in diagnostic testing, and 31 new syndromes have been defined. The flow of work in the first phase, as I alluded to, is from patients to the gateway run by the coordinating center, then to the clinical sites, and then ultimately to sequencing, metabolomics, and the Model Organisms Screening Center. Importantly for this group, the Model Organisms Screening Center, run by Hugo Bellen at Baylor in collaboration with the zebrafish team at the University of Oregon, depends heavily on an informatics approach.
They've developed the MARRVEL platform, which tries to aggregate both human and model organism data, and which then leads to prioritization and assignment together with the clinical site. This is a very integrated and bidirectional interaction, which I think is important in any model for collaboration going forward. The work then flows either to the fly or the fish core. During the process there were 50 genes prioritized for functional studies: 40 in flies, 13 in fish, and three in both. They have contributed to and provided diagnoses in at least 12 cases, ruled out candidate genes in eight, and were inconclusive in six. Now, as part of this process, the Baylor site, I think partly because of Baylor's involvement in the IMPC and the KOMP program, had a small supplement almost two years ago to start to investigate direct interactions, and that has now evolved into a broader engagement of the MOSC with KOMP2 specifically. I think there are a lot of reasons why this makes complete sense. Oftentimes there are genes with no mouse knockout information available that are being prioritized for KOMP, so that is certainly one reason. I think there are also UDN variants that have been found in Drosophila but have value for further study in mice. But probably most importantly, a significant number of genes, some of them duplicated in fish, are difficult to study in either fish or flies. There are increasingly a significant number of genes where we look at specific variants and neither fly nor fish is the optimal model, and those have been prioritized for collaboration with KOMP. As for the current status of the collaboration, there are 28 genes being studied, for which there have been multiple alleles, not just loss of function. There have been 21 null alleles already achieved.
I understand there are an additional five in production, and eight have had their phenotyping completed. I would also add that the effect of the collaboration is amplified because there is expertise, and I'll give you an example of this from my own area of research, where we can leverage KOMP to perform further deep phenotyping that really closes the circle, building on the foundation of KOMP. I'll show an example of that from a skeletal perspective, which we as a group at Baylor have been very focused on. When we put the model organism component in the context of what I view as a multi-omic approach to clinical diagnostics, espoused and modeled by the UDN, whole exome and whole genome sequencing is certainly powering a genotype-first or concurrent approach, and there's up to 35% discovery that I think many groups have seen on reanalysis, of which about 7% are blended. So this is also interesting: how do you prove oligogenic inheritance or causality in humans? Model organisms may be one important approach to doing that. As for functionalizing the genome, while metabolomics has been investigated primarily as a research tool, Baylor Genetics has implemented it, and I'll show you some of the data from a clinical perspective and what the yield is. We've been very focused at Baylor, and now also at several UDN sites, on integrating transcriptomics, and I think that further underscores the importance of model organisms, because we're learning a lot of things in the context of clinical diagnosis that are probably not unexpected but that underscore the complexity in terms of clinical utility. This of course is key, and I think KOMP and the IMPC agree with this: clinical phenotyping, whether of a human or a mouse, is absolutely critical because of issues of variable expressivity and incomplete penetrance.
I think that even without a specific molecular diagnosis there is potential medical actionability, and we've seen examples of that. This again leads to the practice of medicine, which is sometimes management without an identifiable cause, so there is a role for gathering information and knowledge to inform management even without a diagnosis. And of course model organism study is critical because increasingly, as I hope to convince you, though maybe I don't need to, the interpretation of variants of uncertain significance is key for both known and unknown disease genes, especially in the context of phenotypic expansion. Now I would underscore this. It is somewhat outdated and is now being revamped, but these are the 2015 ACMG guidelines for clinical variant interpretation, and I think Alex's work with ClinGen is informing perhaps the next phase of this. This is the category of strong evidence, and you can't read this, but it basically includes in vitro and in vivo functional data. I would put forth that ultimately, with good in vivo functional data, this should be moved further into the very strong category. This is important because, from the perspective of interpretation and clinical practice, diagnostic laboratories certainly follow the ACMG guidelines, and I see this as a move to make, especially as we run into more of these rare and unique variants. The metabolome is clearly a very important component of this. I know it is a big part of the phenotyping here in the IMPC, but we have a lot of human clinical experience with it. I would just summarize that there's no question that, from an inborn errors of metabolism perspective, metabolomics has the potential to really be the best screening test.
It's not necessarily very useful in managing patients, but as you can see from data like this, generated by Baylor Genetics and Sarah Elsea, when you look at the distribution of metabolites in the metabolomic assay via Z-scores, it will pick up many of the classical inborn errors of metabolism as an initial screen. Even more importantly, I think it can replace two other tests as an initial screening tool, both acylcarnitine profiles and potentially urine organic acids, so I think the data coming out suggest that, as a first-level clinical tool, it may be very useful in replacing potentially three different classic biochemical tests. Over a thousand clinical samples, actually 1,300, have been run in the last three years at Baylor Genetics, and if we think about the potential contribution, in about 37% of the cases there was information that was helpful, and it was helpful in different types of areas: in ruling out potential variants from diagnosis, in changing variant classification, and in confirming molecular diagnoses. This is of course both in the context of classical biochemical genetic disorders and of genetic disorders that are not primarily classical inborn errors of metabolism. If we step back and ask what the overall rate is, defining it as a test that either identified an IEM or corroborated or supported a genomic diagnosis of an IEM, we saw that it was as high as 7%, so independently it is very useful from a clinical perspective. Now, what I've certainly been very excited about, and I think increasingly the UDN has been too, is moving transcriptomics into the clinical arena.
This is an analysis that Mahim Jain, now faculty at Hopkins but previously a fellow in my lab, did on our UDN data set. The take-home message is that when you systematically do whole blood and fibroblast-based RNA sequencing in patients, a significant number of genes are in fact expressed, based on different FPKM thresholds. For OMIM genes as a whole, 64% and 40% are expressed at greater than 1 and greater than 10, respectively. Not surprisingly, mitochondrial genes are very highly expressed, as are primary immunodeficiency genes and skeletal dysplasia genes. Less so, not surprisingly, would be genes related to, let's say, non-syndromic hearing loss. But this is clearly powerful insight into a significant percentage of genes that contribute to human disease. So in the first phase we at Baylor systematically did trio RNA sequencing as part of our protocol, and other sites also began doing this. These are aggregate data of cases that we are now putting together as the Phase 1 experience from Baylor, Columbia, Stanford, and UCLA; the Columbia site is actually Columbia and Duke. You can see the distribution of cases and, importantly, the distribution of variants. This has been an approach of primarily looking at RNA-seq first as a tool to help inform interpretation of whole exome and whole genome sequencing, and in this cohort there were approximately 165 variants analyzed. Now, what types of variants were in fact put forth for analysis? Not surprisingly, a significant number, over 50%, were deep intronic variants, defined as greater than 10 base pairs into the intron.
Similarly, other sorts of variants, including classical splice junction variants as well as intronic variants 3 to 10 base pairs in, were common. So clearly these types of variants, whether coming out of exome or genome analysis, are the ones that RNA-seq could potentially inform on, along with many of the other types of events you can imagine: indels that led to frameshifts, in-frame indels, missense variants, and so forth. Now, this is hard to read given the projection, but the take-home message is that when you look at the RNA-seq data, the consequence, or the association, I should say, between a specific variant identified in exome or genome and an observed effect on the transcript is quite diverse. If you look at the first cluster of classical splice variants, intronic variants, and deep intronic variants, not surprisingly a big majority, shown here in gray, had no effect, but a significant number could in fact lead to all sorts of consequences, whether exon skipping or intron retention. So certainly even deep intronic variants, as you would expect, could have an effect.
What's also very interesting is when you look at nonsense and frameshift indel variants, which you would expect could cause nonsense-mediated decay. There are clearly situations where they don't lead to nonsense-mediated decay, but more importantly we see a threshold. It's not plus or minus, and in fact we see quite complex combinations of relative expression or relative NMD in one allele versus the other, which in a recessive disease may harbor a missense variant. So one hypothesis coming out of this is that the expression pattern, and the nature of expression from the allele in cis with the variant, could also be a major driver of pleiotropy and variability. Even in the context of missense variants there can be a diverse effect that one could not predict, which again underscores the importance of modeling studies in taking interpretation to the next step. What is the ultimate yield in terms of clinical impact? Again, this obviously depends on how you define different effects, and it is based on one context which I won't have time to go through. But if you look at the yellow, it's clear that in a significant number of cases RNA-seq contributed to the actual characterization of a variant as solved, whether it was a prior candidate or, in some cases, wasn't even a candidate at all prior to RNA sequencing. So that's one value. In terms of other value, it can change a strong candidate to a weak candidate in terms of deprioritization, it can continue to support a variant that was already a strong candidate, or it can even help to rule one out. This is especially the case when you have a recessive disease and one potential deleterious allele but no second genomic finding, where the finding of normal expression from that allele has been very helpful. So we've taken the view that in total about 21% of the cases were helped in terms
of a solution with RNA-seq, and I would say 12% were cases solved where it was something not readily apparent just from the genomic interpretation, like an intronic or a synonymous SNV or perhaps an in-frame indel. So again, depending on how strong the nature of a variant is, RNA-seq can have increasing impact. So what are the opportunities and challenges for clinical implementation? I think RNA sequencing is clearly effective for prioritization of variants coming out of WES and WGS in the context of allele-specific expression, nonsense-mediated decay, and splicing isoforms. Isoforms are something that I think as a field we've often ignored clinically. There are four to five different isoforms, different for each gene, and in our clinical interpretation we've basically had to ignore that. RNA-seq can sometimes identify new pathogenic alleles, especially in the context of a recessive disease where you already have a beacon in one allele. We systematically looked at the current informatics algorithms we have. I don't have time to show you that data; they're helpful, but as you would expect, just like the informatics algorithms for interpreting DNA variants, they're not definitive by any means. There's no question that tissue-specific expression may limit the application for a subset of the candidate genes analyzed, and there's no good way around that other than probing that tissue. Pathway analysis is actually difficult. While the other deliverable may be to start to correlate pathway signatures that we see, for example, in human tissues versus model organism tissues, which I think has enormous potential, it is difficult because we're not doing a classic case-cohort type of design. It's a singleton, a patient versus perhaps a group of individuals, and so that's going to require different informatics approaches. One problem which is clearly an issue, and which I have not seen any group tackle effectively, has been to use RNA-seq as a de
novo tool, a priori, to identify candidates, much like we do today with DNA-based analysis. Part of this challenge has been the abundance of novel low-frequency splice junctions that we see in RNA sequencing approaches. There are just so many of these that it's difficult a priori to say what's interesting, what's new, and what could be pathogenic. Perhaps trio analysis can help. We are systematically trying to look at that, but we don't have an answer yet. And of course there are the real-world logistics and payer challenges of how to pay for this. I will end, because I know I'm standing between you and lunch and we're behind, with two examples that we've seen which I think lend themselves to rapid translation in the context of modeling. One is when you find a deep intronic variant where RNA-seq has been very helpful, but you have basically an n-equals-one situation, so even though the RNA sequencing may be helpful, it doesn't really prove it. This was a family that had a Noonan syndrome-like phenotype, where we identified, just by straight whole exome sequencing, a heterozygous stop mutation in LZTR1, which has been implicated in that pathway. But what was interesting is that when we looked at the RNA sequencing, and here you have control blood, control fibroblasts, and two sibs, one sib's fibroblasts and the other's blood just for demonstration, there was an abnormal splice pattern in this gene, as well as alterations in the allele-specific expression. That pointed to what we then found by whole genome sequencing, which was a deep intronic variant at the minus 256 position of an intron of this gene, which ultimately we think led to a whole host of different splice patterns. In fact we did see that in an in vitro experiment where we knocked in that variant, so we had isogenic controls, and reproduced the different splice isoforms. Still, this was a question: is this really
consequential in the context of a first recessive form of Noonan syndrome? Now, at the end of the day, much like in the CMG, other families were identified that supported this, and we were able to make the supposition. But you can imagine the challenges in this situation, where you have an intronic variant and some data that support its pathogenicity for a recessive form of inheritance, and where a model organism, specifically mice, could be very powerful because it is an intronic change; it turns out that this region is highly conserved between human and mouse. A second situation, where Baylor working with KOMP actually delivered a pathogenicity conclusion, is a variant in this protein, COPB2, which we found in a patient with early-onset osteoporosis. It is part of a complex in ER-Golgi transport, primarily ER to Golgi, which has been implicated in many skeletal diseases. The RNA-seq data supported that there was nonsense-mediated decay for this allele, but does haploinsufficiency in fact cause an isolated skeletal phenotype in what is a ubiquitously expressed gene? Working rapidly with KOMP, we were able to generate the loss of function, as shown here on Western analysis, and what is a low bone mass phenotype in heterozygous males and females, supporting the pathogenicity of this. And functionally, and this is where the next step of leveraging local resources is very useful, we were able to do biomechanical studies that showed the bones were weaker, again supporting that this is a heterozygous change leading to early-onset osteoporosis in this patient and potentially in the more general population. So I think there are opportunities for KOMP modeling of human VUS in the context of clinical discovery and diagnosis. There can be multiple inputs, and I think at Baylor, between the diagnostic lab and the various research consortia, CMG and UDN,
there's been an increasing flow for this. I think another opportunity is industry, which has not been discussed here. I was just at the American Society for Bone and Mineral Research, where multiple new therapeutics have hit in rare disease contexts, and as you can imagine, many in industry are focused on identifying as many patients as possible. I put forth a proposal to them. They have been sequencing everybody with certain types of phenotypes, and they discover a lot of VUSs. One approach may be to comprehensively model the VUSs, especially in known disease genes which have therapeutic options. You can imagine you're taking a really different approach: it's discovery based on a situation where we have a treatment, and that actually increases the potential value of this type of program, at least from the therapeutic perspective. And given, again, Chris's points early in the morning, clearly rare missense variants in the context of phenotypic expansion or milder phenotypes, whether they be hypomorphic or neomorphic alleles, are I think extremely powerful from a modeling perspective and difficult for us to interpret from the clinical perspective. But even for apparent loss-of-function variants, some of the RNA-seq data are clearly showing that variants you would think would be apparent loss of function can in fact have different consequences on splicing and expression, so there may be value in modeling those in vivo in a mouse, especially if there is conservation of those genes. There is no question, as you can see from how we as a group drove RNA sequencing interpretation, that it's driven primarily by non-coding variants, given their diverse effects on transcript and isoform expression, so that's certainly an enormous opportunity. Ultimately it's to improve interpretation in the clinical context, and I'm hopeful that in the new iteration of the guidelines in vivo functional data will actually be elevated even further. And
importantly, we all see the OMIM graph with the number of genes going up rapidly, but in fact the correlation of known genes with phenotypes in the context of phenotypic expansion is an enormously exciting opportunity for studying unique structure-function correlations, so I don't think we should forget about the basic science opportunity from that portion of the OMIM graph. With that, I will leave you with this: Phase 2 of the UDN, which has now expanded to 12 sites. There is now one national sequencing core, at Baylor, and the Model Organisms Screening Center has expanded to include worms, and hopefully there'll be an even more robust flow into IMPC and KOMP. This is the schematic as it now stands for Phase 2, and I want to acknowledge all the folks involved in the UDN.
Thank you, Brendan. [unclear] Yeah, there's no question that RNA-seq is valuable, and in the data you showed I think you combined leukocyte plus fibroblast RNA, correct? Could you speak to how valuable straight white cell, leukocyte, RNA is?
Yeah, so I have some data on that which I can't show, because I don't have it; I would show you otherwise. Not surprisingly, if you do the clustering analysis, fibroblast RNA-seq clusters much better, even across unrelated individuals, and white cells are much more distributed, as we would expect, because it's a primary heterogeneous mix of tissue. Having said that, I think for certain diseases it's actually quite informative, especially for the primary immunodeficiencies. And so, right, could you go back to this slide? Just go back quickly. So if you look at autism genes, about 62% are expressed at greater than one, between one and ten, and about 32% at greater than ten. So obviously less useful, but again, if you think about our diagnostic rate of 35%, that's not bad. It's all context.
So I think there clearly is still a use even for, you
know, autism or neurodegeneration type genes.
Absolutely, absolutely. That's exactly correct. This is why we presented it as thresholds, to give you the dynamic range, so between 60 and 30 depending upon how lucky you are.
Good topic to discuss at lunch. So we have an hour for lunch. If you can get back, we'll pretend it's quarter of and not 10 of. Get back by quarter of, and we'll do everything we can to stay on track. Thank you.