All right, with thanks, everybody. I'm here as a representative of the IMPC. I've been on their advisory board for several years, and it's been a labor of love. So, just to baseline us: we have a goal for 2021 to 2030, and part of that goal is to advance the number of nulls and phenotypes which we have now, as well as to make the data integrate with other data sets and not evolve away, but have a durable impact. The slides I have, there's no projector. So with that, we had a slide here that Chris Austin showed about some ideas that we might consider for our next phase. We all think about mice as mammals, and humans as mammals too, so with that I would like to thank the folks here that bring to us a human perspective, and for us to use this hour to talk about the commonalities of our shared goals of mammalian phenotyping, whether it's mouse or human. So with that we'll start the panel. I had some slides which would help to baseline us; we'll wing it and we'll just use the microphones. So, thanks.

Well, I mean, there are two things that I've been thinking about over the last year, given the context that we're entering year three of the KOMP program. We have two and a half years to go. We have about a two-year typical planning process; unlike CGE, which miraculously came to life in one year, we're not going to try and do that. We're going to take our time and plan things. So on one side it's the scientific issues; on the other side it's the political issues: aligning program goals with IC interests, and understanding funding mechanisms and mechanisms of doing diverse support for large programs, which is essentially, I'm saying, Common Fund. So I think for the NIH side all of our thinking is colored by those practical issues. Probably from the science side you can just think about the scientific issues, so maybe we should start there. Start with our success stories, which is the CMG collaboration and the null phenotype
resource. It seems to be working really well. Are there things we need to do now, immediately, in the next two years, to improve that resource, to make it more accessible to the clinic, to make it more relevant and more efficient? So we've heard the success stories. Would you like to scare us with the things that keep you up at night, that you worry about, the problems that we need better solutions for?

Well, I must say, I was only here for this one day, but I thought the science and the opportunities that we heard about throughout the day were really quite astounding. So I would think that we should all go home, one, enthusiastic and, two, invigorated from today's meeting. Chris laid out several challenges at the beginning of the day. Chris and I were just talking at the break, and I was saying that I have recently become much more enthusiastic about the prospects of developing therapies. Part of that enthusiasm stems from the big success of CF, but there are other examples as well. I suppose a more profound sense of optimism comes, at least on the Mendelian diseases, frankly from the progress in Mendelian disease identification. I didn't make it clear in my talk, but the numbers that I gave you were the numbers for the CMGs; obviously there are lots of people in the community doing the same activity, so overall the number of new disease genes has gone up dramatically. Your chances of developing a biologically rational cure are much better if you know what the molecular basis is to start with, and so we're making tremendous progress there. The pathophysiology is always the slower activity, but the work of the mouse geneticists, in terms of making models which are not only helpful for disease gene identification but also provide the resources that are necessary for studying pathophysiology and understanding the functional derangements that go along with
defects in genes that we never heard about before. Nico mentioned a couple of them in his talk that are just eminently forgettable, except they're very important biologically. So just a lot of things that we wished we could have had as little as five years ago, we have now. The other thing I must say is, we're hearing about some funded work from NIH. I get the sense that for good, creative projects there's money available, and we haven't been able to say that until just recently. So we all need to make hay while the sun shines, I think. I found all of that very positive. In terms of the collaboration, I must say, based on the work between the CMGs and the KOMP program, the KOMP PIs, I think both groups are eager to collaborate, and there's a sort of energy of activation. I know Bob and I talked about collaborations for a couple of years before we actually got it rolling, but like many of these things, once you get it rolling it gathers momentum. I think both Steve and I feel like once we begin to see mice with phenotypes, the momentum will build quite rapidly, and that momentum will be generated not only by the CMG centers but also by their submitters. Remember, I said they're supposed to be the ones writing the papers, and they will love having this data from the KOMP project. As well, and I didn't make this point: if you know you want the mouse model, there's an opportunity when the live mice are still on the counter, available on the shelf, I guess you guys say. There's an opportunity to get those, and again that shortens the whole process and moves things along much more quickly.

So, a question for you, Dave, and the rest of the panel, is when you want the information or need the information (and I had a slide on that). Speaking mouse or speaking human, the ontologies can be so different at times, and so can the accessibility of the flat files. Are you going in by gene or phenotype, and what's
the phenotype called? So what would be most helpful to you, and to others in the human, the clinical, the patient-centric or patient-facing fields? Because we have our own jargon, right, and we would like to make ourselves useful, or our data useful, to your needs.

Well, I'm not the most facile informaticist in the room, that's for sure. But what I find is that the bedrock, at least for me, is the gene names. I find that one can go back and forth pretty easily with gene symbols, the universal gene symbols, right? And I think the databases for both species are also not too complicated: OMIM and related human catalogs, and then the mouse databases are also, I think, quite easy to use. So I don't find that much of a problem. I do think that as the CMG-KOMP collaboration moves forward, as I said, we'll build better, more transparent informatic tools to help with that very problem that you mentioned. When I say we'll build, it won't be me; somebody will build it.

I mean, we heard about the situation in the UK, where they have the National Health Service and they seem to have ready access to the full medical record. We heard about the UDN, where they bring the patient in and they do their own in-house phenotyping and genotyping. Is there an issue trying to reach out to these X01s, these more loosely connected collaborators? How do we make contact with the bench researchers to do follow-up stuff and make our resource available to them? It seems to me that's a challenge.

They know their phenotypes. At every CMG there are some clinical experts that can talk to you at length about the phenotype, and they can refer you to the submitter who's actually sitting there in front of the patient and examining them.

And I think, so, yeah, I was just going to say, it was a year ago that Dr.
Kent Lloyd approached us and sent us an email saying, we are KOMP2. Here, we are not UDN and we are not CMG either, so if you are what I call a grassroots clinical geneticist working in the clinic (you just heard a beautiful presentation by Dr. Rowan), we are doing almost the same things and finding a lot of variants. We are unable to do gene discovery as such, because you have to look for means to find collaborators. I feel like we were very fortunate that Kent Lloyd approached us and said, here's what we have, this facility and the availability of the KOMP2 project. So the way we started off is: what are all the variants of unknown significance that we have found, and do we have any genes of unknown significance? Then we also came up with another category, I guess, where the genes were already identified, but are they phenotypically going to add more (which we also heard in the last couple of days), which would be our gene of interest. Interestingly, for one of the patients the mouse model was already there and the phenotype was slightly different, in that there was cardiomyopathy. So we were able to take that back to our patient and say, can you go do an echocardiogram for this patient, so we make sure that there are no cardiology issues going on for our patients. So I think it's a very good way of going back and forth, and gene name would be a great way of looking for mouse models.

Can you give us a rough idea of what your demand would be, the number of variants in a clinical year that you would be seeing, and the level of questions around that? Does it vary, or have you found that it's pretty steady state?

I would say that just over the past, I guess, six months, we find up to 30 variants from my clinic alone, and that's only a small percentage. So we would be finding tons of variants, which brings us to the next question: how are we going to prioritize what genes we are going to be working on, and which ones to choose? Because maybe I have, like, 20 genes that, you know, like,
they're willing to model. So are we going to look at the most common ones, or the ones where maybe we are going to find the maximum possibility of identifying it in your mouse phenotype, or the ones that are most severely affected, or the ones that affect the most individuals? So that's kind of a question.

So, in our six-part strategy for a decade, for phase three and four, that really covers a lot of the first half of it. And beginning the dialogue now: for what type of alleles? Is the Black 6 background, or are the timelines, compatible with your need for information in the clinic? Would this be used as a diagnostic tool to help understand the treatment, which would be a more immediate concern with a patient that's coming to see you, or is this something that you have as an academic purpose, where the timeline wouldn't be as critical? Because for us to understand, to put ourselves in your shoes, is really important today.

So being an academic is very different from being a clinician in front of a patient, because most clinicians would experience that on the one hand you feel the excitement of finding a diagnosis; on the other hand, when you're giving the diagnosis, the immediate next question is: so we have found the answer, what can you do for my child? What's the cure, is there a cure, or what's the next step, the treatment? And, yeah, go ahead.

I just wanted to touch on the variants of unknown significance first. In the human exome we usually get about 20,000 to 25,000 variants, coding variants, and so the whole trick is to winnow down that number to get as close to one as you can, simply by frequency, known functional effects, model organism databases, essentially reaching out to every little bit of information you can find available. Doing that, it depends on the mode of inheritance, but it's often possible to get down to a rather small number of reasonably strong candidates. Then a tool like GeneMatcher is very
powerful, because you just put the gene name in there and then you don't have to do anything until you get emails. Finding other cases with similar phenotypes, with rare variants in your favorite gene or one of your five top candidate genes, sets you off in the right direction right away and can help prioritize the variants. But these are very much research questions. In the clinic, I think it's fair to say that if you get a result in a known disease gene and your variant looks like it's functionally significant, then you begin to have some traction in terms of what you're going to be able to say to the parents, to the patients, even if the variant has not itself been described before.

Okay, so, Karen had talked about that previously. Can you expand a little bit on that? I mean, it must be such a changed situation from years ago, when you took your boards and there were three known genetic diseases and they were well described, and when you spoke to the patient you knew the prognosis, you knew the extent of the phenotype, you knew everything. Now you're overwhelmed with new diseases where that patient may be patient zero. What do you say to them in that situation? Because it must be difficult.

We say exactly that: that, you know, it's so rare, we don't have much data on this. If it is a really known disease-causing gene, which is what we are dealing with right now, we say that there are only 30 other individuals, and then, you know, I think most parents grasp the enormity of, oh my, my child has got such a rare disease. But it depends on the situation. Some people feel a sense of relief to know what the child has and that there is a reason for it. And then there are others who immediately get on to: so what can be done, what do you know about it, and where can we go? In that instance we do look into who is working on it, and so we may find a
researcher who is very interested in that particular gene that we are looking at, and look to see what's being done. There could be a natural history study going on, so we'll send them for that. There is one particular gene where there is brain iron accumulation, so we look to the mouse models to see, is there anything, you know, in the mouse model? And if there is brain accumulation, then there are iron chelators available, and some are penetrant, you know, they cross the blood-brain barrier. So that's where we are beginning to look to prioritize: is there something we can take back to the patients if we do develop a mouse model, and are there phenotypes for which we could do either drug screening or some therapy where we could see tangible results, or at least quantify something? I think that's how I'm beginning to look at things.

And if I could just add one comment: when I was seeing patients in the clinical setting, I used to joke that if I was one internet search ahead of the families that were seeing me, I would probably be okay in the clinic setting. But really, families are sometimes their own best advocates, and I feel that it really becomes a therapeutic alliance, so to speak, in terms of encouraging them to find other families and to be empowered by being able to connect with investigators. I think the internet and the availability of these tools, the Matchmaker Exchange and some of those that have been described at this meeting, is really quite remarkable. And, you know, some families will take their child's care sort of into their own hands and will reach out to researchers and find ways to advance a therapeutic development, which is quite remarkable.

Yeah, I mean, we saw the UDN reported on in the New York Times recently as a medical mystery case, right, a couple of weeks ago, purposely to attract attention. I mean, I thought maybe they were disappointed with Matchmaker, that it wasn't good
enough, so they had to go to the New York Times. But it was kind of an interesting story. But yeah, so we're kind of wondering, is there any outreach we should be doing into the patient advocacy community that would be useful?

I think one thing that this makes the point of is that, as we heard earlier today, geneticists currently do a lot of education. So when you send, let's say, a clinical whole exome on a patient, if you want things to turn out as well as possible, you need to explain to the family what kinds of data you will get. You may find an answer that answers the problem right off the bat, out of the box. You will find a lot of data that you're not able to explain perfectly, and you'll be able to rule certain things out also. I find, at least, that when you educate the patients before the test and then you come back with a result that you're not currently clear how to interpret, they say, okay, I got it, you said that might be a possibility, and how can we move forward and understand that result more clearly as we go along? But it does take a lot more, I would say, counseling of the patients and their families in the clinic, both before the test and after the test comes back, to provide the best possible outcome. And I think the patients in general understand that the pace of research these days is moving along very quickly, and so a result that we're not able to interpret today we may be able to interpret much better a month from now, or two months from now, or three months from now, because the pace of discovery is going so quickly. One of the first reports of the Baylor clinical genetics lab made the point that the diagnostic rate was about 25%, but of those cases for which a diagnosis was made, the diagnosis involved a gene that was shown to be a disease gene sometime in the two years prior to the time the paper was published. So if you'd done
that test two years earlier, you wouldn't have known what to make of that result at all. So the pace of discovery is really moving along quickly.

So one thing that we've started to do, or have been doing, on the mouse side (and this is KOMP and IMPC, and it's INFRAFRONTIER in the EU) is getting involved in the care for rare, in the rare disease networks. I know next year is the campaign for the European year for rare disease, in 2019, and we're part of that. How can we leverage what we already have, the enthusiasm for activities, to reach the clinicians, to reach the clinical investigators, if you will? What meetings? Who are our customers and stakeholders here? Can you help us to understand that around rare disease, and how can we better explain ourselves, position ourselves, and work for your benefit?

I was going to say, I think most clinicians, at least the geneticists, are well aware of mouse phenotypes being available. They probably, you know, don't know how to reach out to somebody. Maybe that's one of the things, you know, having sessions at major meetings, at ACMG. And the other group that we are kind of leaving out here is the genetic counselors, the NSGC. Our genetic counselors do a lot of counseling, and they're also very involved in figuring out things for our patients. So NSGC may also be a good place to advertise that, you know, there are these phenotypes available and there are such facilities available.

Yes, so I just wanted to sort of revisit this idea of outreach and ontologies. There are so many different human ontologies out there, and I know Damien and his team, and the Monarch Consortium, have been doing a really good job trying to crosslink these and to make our data more accessible to clinicians and, you know, to patients as well, because what they hear is the human terms that their clinicians and their genetic counselors use. I'm just
wondering if you can see a role: how willing do you think the general community, beyond the consortia, is to get involved in sort of actively helping us to get our data out into the purview and make it more accessible to genetic counselors and patient groups and others who could potentially benefit? Because, as you mentioned before, patients can be their own best advocates. In a digestible way, though, because I think often what we put up there is digestible for us but not for someone who hasn't spent their professional life doing this sort of thing.

You know, I have no data on this, so this is just my opinion. I think if you went around Johns Hopkins and asked just my random colleagues in the hall, what are your thoughts about KOMP, most of them would have never heard of KOMP. They would know nothing about it. So the genetics people do know about it, but the rest of the clinical service, even in the Department of Pediatrics, doesn't know about it. But what they will know is when you publish papers. I think going to meetings is great, talking at meetings is great, but I hope all physicians are aware of PubMed and use it widely. We were all admonished to publish, and it's just really important to get the data out there. PubMed makes it really accessible no matter where you publish, so just getting it out there is important.

And Kids First. I was sent a lovely slide, which you can't see but I can, because I was asking at the break who has helped to break the ice here for speaking mouse and speaking human. Kids First has done some wonderful work to start that. Monarch has done some; I did a screenshot from some of the Monarch initiatives where they're breaking through to the ontologies. If somebody from the panel or the audience can speak to those efforts, because if they already have headway, then it's probably best for us to follow that, if those are accepted ontologies
from the collective here.

May I offer an answer to the question about data dissemination? It feeds into the ontology issue as well. Google launched an experiment three weeks ago called Google Dataset Search. Some of you may have heard about it; some of you are hearing about it for the first time. But I believe in a few years this will be the place where either your data or resources are seen or not. This is actually linked data technologies in action, which allow one to annotate one's resource, whether data, patient, etc., using metadata, ontologies, and such, so it can be searched in an intelligent way. This is kind of inverting what Google was originally about, which is searching unstructured text, HTML. The evolution was gradual until this point, in the sense that Google asked webmasters to contribute more and more structured content using linked data technologies. But now with Google Dataset Search they flip it completely. They say, we will index whatever you tell us to index, but you give us metadata in this linked data format, and the searches will actually show your research. So anyway, this is my projection, that this will be transformative.

And it gets the stuff where it's a publicly used resource. I mean, Terry talked about this: the hit rate of PubMed is a thousand times higher than our hit rate, a million times higher. I mean, it's the holy grail. If you can get into PubMed, people will find it. So, hence, you have to publish.

Hi, I have a question slash comment that's for Melissa but also for the panel as a whole. It seems to me, especially after hearing Melissa, that there are certain areas of human disease and disability that are really being underrepresented by the KOMP phenotyping pipeline. In particular, how do we evaluate intellectual disabilities in mice, who won't take IQ tests? I'm just wondering, why not take a bunch of genes (and some of them must have already gone through the pipeline), why not take a bunch of genes that are known to
cause intellectual disabilities in humans and then just ask: what are the pleiotropic effects, or what's the brain pathophysiology, or what are the behavioral assays in the mouse that might actually reveal a relationship to human intellectual disabilities? It may not be as hopeless as it appeared.

I think that's a terrific idea. And what Nora would counter with, I'm sure Nora would say, do imaging, right?

Yeah, no, I actually think that's a really very nice idea. You know, none of us is even thinking that we're going to have a single IQ test that's going to be useful for mice, to help us understand their level of cognition. Most people would even argue, for human beings, that the standard IQ tests are flawed. So let's get out of this mindset that we're going to have one test. I think what we have to look at is functional domains, and I think the folks at NIMH have a very interesting approach to this. Are you all familiar with RDoC? Is anyone? I know that NIMH is probably not as well represented here at this meeting, but they're really trying to take these really complex behavioral phenotypes, such as, you know, anxiety and depression, and really break them down into their components, not relying so much on the Diagnostic and Statistical Manual for mental health conditions, the DSM-5, which has been sort of the standard for diagnosis of mental health conditions. Essentially what I'm suggesting is that there may be some sort of parsable entities, domains of, say, executive function, for which you could find something comparable in the mouse, or attention, for which you could find something comparable in the mouse, that might actually map onto some of these complex human cognitive and behavioral phenotypes and could actually give you some traction in trying to sort out which genes and which phenotypes are linked to those genes, and really start to make some progress in the intellectual disability field, to use the genetics to help understand the phenotype, in this case to do it by

I
guess, so, I'm new to this discussion, but for me it sounds like the question is, what is the question? I think that I understand the utility of, oh, I found a patient somewhere in the world, can somebody please make a mouse so I can diagnose them. But from my perspective, and I'm sorry, I mean, you know, I use multiple organisms, the mouse is not the correct route, because it will never scale. Ever. For every exome, at the end of the day, even from very high quality bioinformatic analysis, we're still going to end up with two to five candidate genes, and that number will stay plateaued for a very, very long time, especially for the de novos that will keep popping up. What I'm thinking, though, is that there are amazing opportunities to actually prioritize sets of alleles that will be useful to the community, not necessarily for diagnosis but for biology. You will never compete with the zebrafish model, where it costs 1500 bucks to test an allele. It's just never going to happen. But the zebrafish model sucks at consistency, it sucks at looking at progressive disease states, it sucks at generating enough biological material to do transcriptomics, to do metabolomics, to do other interesting things. I'm speaking of the zebrafish because this is what I know, but I think my fly colleagues and my worm colleagues will say the same thing. So, I mean, you know, the knockout project was amazing because it also answered biological questions, questions of tolerance and genome plasticity. It might be useful to think in terms of what the next really big-impact question is going to be, right? I heard earlier today about the variable effect of nonsense mutations on transcriptional regulation, splicing, this and that. This is a fantastically important question that we're going to have a very hard time asking in humans, not least because we don't even know whether the nonsense allele in a fibroblast,
which is all we're going to get from a human, or maybe blood, will have the same mis-splicing effect in the nervous system, which is what we really want to know. Well, I think this group is supremely positioned to answer this question. I also think that there was mention of laminin, right, and the spectacular, you know, diseases on this. Well, these are some of the alleles that might well be made, because there's a very deep biology that could be gleaned from that, and so on and so forth. The last thing to say, again to sort of proselytize you to my view of the world a little bit, is that there are two other questions that I think KOMP should consider answering. Then I'll shut up, I promise. The first one is the species-specific activity. This, we've got to face it, is a major bias, both positive and negative, depending on what our starting point is. And again, I think you have the setup and the tools to actually answer this question in an absolutely comprehensive fashion. We should be making some of these simple double mutants and asking the question. The second thing is a little footnote that somebody said early on. Remember, seven percent of the individuals who undergo clinical exome sequencing at the moment are two-fers, but we do not truly understand whether these are two independent events, or whether they are additive, or whether they are multiplicative. I think these are areas of, I'm going to call it clinical biology, that we really need to understand. So these perhaps are some of the big questions that you might want to consider anchoring your next activity on, as opposed to more in which I live in our house.

Yeah, I mean, we're sort of talking about finishing, because we're halfway through the null alleles on a single background, and it's very doable and the end point is in sight. And then there's the opportunity, since we have this platform, to pursue other angles. But I think you're right. We're going to take a clean-slate look at things. We're going to look at alleles, variants,
complex genetic interactions. We have to consider all the model systems, all the experimental systems, when we look at that. But I think KOMP, or, you know, mega mosque, is what I'm thinking.

But could I ask Nico a question? Yeah. Just, Nico, many missense mutations, in fact I would say the vast majority of missense mutations, if they are deleterious, have their negative effect by disrupting protein folding. The zebrafish, of course, lives at a different temperature than mammals. So do alleles that disrupt protein folding in mammals also disrupt protein folding, and to a similar extent, when introduced into the zebrafish?

The only answer I can give you in that context is that when we look at specificity data, as in, we take missense alleles that we know are detrimental to function in humans and we test them in the fish, we see the same level of deleteriousness. So I guess I'm not answering the question fully, but all I can tell you is that if they're acting as deleterious in a human because of folding, they're still acting as deleterious in that system, but I do not know whether it's because of folding or not. But the false positive rate is essentially zero.

So, a question for the clinical folks here, for the human phenotyping: where are you seeing the best place for that data to go, so that we could be a fast follower, if you will, with the mouse data? Instead of us looking for our own database for you to go to, where do you think the collective knowledge should reside? Is it the one that you use now? Is it OMIM? Is it something else? I mean, where should we be looking?

Yeah, all right. So I have kind of a very general question, which is, you know, phenotyping is the bottleneck in terms of the imbalance, right, genotype versus phenotype: with whole genome sequencing we have much less phenotypic information. Second, we have a problem matching our model phenotype with the human phenotype. Shouldn't this bottleneck be addressed
by now turning to molecular phenotypes as a possible solution? And, well, of course you guess my answer, right? My answer is yes. And I'll suggest two. One is the epigenome, which may give us a clue about effects of regulatory variants in terms of inducing allelic imbalances in the same way in the model and in the human. This is a kind of ideal control experiment, where we have a locus in a heterozygous state within the same nuclear environment, right, in the same cell, and we can observe whether we can recapitulate, say, effects of variants on changing local methylation in cis, as a signature of transcription factor binding effects and such. The other molecular phenotype may be the metabolome, right, because it's the first one that has entered the clinic; it's now in wide use by major projects. So if you could match metabolic perturbations, multi-metabolite perturbations, in humans to those in mice, we may actually arrive at a conclusion that some of the phenotypes that are not recapitulated sufficiently at the whole organism level may in fact be recapitulated at the molecular phenotype level, and in that way validate our models and also gain insights into the mechanisms, you know, the molecules that mediate the effect on the ultimate phenotype. This is very important also for complex diseases that are quite heterogeneous at the organism phenotype level, and may allow us also, as a side effect, to dissect these more complex phenotypes.

Yeah, I mean, I think if we had the money and technical support to be able to do RNA-seq on our knockouts, it would be fabulous. I mean, to link to the LINCS program; it's just a fungible digital Bitcoin that can take you a lot of different places. I think cost is an issue.

But there are now, as I say, metabolomics and maybe targeted epigenomics, which actually are becoming quite affordable, actually, one can argue, cheaper than complex phenotyping of humans.

So, first of all, let me say that I love the idea about using the metabolomics profile as a broad
and sort of one-stop biochemical data for each patient. There are published data showing that the closer you get to the gene, the more penetrant variants are and the easier it is to see the genetic effects, so that would be all to the good. There's a lot of biochemical data already out there, but it's not organized into one single test, so you have to pull it together from all different places, and that takes time and money and so forth. The thing about phenotyping is that it's important, in my view (this is just a personal bias), to do it in a uniform way, in a way that's searchable and recallable, and that way is not the medical chart as the medical chart currently exists. Epic is the electronic patient record; it's currently the number one in the country, and it's completely opaque to many users. But if there was some way to go in and interrogate what's in the Epic chart and put it into a very straightforward accounting of the phenotypic features, that would be extremely useful. In PhenoDB we record the phenotypic information in a very standardized way. It's very quick to score. It means that we score every patient using the same clear set of data queries, and it's a very robust way of, not evaluating, but scoring all the phenotypic data. But we have not built a tool that will, let's say, go to Epic and extract those data out of Epic.

Those data are coded, though, right, as opposed to a DB term?

Just to clarify, I was not actually referring to phenotypic or EHR data. I was referring to actual measurements that can be done comprehensively on hundreds of metabolites and so on. That may be a complement to what you're saying, but it's actually a different approach.

Yeah, pulling data out of Epic is probably illegal anyway. The only comment I have about the molecular
genotyping is that finding a gene, or doing a molecular analysis, is going to be effective only if your phenotyping is good, because of the unknowns in the genome.

Let me just add that one additional advantage of molecular phenotyping is that the phenotypic information can be exchanged freely without danger of re-identification. So you can publish all the phenotypes and have a matchmaker on molecular phenotypes without any concerns, because a metabolic profile is not identifiable. Now, as for biochemical perturbations, a biochemical testing lab will tell you that they can read these metabolomic perturbations and find all kinds of things about nutrition, disease, and so on. So I think it depends maybe on the domain; if you're too far away from biochemistry, then this may be unfamiliar territory.

I know that Baylor is probably moving fastest with metabolomics, and I see Art behind you there. Art, are you using artificial intelligence to score the metabolomic profile? That seems to me to be a perfect opportunity.

Not to my knowledge.

Well, I should disclose I do have a research project with Baylor Genetics where we're doing exactly that, you know, detecting perturbations at the multi-metabolite level.

So it's five o'clock and some people have planes to catch, so if you need to leave, feel free to.

I just had a quick comment about the use of electronic medical records. One of the mechanisms that has allowed Stephen Kingsmore to have these 56-hour turnaround times in some cases for his rapid whole genome sequencing of ill newborns in the NICU has been an approach that uses natural language processing and artificial intelligence to extract meaningful data from the medical record. It's not just looking at HPO terminology; it's actually a fairly complex algorithm, which I know I couldn't begin to explain to you, but it has been shown to have some remarkable efficacy for actually being able to home in on the relevant genetic variants that
might explain this newborn's medical problems and phenotype. So I think there is some emerging work in this field, and I'm sure there are others who are doing similar sorts of approaches, because the EMR systems are, as you say, opaque, heterogeneous, and bulky to work with, so we have some other strategies we think we have to use.

So I have a question on Common Fund synergy. Is there an effort to combine the data sets from the different Common Fund efforts? One advantage of that is, for all the money invested by the Common Fund over these years, at least it's protecting and integrating those combined efforts. A lot of them are related: there's rare, there's personalized, there's IMPC and KOMP. Is there a Common Fund idea on how to preserve and retain these collective efforts for durable impact?

Well, do we have anybody from the Data Commons here today? I don't think so. We might have a PI; is anybody involved in the Data Commons? We do have an effort within the Common Fund called the Data Commons, and it has a new name now which I'm not even going to remember, but it is basically to provide a place for not only the Common Fund data sets but other data sets, to allow integrable searches across them, and also for them to be used on the cloud. My understanding is that the first three sort of pilot data sets are the model organism databases, TOPMed from NHLBI, and GTEx (thank you), which actually used to be one of my programs. Kids First is also sort of latching on to that, and my understanding is that UDN again is a little bit further down in the pipeline. The microbiome project is also poised to go in. So it's something we've been trying to address for a number of years, and I think now we're just actually starting to make progress in doing it.

Because we've made a recommendation, at least the advisory board did, that the IMPC and KOMP effort have a dedicated working group looking towards that, to
meet your needs while we're still in phase two, because we want this effort, whether we are at 8,000 or 20,000 genes, to protect the data sets, to be able to integrate, and to have impact for all the Common Fund efforts for 20 years.

Right, and I guess one way that I'm thinking about it is, if you're already working with Kids First and UDN and some of the other programs, you can all sort of go together and at least have some things and some strategies in common. The other thing to say is that our office is hiring a full-time person to actually work at reaching out to the programs that aren't in the first three data sets. We haven't hired that person yet, but it should be happening within the next few months.

While we're on this topic, I think everybody in the room recognizes that ExAC and gnomAD have been a tremendous resource for interpreting genetic data, and we need a similar resource for whole genome data. We have really quite puny whole genome data. We heard today, I think, 125,000 genomes in TOPMed. Are those accessible in an ExAC-like, gnomAD-like database?

Sorry, thank you for reminding me to use the mic. The data itself is available through dbGaP for 55,000 of those. The rest are accessible within the program, and the Informatics Research Center has done joint calling, but I don't know just how much of an answer I can give you today as to how...

I was going to say, Gonçalo Abecasis's group is creating a BRAVO server, and the BRAVO server really is equivalent to gnomAD in terms of being a variant-level server for that whole genome data. So in addition to the things you were just describing, that is one of the efforts that's coming out of TOPMed.

And do we have a timeline on it?

So BRAVO is publicly available now. You can just search "BRAVO TOPMed"; you just need a Gmail login, and you can log in and search variants. There's a lot of really great data in there. I'm not really recalling it
off the top of my head, but I would encourage you to explore that. It's turned out to be a little more complicated than we would like, but there is also an imputation server with all of the TOPMed data. There are a lot of policy issues with sharing that publicly, but we're working to hopefully get that made publicly available within the next year or so. And I'll also mention that the Data Commons has a portion of it dedicated to TOPMed that we actually call DataSTAGE, and that is meant to provide both cloud resources and storage but also tools and analytics publicly on the cloud, in a way that would make this sort of thing more available to researchers that don't have the really heavy infrastructure that you would need to do some of this really high-throughput whole genome analysis. It's coming.

Yeah, so concerning data integration, at least in Europe, for the IMPC data there are two possibilities that I'm aware of. One of them is to have the IMPC database as one of the core data resources for the bioinformatics research infrastructure called ELIXIR. The other one is that there is a project called the European Open Science Cloud Life (EOSC-Life), which is a collaboration of different research infrastructures in the life sciences, and it would be possible to nominate IMPC for incorporation into this project, to preserve the data there.

Sure, no problem, I can do that.

Well, I think we've had a full day, so one last question, going back to data sets. As a clinician, when you try to find data, do you go to OMIM, or what's your gold standard? Because if we want to take a look at the ontologies and how that's being staged, then that would help us go to where you like to go.

I think any practicing clinical geneticist has a computer open to OMIM in the clinic constantly. They cannot really do their work very well without that. Also, any practicing clinical geneticist is reasonably facile with ExAC and gnomAD, because
they get a report on the sequence from somebody. They don't look at the first three pages; they go to the bottom line, and there are some variants listed there, and they want to know how frequent those variants are, and they know they can get those data from gnomAD or ExAC immediately. For the most part the residents, the fellows, and the clinicians know how to use those databases without any problem.

So for Common Fund data sets, if we looked at those and tried to break down our mountain to do that now, for a five-year plan, at least we're combining some of the key foundations. So if we consolidate on one side in the translational model, and there's consolidation on the human end, then that may be a smarter way to go than all the bits and pieces trying to be formed in separate spheres.

Right, and the other site that's open in the clinic all the time is PubMed, because in a typical genetics clinic you never know what's going to walk in the door. A practicing geneticist is someone who is comfortable taking on a patient when they have never seen a patient with that problem before in their life. So in the minute or two before they go in to see the patient, or the day before when they learn that that's what is coming, they're doing a quick study to see what's known about that.

Okay, so that's where they go for their deep dive, and then specifically to OMIM for some of the known phenotypes or associations?

Yeah, yeah. And the other thing is gene report... what is it? GeneReviews, right.

Excuse me, if I can just interrupt: I didn't hear you say anything about whether the clinician uses ClinGen. Is that a useful resource?

They do, they do. I'm dating myself a little bit here.

Okay, yeah, because that's another potential interaction: to work with that consortium and find out what functional data they need to complete a review.

Right. Yeah, I only see follow-ups now, so
all right.

So what I'm hearing is kind of recommendations about how to build these systems and so on. I would note that what the clinicians are experienced with is interfaces, the web interfaces. They're enormously important, actually, but there is a different layer of design underneath, using APIs, linked data technologies, and such. So whenever recommendations are made, they should take into account both the technical perspective and the user perspective. If we go off balance, we'll be building silos; we'll be using approaches that are 50 years old, from the era of, you know, airline reservation systems. But times are changing rapidly on the web, and so we should be thinking about how to employ the latest technologies so that the layer underneath actually scales to the diversity and volume of data and accommodates a multiplicity of contributors who contribute their own pieces of the puzzle. And I think FAIRness, you know, FAIR is a new set of principles that's been promulgated by the NIH, and it's been enormously useful as a way of bringing up the technical aspect of the actual implementation of systems. FAIRness actually talks about that layer underneath the user interface, and says you need to build not just a silo but an interoperable system, which means it needs to talk to other systems, to computer programs, which can have their own interfaces, so-called APIs, so you can share your data through that. And findability is actually the key one, which means you need to use certain identifiers, you need to have metadata about identifiers, and you need to expose it for search by search engines such as, you know, the bioCADDIE DataMed index, built under NIH, for example, or FAIRsharing. But now things are really changing with Google Dataset Search. Why? Because if you have followed these principles in the past few years, you're ready for Google Dataset Search. Why? Because they're asking for the same thing, and all of a sudden this ecosystem is growing. So what
I'm saying is we should be careful about the actual implementation and take into account these trends, and so build interoperable systems that interoperate not only with each other and within the small biomedical research world (and I'm saying small because I have Dataset Search in mind, where there's all kinds of additional information) but also with these broader trends, which will actually dictate technologies going forward.

Yeah, I mean, the IMPC data is on OMIM, but it's buried in a sidebar in a pull-down menu where you get a link out, and that's not effective. What should be there is a synopsis of our findings, boom, right in front of you.

Speaking of that, this design principle makes it possible for multiple parties to build user interfaces on top of the same underlying infrastructure. So the benefit for the end user is that there could be multiple interfaces catering to different audiences, so that their needs are maximally met, rather than a single portal or single interface, because it's very hard to meet the needs of all users of OMIM and so on. So this design will actually, in the end, benefit all the audiences as well.

I have minimal influence on OMIM, but some, and I will do my best.

Okay, on that note, Dave's going to solve all our problems for us. We're good, we can leave, and we have a volunteer. I don't want to take up any more of your time; we're over time. I think we've had a very full day. We've got a lot of work we're ready to do this coming year. Supplements are in place, plans are in place, excitement is in place. We'll meet again here next year and hear about a lot of progress that's been accomplished. So thanks for your time.