So, you may hear some common themes across certain talks here; we all hear the same pain and feel the same pain. My perspective on this comes from running a clinical lab for about the last ten years. We do genetic testing for various diseases, rare disease and somatic cancer, among other things, and most of our testing has been Sanger sequencing and now next-gen sequencing. We discover lots of novel variants through that testing, so we have a fairly robust process to evaluate the variants that we have to write into clinical reports. Before we got into the full swing of next-gen sequencing, we were doing about 300 novel variant assessments a month, and to date we have curated 25,000 variants that have been put into clinical reports. So we do a lot of this, and I have to hire a lot of staff to do it. In fact, if you come within a three-foot radius of my lab, you get sucked into the vortex of novel variant assessment, and all the rotating residents and fellows do this. We even time how long it takes them to evaluate a variant, because we have to staff for this in a very robust way and hit our turnaround times for clinical testing. It takes about 22 minutes to evaluate a variant that's never been reported or seen before; that's just a matter of searching every database we use and doing an in silico assessment. If there's population data, it takes about 25 minutes, and if there are publications on the variant, it takes on average two hours, depending obviously on how many papers.

That's the first-line process, which mainly our PhD molecular fellows go through. Then our genetic counselors draft a report with the variant information in it and put it into a clinical context for the patient. Our geneticists review the data and sign out the case, and for somatic cancer, pathologists add annotations as well. So that's our process: every variant that goes through the clinical pipeline first gets annotated with respect to all the data we can find on that variant. We then bring in patient clinical information and the custom knowledge that we house in terms of our own experience with the genes. And then we classify every single variant we report into one of five categories; clinical labs are settling on these five categories, with somewhat similar terminologies, and writing those into clinical reports. We then often try to get family members back in for segregation analyses to feed back into the variant classification process, because a lot of these are VUSs, or they're likely this or likely that but not definitive, and we try to solidify that with linkage analysis on an ongoing basis.

So that's our process, and we use a lot of tools to do it. Over the last five years we have been iterating on a tool to capture all of the knowledge that we bring in and use it in the context of the variant spectrum we already know about for each gene. That's automatically drafted from our system, what we've seen to date, alongside the technical data. We keep track of data on every gene, so we know what spectrum of mutations has been seen in that gene. There are certain genes where loss-of-function mutations are a known mechanism of disease, like MYBPC3 and LAMP2, which are almost entirely loss of function. But in most of the other sarcomere genes for this disease, hypertrophic cardiomyopathy, loss-of-function variants don't cause disease.
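To make that gene-level point concrete, here is a minimal sketch, in Python, of the kind of structured knowledge this implies: the five-tier classification scheme and a per-gene record of whether loss of function is an established disease mechanism. The names and table entries are illustrative, not our actual system.

```python
from enum import Enum

# The five-tier scheme clinical labs are settling on; exact wording
# varies somewhat from lab to lab.
class Classification(Enum):
    PATHOGENIC = "pathogenic"
    LIKELY_PATHOGENIC = "likely pathogenic"
    VUS = "variant of unknown significance"
    LIKELY_BENIGN = "likely benign"
    BENIGN = "benign"

# Gene-level mechanism knowledge: is loss of function an established
# disease mechanism for this gene? Entries are illustrative.
LOF_IS_KNOWN_MECHANISM = {
    "MYBPC3": True,   # loss of function is a known mechanism
    "LAMP2": True,    # almost entirely loss of function
    # In most other sarcomere genes, HCM variants act through gain of
    # function / dominant negative effects, so a truncating variant is
    # NOT presumed pathogenic there.
    "MYH7": False,
}

def lof_supports_pathogenicity(gene: str) -> bool:
    """A truncating variant counts as evidence of pathogenicity only
    if loss of function is a known mechanism in this gene; default to
    False for genes with no curated knowledge."""
    return LOF_IS_KNOWN_MECHANISM.get(gene, False)
```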
And when we find loss-of-function variants in those genes, we actually don't say they're pathogenic, because in this particular disease the variants act through gain-of-function or dominant-negative mechanisms. You can't just say that because it's loss of function, it's causing disease. So maintaining gene-level knowledge is obviously very important, and you develop that over time. For some genes you can develop domain-specific knowledge, where pathogenic variants fall in very specific domains, and that can help guide interpretation, but you have to maintain that in some data source to retain that knowledge at the gene-specific level.

We have a huge grid, and you can only see the very top corner of it here: hundreds of genes, thousands now, go down one side, and hundreds of databases go across the top. Every gene has a different set of databases that we have to search, and the fellows track that they've searched each of those sources for every variant they find; there's a hyperlink to each database, and we check thousands of databases every day. We don't simply trust any of those sources; we collect the raw data, the numbers of cases and controls, and we fill out tables of probands and controls, segregations, de novo occurrences, and co-occurrence with other variants, interpreted one way for recessive disease and differently for dominant, and we track all of this data to compare cases and controls. We do all the in silico analysis, albeit it's often not that useful; homology is probably, in my mind, one of the more useful things, and capturing that visually is important, along with in silico assessment of splicing. And the story goes on: now that we've gotten into whole genome and exome sequencing and we're finding genes we've never even heard of, we have to do a novel gene assessment: where is it expressed, what pathways is it in, what's known about its disease associations, what's been found in model organisms, and any other literature on that gene that puts it into some biological and clinical context as we evaluate a potential variant.

And then, at the end, for each variant we actually write up a descriptive, evidence-based summary of that data. Our fellows first pick a category and write up their evidence for it; then the geneticist comes in, says where they disagree, and rewrites it. We send that feedback to the fellow via a hyperlink, and they do it better the next time. It's actually a very robust, ongoing learning process in the lab. One of the residents was so struck by the amount of data that he took a picture of his desk, with four different computer screens tracking all of it. That's what we do all day, every day.

And you have to do it well each time. Here's data from testing 3,000 cases for hypertrophic cardiomyopathy: two-thirds of the variants we see, we only ever see once. So you can't just wait around for data to be published and for someone else to figure it out. This is often your only shot; we have to figure it out, and we have to recommend the right segregation or other analyses to do so. For hearing loss, which is a recessive disease, it's of course worse: 81% of the variants that we see are unique to date.
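As a rough sketch of what one row of those case/control evidence tables might look like as a data structure (the field names are mine, not an actual lab schema):

```python
from dataclasses import dataclass, field

@dataclass
class VariantEvidence:
    """One row of an evidence table like the ones described above.
    Field names and defaults are illustrative."""
    gene: str
    variant: str
    probands: int = 0              # affected cases carrying the variant
    control_chromosomes: int = 0   # control chromosomes screened
    control_carriers: int = 0      # controls carrying the variant
    segregations: int = 0          # informative affected meioses
    de_novo: int = 0               # confirmed de novo occurrences
    co_occurring: list[str] = field(default_factory=list)   # variants found with it
    sources_checked: set[str] = field(default_factory=set)  # databases searched

    def control_frequency(self) -> float:
        """Raw frequency in control chromosomes; NaN if none screened."""
        if self.control_chromosomes == 0:
            return float("nan")
        return self.control_carriers / self.control_chromosomes
```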
So this is just a day in the life of the clinical lab, trying to put these variants into some context, even though most variants, as we all probably recognize, don't have enough data to be definitive. One of the single most common variants for hearing loss in children, M34T, is in the Human Gene Mutation Database erroneously as a dominant variant, because the first publication was wrong. There have subsequently been over 500 publications on this variant, and it's still wrong, even though the issue has been pointed out. So this is a challenge: the literature doesn't change, as we just heard in that conversation, and we need to figure out ways to really update it.

This is now playing out in genomes. Every time any of us runs a genome through and aligns it against HGMD, we see the same thing: our first case had 42 "pathogenic" variants, and when we evaluated them through our clinical lab process, none met criteria. That's now the standard experience: in most cases, most of those variants don't meet criteria for pathogenicity. This is going to be a common theme for all of us, so we really need, as a community and as a group here, to come up with what those standards are. And, as Mark sort of pointed out, sometimes the paper doesn't specifically say a variant is pathogenic, but there's an implication when you list it in a table of "here are the variants I found in these disease patients," and then HGMD enters them as pathogenic, because that's what the paper seemed to say. It's not that the authors really said that, but there's this implication, and I think it's incumbent upon us all to very specifically state and list our interpretations individually, to say these are variants of unknown significance, rather than just sticking them in a table and letting somebody else make that judgment. I think that's been one of the problems in our field: we haven't taken a stance on a variant, really individually evaluated it, and written down our thoughts about it.

And what's happening now is that clinical decisions are being made on these data: $20,000 cochlear implant decisions for patients with predicted Usher syndrome. Some of you may have seen the news on Monday about breast cancer patients who are either not having prophylactic mastectomies when they should, because of errors, or having them when they shouldn't. This is happening in real time. This is a very common story in my lab: we have dominantly inherited cardiomyopathy, we test a proband, we find a variant, and then that information is used to decide who in the family gets monitored and who gets an intracardiac device. In this family right here, this child died at age seven due to hypertrophic cardiomyopathy. They put a device in this child because they tested positive for the variant; of course, that assumes we knew what that variant meant. And this side of the family is not being monitored because they're negative. So we have to get it right, or people die because of the decisions we're making.

And here's a variant from a very well-respected locus-specific database, a leucine-to-valine change at codon 26. It's published in four different papers as pathogenic, it's been seen in 10 probands, and it's absent from 832 race-matched control chromosomes.
So most of us would say, well, that's not conclusive evidence, but it's pretty reasonable evidence in a clinical setting, and most physicians would assume this variant is pathogenic. But over the last few years, we've accumulated a lot more data. This long, texty slide is not meant to be read; it's meant to illustrate the point that there is not a single item of data in this long story that on its own is sufficient to say something definitive. But if you really go through all of the arguments, about ten different reasons, we actually think the original interpretation is wrong, and that this variant is likely benign. Through a structured set of data that just gets dumped someplace, you'd never come to this conclusion from any one of those pieces of evidence.

My argument here, and others have talked about this, is that we really need to document our logic and why we came to a conclusion, so that the next person who comes along and has to evaluate the same body of evidence can say, yes, I agree with that argument; or no, I don't agree with that particular argument; or, I now have additional data that was never mentioned, and I can add to or refute the story. If you can't see somebody's argument, and the reasoning behind a conclusion is rarely put into a paper, it's really hard for the next researcher who comes along to add their own contribution. So we document all this in text. It's not computer-friendly, but those stories are what record why we think what we think.

I've also been tracking how often we change categories over the years. Over a five-year period, looking back at our HCM data, we had made 300 category changes based on new evidence accruing on different variants. Over half of those we would consider changes to clinical management. So this stuff is changing all the time, with more evidence to either support or, in many cases, refute what was published before, and we need ways to get that data out there, because it's often not sufficient for a publication. We have developed a clinical system to deliver that information to physicians, and it works quite effectively: as soon as I make a change in my database, it's propagated electronically to the physician, who gets an email alert that they have a patient with that variant change, and can go in and see the text-based description of what the evidence was. But that's only happening within my closed clinical environment. It's not yet getting out into the public domain, and we need a place to put this data in public so researchers and anyone else can get access to it.

Now, as I said, the system stores each change, and the main argument, one of many I was making, for why that earlier variant is likely benign is that it's present in 1.5% of the Chinese population. But as we all know, the control samples in population studies can have disease too; we just can't document it. Here's a variant we reported as benign for many years: it's in 4% to 8% of South Asians. But when a case-control study was done, there was an odds ratio of seven showing that it conferred risk for heart failure, and in the homozygous state it's actually quite severe. And then we got a case with this variant, and we changed our classification.
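That frequency argument can be made semi-quantitative. Here is a back-of-the-envelope sketch, with made-up thresholds of my own: a variant can't be a highly penetrant dominant cause of disease if it's more common in the population than the disease itself allows, after accounting for penetrance and for how much of the disease any single variant could plausibly explain.

```python
def max_credible_allele_freq(disease_prevalence: float,
                             max_fraction_of_cases: float,
                             penetrance: float) -> float:
    """Rough upper bound on the population allele frequency of a
    credible dominant pathogenic variant: carriers can account for at
    most `max_fraction_of_cases` of all cases, and only a fraction
    `penetrance` of carriers are affected. Back-of-the-envelope only,
    not a formal statistical method."""
    carrier_freq = disease_prevalence * max_fraction_of_cases / penetrance
    return carrier_freq / 2  # two chromosomes per carrier

# Illustrative numbers: HCM prevalence ~1/500; suppose no single
# variant accounts for more than 10% of cases; assume 50% penetrance.
bound = max_credible_allele_freq(1 / 500, 0.10, 0.50)
print(f"max credible allele frequency ~ {bound:.1e}")  # ~2.0e-04

# A variant carried by 1.5% of a population (allele frequency ~0.0075)
# sits far above this bound -- strong evidence against it being a
# highly penetrant dominant cause, taken with the other arguments.
```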
We amended all the reports, across many patients, to say that this was a pathogenic, albeit mild, variant. And then we had a case where a proband was homozygous, very similar to what was later published. We assumed that this child, who died of cardiomyopathy, had the disease due to this variant. The parents were unaffected, but they were heterozygotes, and the variant is only mildly penetrant in adults. They did prenatal testing, which was negative; the fetus was not even a carrier. And they gave birth to a severely affected child with cardiomyopathy. So clearly that wasn't the only variant, and it certainly wasn't the primary one, and we still don't know what's going on in this family; I hope to figure it out at some point. I'll stop there. Those are just some clinical anecdotes about how we do this work. Thank you very much.

So, comments, questions for Heidi?

Is it possible to imagine, in any clinical setting, doing some biology on variants you haven't seen before? Because things like predicting splicing potential are well known to be a pretty poor guide to what happens when you actually put the mutation into a cell and ask what it does. In areas where you're going to keep seeing variants you haven't seen before, that's obviously a really important source of information. Can you imagine that being part of the clinical program?

Yeah, it's a great question. At some point there was actually a company that had developed a service to assess splice variants, or maybe they contacted me to ask whether we would use such a service, and I said, yes, I probably would. We've tried to do this on a few variants through collaborations with our research colleagues, and I did it in one particular family where we had a variant I was highly suspicious of. You can't do it on the original sample; you've got to get a new sample, because you've already made DNA out of your sample and you need RNA. We tried that once and actually succeeded, but it was an incredible amount of work and just not something that was easy to put into a clinical workflow. So I think you really need a close collaborating research lab that can do that kind of thing.

I wanted to make a quick statistical argument, because frequently, reading papers about known disease genes, you see that a variant has not been observed in 500, 600, 700 controls. The mindset there is very different from a research project. If you have a very good candidate gene and a single individual, and the variant is absent in several thousand controls, that's one story. But in genes like BRCA1 and BRCA2, and possibly the cardiomyopathy genes, the number of sequenced cases vastly exceeds the number of controls available in databases. For BRCA1 and BRCA2, I believe maybe 60,000 to 70,000 people have been sequenced. So it is very likely that neutral variation will be seen in cases in large numbers yet never observed in control databases. That argument is not valid in the clinical setting, even though it may be somewhat supportive in the research setting.

And I think that's why we have to do segregation studies when we see these variants, because in many cases that's really our only shot at getting anything useful. But even when you see segregation, the variant could just be in LD with another variant. So you have to be careful; it's not 100% proof.
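To put that statistical point in numbers, here is a small illustration of my own, with made-up frequencies: the probability that a rare neutral variant is absent from a control set of the sizes quoted in those papers is close to one, so "absent from controls" barely distinguishes a rare neutral variant from a pathogenic one.

```python
def prob_absent_from_controls(allele_freq: float, n_controls: int) -> float:
    """Probability that a variant at the given population allele
    frequency is seen in none of n_controls individuals (binomial,
    assuming random mating; a rough sketch)."""
    return (1.0 - allele_freq) ** (2 * n_controls)

# A neutral variant at allele frequency 1 in 100,000:
for n in (500, 700, 5000):
    p = prob_absent_from_controls(1e-5, n)
    print(f"{n:>5} controls: P(never observed) = {p:.3f}")
# -> 0.990, 0.986, 0.905: even a few thousand controls will almost
#    always miss it, so absence from controls is weak evidence.
```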
And also, some of these variants that are in these databases did come from large families, but they weren't the actual causal variant; they were obviously in LD with the causal variant.

Yeah, that's absolutely true. Although in that individual family, whether it's the real variant or not, you often can at least use it as a marker, if you believe the phenotype and the locus. But you're right, you can't really say that that variant is the causal one.

I just want to amplify that point, which I think is a great one. Single families are never, or at least often not, going to be large enough for the segregation to be conclusive, as Suzanne points out. I think the only way we get past that is if we as a community commit to developing a community variation resource of 100,000 exomes or genomes, or maybe even larger numbers. That begins to get us to a point where we can again make use of statistical arguments in a more compelling way and get past that particular bias.

I'm struck by something that both Mark and Suzanne said. If I remember correctly, Mark, you said we can't make mistakes when you were talking about these problems, and Suzanne, you just said it's not 100% proof. I think we need a little bit of a reality test here, especially with respect to what Heidi is talking about. If we are anywhere near thinking that that's the standard that needs to be operative for clinical decision making, we need to shut down most to all of clinical medicine as it's practiced today, because almost all of the evidence we work with clinically is less than 100% solid, and clinicians use imperfect data every day to make decisions. We don't want to block their access to data just because it's imperfect or might lead to a mistake. I'm not saying we should simply lower the standards because everything's okay; I'm totally in agreement with your presentation. But we do have to remember that good data are better than weak data, and weak data are better than no data, so long as the person using the data understands its limitations and can put it into the context of what they do know about the patient, the full picture Heidi was just talking about.

The last point you made is a very important one, what you do know about the patient, and I think there are very different standards that need to apply. First of all, I expect you didn't actually take my statement, with some linguistic license, that we can't make mistakes, or rather that we can't afford to make mistakes, as suggesting we should shut down all clinical activities until that point in time when we will make absolutely no mistakes. That would be a particularly foolish statement.
I think there is a very different standard in two instances of clinical relevance, and one of them, where it's much more problematic to draw inferences, is making decisions for children, or reproductive decisions, on the basis of evidence where we do not have any clinical presentation whatsoever. That is a very, very different situation from interpreting data from patients who do present with significant aberrant phenotypes. I think it's right that we interpret those very differently, and we probably need different standards for them, without question.

So, last comment.

Well, I agree with Les's comment. The public has the idea, rightly or wrongly, that genetic data are more precise than other clinical data. I would underscore that the person who transmits the information must understand the reliability of what they are transmitting, and there are not very many people out there who understand that. And lastly, having gone through this last week with a family you might have judged fairly unsophisticated: after I spent an hour talking to the family and explaining the uncertainties, the father said to me, "Doc, can't you do better than this?"

Indeed. Last, last short comment.

Just a comment on Les's point. While I agree with the sentiment, I actually think the real problem is that in genetics there are a lot of claims out there that don't even distinguish the variant in question from a randomly selected variant, and that's what we need to worry about. You can see that really clearly if you take the totality of the schizophrenia associations from before the GWAS era, as Mark did with one of them, and put them all into the GWAS studies: I think they all evaporated. That entire body of literature didn't distinguish those variants from randomly selected ones. I think it would be fair to say, translating Mark's linguistic license, that we cannot afford to have that kind of nonsense in the literature. That's a fair statement.