Thank you for having me. I was asked to give some of the laboratory perspective. My background is as a board-certified clinical laboratory geneticist, and I direct the Laboratory for Molecular Medicine at the Partners Center for Personalized Genetic Medicine, under Scott Weiss's leadership. We are in the process of developing a genomic sequencing clinical service through our CLIA lab. We're working closely with our genetics clinics, and with Mike Murray from the Brigham, who is here, on the upfront consent and return-of-results environment. Our main focus, in terms of our expertise and strengths, is the data analysis and interpretation piece; initially we will outsource the technical component of whole genome sequencing, although we are also developing exome sequencing capability in our clinical lab. So most of our effort has surrounded the enormous computational challenges of variant annotation and filtration. Although we've done evidence-based variant assessment for ten years in the context of clinical sequencing, scaling that to the genome is of course another order of magnitude. We're also using our GeneInsight software, which we've developed over the last eight years to support clinical reporting of sequence data, and expanding it to support genomic interpretation. We are working on a U01-funded grant called MedSeq, with Robert Green as PI, through which we will sequence 100 whole genomes, half from patients with cardiomyopathy and half from healthy patients. Through that effort we are developing a general genome report, with general information for every patient, as well as disease-specific reports addressing the primary indication, using orthogonal confirmation of those results in the CLIA setting, and working closely on integrating this infrastructure into our EHR.

I'll talk about a number of these pieces, and I really want to broaden this beyond the things we're doing to issues that are common across many groups working in this space. I've put my punch list of key challenges in the clinical implementation of whole exome and genome sequencing on this slide. I think we're all aware that sequencing technologies are changing very rapidly, so picking a time point to implement a given technology is challenging; that was one of the reasons we chose to initially outsource that component, work on the interpretive piece, which is really the most challenging part, and bring the technology in later, at whatever point it had evolved to. The computational requirements are unprecedented in terms of the large data sets we're dealing with. In clinical implementation, it's not yet acceptable to use the data as is; there is still a need to confirm results, and unfortunately they can't all be confirmed by one method, depending on whether you're talking about point mutations, copy number variants, low-level variants, or somatic variation. So we need different approaches to the confirmation process, hoping that at some point we won't have to do this, but it's still necessary now.
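To give a flavor of the annotation and filtration piece mentioned above, here is a minimal sketch of a single frequency-based filtering pass. This is illustrative only, not our pipeline: it assumes a VCF whose INFO column carries a population allele-frequency tag (called "AF" here), and it keeps the rare variants that would merit manual assessment.

```python
MAX_AF = 0.01  # hypothetical cutoff: retain variants below 1% population frequency

def parse_info(info_column):
    """Turn a VCF INFO column ('AF=0.002;DP=180;...') into a dict."""
    fields = {}
    for entry in info_column.split(";"):
        key, _, value = entry.partition("=")
        fields[key] = value
    return fields

def rare_variants(vcf_path, max_af=MAX_AF):
    """Yield data lines from a VCF whose annotated frequency is below max_af."""
    with open(vcf_path) as vcf:
        for line in vcf:
            if line.startswith("#"):
                continue  # header lines
            columns = line.rstrip("\n").split("\t")
            info = parse_info(columns[7])
            af_value = info.get("AF", "")
            # Variants with no frequency annotation are kept for review;
            # multi-allelic records use the first listed frequency.
            af = float(af_value.split(",")[0]) if af_value else 0.0
            if af < max_af:
                yield columns
```

In a real workflow, a pass like this sits alongside quality, inheritance-model, and gene-list filters, which is where the scale of millions of variants per genome makes the computation, and the downstream human review, so demanding.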
As we implement these technologies in the clinical setting, we also need to address the fact that there are existing, very high-quality targeted tests, yet we want to take advantage of the rest of the genome. How do we balance that, given that the genomic approaches don't yet have the level of quality our targeted tests do, and we want both components? I'll talk a little more about that. In the CSER meeting that some of us were in, we spent a lot of time thinking about secondary findings: which are appropriate to return and which are not. That's a big area under discussion. Then there is updating results over time: patients ideally may only get their genome sequenced once, but the knowledge changes. How do we address that? I'll talk about some of the strategies we've worked on. And probably the biggest challenge, for me, is that human variation is enormous and rare, and understanding the phenotypes associated with it will be challenging and will require good structure and data sharing; I'll talk at length about some of the strategies we're thinking about.

I do want to point out that the ACMG came out with a policy statement surrounding the use of genomic sequencing; it's on the website, and I'm not going to go into all of the detail here. I will say that the board has been fairly forward-thinking about embracing this in the clinical arena, not only in diagnostic settings but also in screening, recognizing its use in preconception screening and even in healthy individuals if there's a high threshold for what you return to the patient, though obviously not yet supporting it in a prenatal or first-tier newborn-screening environment. There are a number of recommendations in that four-page statement. One that will be a theme for me is the last statement here: labs should share genomic data through public databases, and I'll talk more about efforts there.

There are two work groups diving deeper into some of these topics. One is a secondary findings work group chaired by Robert Green and Les Biesecker, which has been working to define recommendations for what to return to patients. I'm also chairing, along with Pinar Bayrak-Toydemir, a standards and guidelines work group addressing the laboratory standards for implementing targeted next-gen sequencing, whole exome sequencing, and whole genome sequencing in a CLIA lab environment. We have a draft now, and we're working through the last details before we disseminate it to the community for feedback.

One of the challenges in the clinical lab is that we like to develop, and for CLIA standards we have to develop, clear SOPs so that every test is run the same way. In genomic sequencing, you really don't want to run every test the same way. Each family brings a different assumption about inheritance pattern, different family members available, and different strategies, and that makes this challenging to implement in a clinical lab, where you want very defined workflows. And as I mentioned earlier, the technology still isn't perfect. So how do we get the technology to a level appropriate for clinical use? For genome sequencing, we can perhaps supplement with an exome to add better depth in the exonic regions that are most examined from an interpretation standpoint.
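Whatever the supplementation strategy, the common computational step is identifying which bases of the regions you care about fall short of clinical-grade depth. A minimal sketch, assuming per-base depth input in the three-column chrom/position/depth format that tools like `samtools depth` emit, and a hypothetical 20x threshold; this is not any lab's actual QC code:

```python
MIN_DEPTH = 20  # assumed minimum acceptable read depth per base

def low_coverage_intervals(depth_lines, min_depth=MIN_DEPTH):
    """Collapse consecutive under-covered bases into (chrom, start, end) intervals."""
    intervals = []
    current = None  # [chrom, start, end] of the interval being extended
    for line in depth_lines:
        chrom, pos, depth = line.split()
        pos, depth = int(pos), int(depth)
        if depth < min_depth:
            if current and current[0] == chrom and current[2] == pos - 1:
                current[2] = pos  # extend the open interval
            else:
                if current:
                    intervals.append(tuple(current))
                current = [chrom, pos, pos]
        elif current is not None:
            intervals.append(tuple(current))
            current = None
    if current:
        intervals.append(tuple(current))
    return intervals

# Example: positions 1001-1002 fall below threshold and form one interval.
sample = ["chr11 1000 35", "chr11 1001 12", "chr11 1002 8", "chr11 1003 40"]
print(low_coverage_intervals(sample))  # [('chr11', 1001, 1002)]
```

The resulting intervals become the worklist for supplementation, whether that means deeper capture on another platform or Sanger fill-in.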
And some of us are thinking about supplementing whole exome sequencing with the clinical exome, those genes already known to be associated with disease, to have higher-quality data for those critical regions, or even using multiple technologies, because each has different, platform-specific errors. These are all strategies for thinking about how to most effectively incorporate this into a clinical environment so that the technical quality is appropriate. In our lab, we do a lot of targeted next-gen sequencing, and for every test we run, we fill in every last missed base with Sanger, so that in the end, whatever the gene content of that test is, we have covered it at 100%. Of course, that adds significant labor and cost to the process, as we iteratively fill in with Sanger and also confirm all of the variants by Sanger sequencing. Although it's not reasonable to fill in every last base for a whole exome or genome, many of us working even in disease-targeted contexts recognize that there's critical content that has to be there for every patient, so these targeted strategies are also being implemented in the whole exome and genome approaches that groups are taking. And adding custom design for confirmatory testing is yet another challenge in this workflow that is difficult in the clinical lab, but it's something we're working on for our whole genome approaches.

The other challenge that we have struggled with for many years is assessing the evidence for variants when there's often very little evidence there. We time our fellows, who are the first tier of variant assessment, on how long it takes per variant reported in a clinical context. If there's no data out there at all, it takes on average 20 minutes to search every database and look at in silico data, and up to an average of two hours per variant when there are publications. This is for rare variation, not the level of complexity you'd face evaluating a GWAS-type variant with a large literature. We do about 300 of these a month in our clinical lab, reporting out the clinical significance of rare variation that may be clinically relevant. That data is then reviewed by the counselors as they draft reports, by the geneticists who sign them out, and, for our somatic cancer testing, by pathologists. A lot of labor goes into evaluating the evidence for a variant before it goes on a clinical report.

And here is the challenge; I've been collecting some data on certain tests that we run. This is for hypertrophic cardiomyopathy, one of our gene panels. We've tested 3,000 cases to date and found over 500 clinically significant mutations. Two-thirds of them have been unique to a family. So in many cases, we get one shot at this, and you can't just wait around for the literature to come out on your variant, because often it's never going to be there. This is a huge challenge in the clinical laboratory environment. People say, don't put it in a clinical lab until it's well established; well, sequencing breaks all of those rules, because we find novel variation every day. In hearing loss, it's even worse: 80% of the variants are unique to a family in the data we have to date. So this presents the variant data problem. We do have public data on variants, largely in databases like dbSNP and the ESP cohort that's now available, and that's very useful for general population frequency data.
But it's largely unannotated with respect to clinical relevance. The data that is annotated with respect to pathogenicity usually comes from initial research studies and the locus-specific databases that are out there. Unfortunately, a lot of that data is in error; large enough control sets were not tested, and publications are coming out suggesting that upwards of a quarter of that data is simply wrong in its assumptions. So we really need larger data sets to build an effective understanding of variation. And a lot of the best data in existence today is actually in clinical labs and is not well published or available to all of us in the public domain.

So we have been working to come up with ways to solve this. It's a little awkward to talk about, because it's a grant under review, but I think the principles are common to all of us, so I want to talk about them as things we should all think about, whether through our grants or other approaches, or ideally both. We need community standards for the terms we use to assess variation and the rules for evaluating the evidence for those variants, and we need to bring data together. When this project started, there was a group that thought there should be a clinical-grade variant database separate from the other databases we have. The challenge is that most variants are on a continuum of knowledge that is constantly evolving, and it's very difficult to say one variant is ready for the clinical-grade database while another is still in the research environment. At some level, everything is still in the research environment as we learn more about it. So my hope is that all of this data will live in the same place. We in clinical labs use research databases all the time, and vice versa. The goal is to put it all together: get data out of clinical labs and out of locus-specific efforts into the public domain, along with the uncurated population data coming from large studies, put it in the same place, and then enable expert groups, through evidence-based and consensus models, to arrive at our best assessment of those variants. That then has a better chance of being usable in the clinical environment than what we have access to today. In the project we proposed, we're working closely with NCBI to put this data in the ClinVar database, ensuring that it's in the public domain, and working with other efforts already in this space while trying to expand and consolidate everything into one place.

We are also working closely with the ISCA Consortium, which David Ledbetter initiated several years ago to get copy-number-variation data out of clinical labs and into the public domain. They've been very successful, with over 30,000 cases to date in dbVar, accessible to anyone who wants that information. We've learned a lot from their efforts, and we're now joining forces in a combined grant to expand this to include molecular data. I was concerned about the willingness of clinical labs, even commercial labs, to put this data in the public domain; it's often considered their proprietary data. But I've been pleasantly surprised that many, many labs have agreed to participate in this project.
Only three have so far declined: Myriad, PreventionGenetics, and Medical Neurogenetics. Many of these labs, and you'll see commercial labs here as well, have agreed to participate in this effort and are willing to put this data in. The challenge is that it requires resources to get that data out of their systems, structure it all the same way, and put it in the public domain, and that's something we as a community need to figure out how to support when labs are willing to share.

Another challenge I alluded to earlier is how we update this data over time. Clinical labs face this as an enormous challenge today with sequencing in the clinical context, and there's no billing mechanism for reinterpreting reports; for the few labs that actually do it, it's done essentially free of charge, and that's not a sustainable model. We need to figure out how to do this more efficiently. The American College of Medical Genetics guidelines say we should be doing this, but most labs are not. So how do we update data over time? I've been looking at our own data over a number of years of reporting on many different diseases, in this case hypertrophic cardiomyopathy, and at how we have changed variant categories as we've learned new things. Over a five-year period, we changed categories 300 times across different variants, as new knowledge was acquired, in both directions, going from benign to pathogenic and vice versa. We just published some of this data this past month: about 4% of a physician's reports per year need to be updated. So it's a pretty significant challenge for us.

We ended up developing a clinician-facing interface to our laboratory software where physicians can access their patient reports in electronic form, with the variants structured and connected to our variant database. So if I update a variant, perhaps in the context of signing out a new report, that automatically updates the variant in the physician's system. Today they actually get an email alert, without PHI, with a link that lands them on a page showing what variant information changed. They can see that information, click on the variant, and read the evidence that was the basis for the change. I could approve one variant and a thousand reports could get updated in seconds. That makes the process far more efficient and improves patient care. This activity has been the subject of an NIH Challenge grant, with David Bates as PI, looking at the usability of this system by physicians, particularly physicians who have never had any training in it: how easily can they go in, figure it out, get the updates, and know what happened, grading them on certain tasks and so on. I was a little concerned that, having made updating this efficient, physicians would come back and say, great, now can you go amend that report for me and sign it out again? They largely haven't. They've been satisfied with the system delivering the updated information; they can print it out and put it in the patient chart if they want to, and we haven't really had to amend reports. So that's been a great system for supporting this process and improving patient care.
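The design idea that makes that one-click propagation possible is worth making concrete: reports reference a shared variant record rather than carrying frozen copies of an interpretation, so approving one reclassification immediately identifies every affected report. A minimal sketch with invented identifiers, not GeneInsight's actual implementation:

```python
from collections import defaultdict

class VariantKnowledgeBase:
    """Toy model: one shared classification per variant, referenced by reports."""

    def __init__(self):
        self.classifications = {}                    # variant_id -> current category
        self.reports_by_variant = defaultdict(list)  # variant_id -> report ids

    def link_report(self, report_id, variant_id, category):
        """Register a report that cites a variant at its current category."""
        self.classifications.setdefault(variant_id, category)
        self.reports_by_variant[variant_id].append(report_id)

    def reclassify(self, variant_id, new_category):
        """Approve a new category; return every report now carrying stale data."""
        if self.classifications.get(variant_id) == new_category:
            return []
        self.classifications[variant_id] = new_category
        # In the real workflow, each affected report would trigger a PHI-free
        # alert to its ordering physician, linking to the supporting evidence.
        return list(self.reports_by_variant[variant_id])

kb = VariantKnowledgeBase()
kb.link_report("RPT-001", "VAR-0042", "uncertain significance")
kb.link_report("RPT-002", "VAR-0042", "uncertain significance")
print(kb.reclassify("VAR-0042", "likely pathogenic"))
# ['RPT-001', 'RPT-002'] -> one approval, every affected report identified
```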
Now, as we think about expanding these systems into genomic medicine, where we're dealing with a whole genome, we obviously can't ping the physician every time the knowledge changes for one of those three million variants. There has to be more infrastructure to support who gets what alerts, whether you deliver alerts at all, or whether you allow real-time engagement with the data in clinical decision support paradigms. These are all things we're enabling within the Partners HealthCare EMR environment, and we're discussing how best to do it and with what strategies. We disabled the alerting mechanism in the oncology domain, because tumor genetics evolve and an update years later is not useful; but the oncologists have asked us about using this same environment to deliver clinical-trial notifications, which are of course very common in oncology, so we've been thinking about expanding this infrastructure to support those kinds of activities.

When we originally developed this infrastructure, we built a network hub to let many labs communicate with many healthcare organizations, so that nobody had to build interfaces to every lab, because that's expensive; the architecture allows knowledge sources and labs to share data among themselves while still maintaining their own interpretations. We've talked with Illumina about bringing their CLIA whole genome lab onto this network to robustly share data. We are rolling out the infrastructure to support this this month: labs who agree to share their data can show their variant interpretations at the variant-specific level. You can click on lab X's variant, see what they say about it, how many cases they've had, and the supporting literature, and labs can import other labs' interpretations into their own systems, enabling much richer sharing. We have also enabled case-history sharing. If I have all the cases I've ever seen in my system, I can allow that data to be shared with another lab, and we strip off all the PHI, so that in my own system I see my cases with full PHI, but I see other labs' cases de-identified, with some clinical information and the variants found in each case, across all the data sets being shared in that environment. We hope that will enable a richer understanding, particularly as we address the challenges of interpreting rare variation.

This GeneInsight system is now being integrated into our EHR environment. Right now it's a standalone web-based interface, which lets us easily roll it out to any physician around the world, even one without an EMR, but we feel it's more powerful if we can truly integrate it into the EHR, so it's the same face any physician sees when logging into their patient's electronic health record. By the end of this month it will be integrated into the Partners EHR, as the Partners clinical genetics data repository, as it's being called. The same infrastructure and structured data is being pushed into the Research Patient Data Registry, so that the RPDR we use today for a lot of clinical research will contain structured genetic data, and that obviously supports many millions of patients within the Partners HealthCare environment.
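To illustrate the PHI-stripping step in the case-history sharing described above: a minimal sketch in which only a whitelist of fields leaves the source lab and the internal accession is replaced with an opaque token. Field names and the example case are invented for illustration; this is not the production de-identification logic.

```python
import hashlib

# Hypothetical field names: direct identifiers stay home; a small
# whitelist of clinically useful fields travels across the network.
SHARED_FIELDS = {"case_id", "indication", "sex", "variants"}

def deidentify(case):
    """Return a copy of a case dict with only shareable fields retained."""
    shared = {key: value for key, value in case.items() if key in SHARED_FIELDS}
    # Replace the internal accession with an opaque, stable token. A real
    # system would use a salted hash or a lookup table; it is shortened
    # here only to keep the sketch small.
    token = hashlib.sha256(case["case_id"].encode()).hexdigest()[:12]
    shared["case_id"] = "anon-" + token
    return shared

case = {
    "case_id": "LAB-12345",
    "name": "EXAMPLE, PATIENT",   # PHI: never leaves the source system
    "mrn": "000111222",           # PHI: never leaves the source system
    "indication": "hypertrophic cardiomyopathy",
    "sex": "F",
    "variants": ["GENE1 c.100A>T (uncertain significance)"],
}
print(deidentify(case))  # PHI fields absent; case_id is an opaque token
```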
There's another effort that Scott is leading around a biorepository, consenting the thousands of patients who walk in the door at Partners HealthCare to sign up for the biorepository and consent to broad use of their data and sample. And we would like to engage other groups in thinking about this data-sharing environment for perhaps one of the demonstration pilot projects, to use this infrastructure if labs or groups are interested in getting onto a network and sharing data. We'd be happy to talk with others who might be interested in coming together for one of those projects. That's all; these are some of the groups that have worked on the projects I spoke about, and I'd be happy to entertain any questions. Mark.

I think it's really excellent, the work that's been done to try to aggregate information. I have one comment, and I want to pose a question to some of our payer representatives here. The comment is that one of the benefits touted for whole genome sequencing is that you only need to do it once and the information can then be repurposed. I think you very eloquently stated that there is in fact work involved in that, which could lead to cost and reimbursement issues down the road. So perhaps we shouldn't be quite so forthcoming with the idea that this is a one-time cost and then it's basically free to use the information. I think that's going to generate some very interesting models for how we actually reimburse for the updating and maintenance, but that's probably beyond the scope of a five-minute discussion. The specific question I'd like to pose to our payer representatives is this issue of the labs that won't play. I'm glad to hear that it's a very small number, but clearly for this to work, I think everyone has to be willing to put their data in so we can all benefit. Is there a mechanism on the reimbursement side, setting aside sole-source providers like BRCA where there are patent issues, whereby when laboratories say our information is proprietary and we're going to use it to be competitive, the payer could say: guess what, we'll use somebody else who will contribute data, because in the long run that's best for our patients? It may be only that economic pressure that will ultimately get them to play. Is that a mechanism that would be possible to explore?

Well, that's a complicated question. A business relationship with a lab, especially a national lab, is based on lots of factors, mostly economics, so I think you're not going to impale yourself on a stake over that issue. Although I notice the big national labs are on the list there, so it's the smaller players you're talking about. Usually the relationships with labs are about quality and economics, not this level of data sharing, so we'd probably have to spend some time thinking about that. Now, we do actually receive lab data from our labs, again through the molecular pathology codes, all these stacking codes; when we get more specificity in those, we will be getting specific data in. I'd be curious what the LabCorp representative has to say about that. So it could be the payer that shares the data as a pass-through.

Yes, I'll be talking about reimbursement in my talk, some of the new changes to reimbursement and what impact they might have.
One other thought on this: particularly for the Medicare program, there are tools other than direct reimbursement that might be explored as ways to encourage data sharing. For example, there are the conditions of participation that make providers eligible for reimbursement, and there are all sorts of regulatory policies throughout Medicare, beyond the individual reimbursement for tests, that might be explored as ways of encouraging that kind of behavior. But that's just Medicare, not necessarily private payers, and it wasn't my area of expertise at Medicare, so other folks would have to be asked about it.

Certainly one could imagine a request, made during a contract negotiation, that this be a standard one is expected to follow.

I'm curious, John Harley, Cincinnati Children's, about the 300 cases in which you've already done exome sequencing, all done outside the institution. What level of data preservation are you using? Are you saving the original data, the five terabytes you get back for each subject, creating a petabyte and a half that you have to store? Or are you storing just the final sequence and ignoring the possibility that someone would want to refilter that data from the beginning? And how long do you plan to keep it available for re-querying and that sort of thing?

So just to clarify, the 300 up there was the number of novel variant assessments per month; we haven't done that many exome sequences. But to address your question, which is a very good one, about data storage: we've addressed this in the laboratory standards that we're developing for the ACMG right now, in terms of data storage for next-gen sequencing. The guideline we put out is that labs certainly don't need to keep the raw image files; we recommend that labs keep the aligned-read files, the BAM files, or the FASTQ files for a period of time, suggesting between one and two years; and that they keep the VCF files, the variant call files, indefinitely for the moment, as a suggested guideline, although state and CLIA guidelines may trump certain things. So those are some basic points of guidance. We do recognize the value of being able to realign data with better algorithms, and so the desire to keep the raw reads, not just the aligned reads but the unaligned reads as well, so they can be realigned. But as for keeping them indefinitely: sequencing technologies are changing such that two years later you'd probably still want to re-sequence, and the cost to store the data is large enough that you have to balance it against the cost to redo the test. There's a lot in flux here, and I don't think we can easily say this is exactly how you should do it, but we're providing guidance for labs that want it. I still think it's a moving target.

Mary Rowling, last question and short answer. I guess, okay. And Kelly, really short question and answer. And we're gonna have to, so I will claim credit, or blame, for the fact that there are 20-minute talks and five-minute discussion periods. We're over; we're eating into our break time. That's fine, because this is a really important part of the discussion. So quick questions, quick answers.

Can you help distinguish something for me?
Because I totally understand that sharing these variant data is super important and that we need to come up with a national or international resource. What's the distinction between sharing into GeneInsight versus the sharing you described in your U41/ClinVar proposal?

Yep. So it's our hope that, for instance, I use GeneInsight, but I'm going to be sharing all of the data from my lab into ClinVar; however, there are different data structures and capabilities surrounding those data sets. For instance, to submit full genome data sets to NCBI, they have to go into dbGaP, and there are a lot of constraints around access to that data. In GeneInsight, which is a HIPAA-secure environment because it's within our Partners IS system, we can allow patient information to be there and be shared a little more easily. So that's one distinction. At the variant level, I do hope that any lab sharing within the GeneInsight network we're trying to set up would also be willing to submit that into ClinVar. At the end of the day, it's multiple strategies, hoping we can succeed somewhere, or in all of them. My real goal is to get as much as possible into the public domain, but the full data structure that some of the more sophisticated IT systems can support probably isn't there yet in ClinVar; hopefully over time we can continue to enhance these systems to take the full breadth of data associated with cases, including clinical data.

Okay, so Kelly, Irwin, and then we're gonna go on. Kelly, use your microphone, please. Microphone, please. Push the button, yeah.

You stated that about 66% of your variants had never been seen before. If you suspect these are involved in the patient's condition, how do you return that information for inclusion in patient healthcare, or do you?

So we do return any variant found during targeted testing, even if it's a variant of unknown significance, marked as such on the report. The reason is that we often try to encourage family-member testing, because for hypertrophic cardiomyopathy and dominant disease, segregation is the best way to actually understand a variant: if we can get a significant LOD score, that can guide us to calling it pathogenic. So we'll report those out, and we'll often do free testing of family members to support the segregation analysis. We basically report what we know about a variant, and over time we may learn more and increase that knowledge.

I applaud you on the integration with the EMR, and that raises some questions. First, who actually makes the decision about which variants get into the EMR? And second, but equally important, the EMR for any given patient can be viewed by many people, from students to nurses to dietitians. Is there any way that you limit, control, or regulate access to who actually sees the genetic information within the EMR?

To answer your first question, anything that we write on a clinical report goes into the EMR, just like any other clinical report from any other system.
As far as access to that data once it's in there: we are actually somewhat constrained by Massachusetts state law, which says that only the ordering physician can see data from a genetic test done for screening purposes, but we've gotten around that by consenting the patients to allow the entire healthcare institution to see that data. There's a balance between restricting access in very specific situations and what we believe is the need for broader access, so this information can really be engaged in the care of a patient. We are leaning much more toward the side that everyone should have access to this data, but there's a long conversation there that gets into subtler issues we don't have quite enough time for. I'm getting in trouble.

I have 25 more questions, but I'm going to take the chair's prerogative and tell myself to go away. That was a really great introduction to the kinds of things we want to talk about over the next day and a half. So thank you, and the next speaker, Debra, is behind me.