Thanks, Mark. Erin and I have divided the work up so that I get to do very little work and she gets to do a lot of work. I'll be introducing the speakers and she'll be taking notes, so that we can summarize at the end and send our notes on to Mark and Ken so that they can summarize for all of us. Our first speaker is Teresa Caban, who is in the Office of the National Coordinator, where, as Chief Scientist, she is in charge of personalized medicine. The title of her talk is Advancing Health IT to Support Genomic Medicine and Research. I'm going to put on a nine-minute timer, Teresa, and I will yell at you when it's nine minutes. Or, rather, I will gently say that it's nine minutes. Sound good? Thank you, Dan. As Dan said, I'm Teresa Caban, Chief Scientist at ONC. Next slide, please. Before we dig in, I wanted to put some of our work at ONC in a little bit of context, because I know that we collaborate quite a bit with NIH, but I realize that not all of you may be familiar with our space. This is an attempt to show the Venn diagrams within healthcare where we sit as an office within the Office of the Secretary of the U.S. Department of Health and Human Services. We have two primary responsibilities with regard to health IT. One is that we coordinate policy efforts around health IT across the federal government and private sector stakeholders; the other is that we regulate health IT systems. There was a lot of conversation yesterday about health information technology and how it may facilitate or hinder genomic medicine and research, and we have a role in that we certify health IT systems and their functionality. Most recently, through our 21st Century Cures Act rule, which was released last year, we have required that EHR systems use a standardized application programming interface as well as a standard set of data elements known as the United States Core Data for Interoperability (USCDI), both of which I think will be very important for research.
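The standardized API the Cures Act rule requires is FHIR-based, and USCDI data elements travel through it as FHIR resources. As a rough sketch only (the patient ID, values, and the glucose LOINC code are illustrative choices, not from the talk), a laboratory result returned by a certified EHR API is shaped roughly like this:

```python
import json

def make_lab_observation(patient_id: str, loinc_code: str, display: str,
                         value: float, unit: str) -> dict:
    """Build a minimal FHIR R4 Observation, the kind of standardized
    payload a Cures Act-certified EHR API returns for a lab result."""
    return {
        "resourceType": "Observation",
        "status": "final",
        "category": [{"coding": [{
            "system": "http://terminology.hl7.org/CodeSystem/observation-category",
            "code": "laboratory"}]}],
        "code": {"coding": [{"system": "http://loinc.org",
                             "code": loinc_code, "display": display}]},
        "subject": {"reference": f"Patient/{patient_id}"},
        "valueQuantity": {"value": value, "unit": unit},
    }

# Hypothetical example result, serialized the way an API would emit it.
obs = make_lab_observation("example-123", "2345-7",
                           "Glucose [Mass/volume] in Serum or Plasma",
                           95, "mg/dL")
payload = json.dumps(obs)
```

Real resources carry far more metadata (identifiers, effective times, performers); the point is only that every certified system emits the same structure, which is what makes reuse for research plausible.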
Next slide, please. My division in particular focuses on putting health IT to work for research. While we don't fund research per se, we collaborate very closely with those who do, NIH being one of the agencies we collaborate and coordinate with. We also lead demonstration projects that try to advance technology and policy issues in order to advance the research agenda. The 21st Century Cures Act really prioritized interoperability, as does our regulation. And as I mentioned, it's pretty critical, now that we have these health IT systems in place across the country, to make them work not just for clinical care, but also for research and other use cases such as post-market surveillance and public health. We have a role to play in advancing that. Next slide, please. What we have found through our demonstration projects and our coordination is just what a critical role health IT plays in the research enterprise. Being able to identify research participants, assemble cohorts, collect and share data, and share results back with participants and clinicians as well as caregivers and policymakers requires robust infrastructure. What we have been focused on, through our projects under not only precision medicine but also patient-centered outcomes research, our Leading Edge Acceleration Projects (LEAP) in Health IT funding opportunity, and some more recent efforts to advance standards specifically, is finding a way to do so in collaboration with a lot of different stakeholders, researchers and research funders included. Next slide. When I took over as chief scientist, I wanted us to take a step back and, much like what NHGRI did over the last year or so, think through what is most critical to be able to leverage health IT and data from health IT systems in research. We led a multi-year stakeholder-driven effort to ascertain what those needs might be and articulated this in an agenda that was published early last year.
We really are trying to foster an ecosystem where clinical care and research come closer together. As you'll see on the next slide, the agenda has two sets of priorities, one focused on leveraging data and the other on building or advancing the infrastructure that we need to do so. I'm not going to delve into these in detail today, but I encourage you to take a look. I'll say that a lot of these align with priorities that NIH has broadly with regard to data science and informatics, and we were fortunate to be able to collaborate on pulling this together with not just NHGRI, but the National Library of Medicine, the Office of the Director, All of Us, and NCI, as well as colleagues from the Food and Drug Administration, the Centers for Disease Control and Prevention, the Veterans Health Administration, and a lot of private sector stakeholders. Through a literature review, a workshop, and many conference sessions, we gathered a lot of input and feedback in pulling this together, and we are starting to move forward on some of these priorities. Next slide. I wanted to dive a little deeper, though, into some of the work that we've been doing under the Precision Medicine Initiative, which, as you may recall, was launched under the Obama Administration in 2015 with the goal of moving us to deliver care that is tailored to patients as individuals. And while genomics has a big role to play in that, we've been conscious from the start to focus on other things like environment and lifestyle and how those come together with biology. When initially launched, the Precision Medicine Initiative included the National Institutes of Health and what became known as the All of Us Research Program, NCI, the Food and Drug Administration, and ONC; the Administration recognized the importance of infrastructure issues and how critical they are to advancing science and discovery.
And so since then, we've had a whole host of partners from across the federal government, the Office for Civil Rights, the Department of Energy, the VA, DOD, NIST, and I'm sure I'm forgetting others. Collaboratively, we've been advancing the puzzle pieces to make this a reality, initially through discovery and through building the infrastructure to advance the science. Next slide. Our role has been consistent with what I said earlier: focusing on some of the policy and technology issues pertaining to health IT, and to the electronic health data that comes from these systems, that are needed to conduct precision medicine research as well as to deliver precision medicine care. We've done that through a couple of avenues. We've collaborated on policy pieces, developing privacy and security principles and guidelines, as well as testing those through the Sync for Science pilot project, and we've advanced the use of standards to promote data sharing, including sharing of the specific data types that are needed for precision medicine. So in addition to collaborating with NIH, Harvard, and leading developers on Sync for Science, which is meant to leverage an open API to enable patients to share data from their doctors' EHR systems with research, we've also been advancing standards for social determinants of health and for mobile devices and wearables. And then, as you'll see on the next slide, we've taken a phased approach, in collaboration with colleagues who are joining us today and who joined us yesterday, to advance the sharing of genomic information in a standardized way. Sync for Genes is an ongoing project, and we've been very proud to have been leading this in collaboration with many different stakeholders. We've been leveraging the FHIR standard, which is now required under regulation, and testing it, seeing in the real world how easy or difficult it is to exchange genomic data in a standardized way.
So initially we tested out the specification and built it out. In phase two, we looked at whether settings could actually leverage the standard and specification to exchange data. Phase three, which concluded last year, started to engage laboratories. And phase four, which just launched, and for which we spent time identifying sites to do the testing, will focus on sharing the data with patients and caregivers. For all of these, the one thing I wanted to emphasize is the importance of engaging not only the organizations that are going to be leveraging these standards, but also the standards development organizations, to make sure that the results get integrated into the standards development life cycle and support broader adoption. We've been lucky to collaborate with Dr. Freeman, who's joined us today, in this effort, and it's really helped our collaboration. Next slide, please. Before I conclude, I wanted to throw out some questions to help foster conversation and help NHGRI. As the institute considers developing this informatics agenda, I think it's important to be crystal clear about what's within NHGRI's sphere of control. Is it the research piece? Is it informatics for research? Is it broader than that? Is it clinical care? And then, as we're thinking of informatics: informatics for whom and for what purpose? Who is to benefit? Is it the researchers, the clinicians, patients, participants? And what are the specific data and informatics needs? I think we have a unique opportunity to think through what those needs are and build systems to support them, instead of developing systems and then trying to make sure that they fit a need that we have conceived. And then, how do we best engage other stakeholders? For example, what role can ONC play to help advance informatics needs that are identified today and that were identified yesterday? And how can NHGRI engage stakeholders better?
And so with that, on the next slide I included a few resources and links that I think will be of benefit: the ONC Cures Act Final Rule and the link to USCDI. I want to point out that USCDI was released as a draft last year. The draft version two was released last month, and it will be shared with ONC's advisory committee, the Health IT Advisory Committee, or HITAC, through a public comment process. I really want to encourage the community to engage in this process more fully, because as you identify needs for data elements that could be added or incorporated into the USCDI, there is an opportunity for you to provide public comments as the standard gets updated on an annual basis. And with that, I'll conclude. Thank you. You're welcome. Excellent talk, Teresa. There is one question that came through the chat box: I don't see anything genomics related in any stage of USCDI. Is it proposed in USCDI somewhere? Not that I've seen. Right now the data classes and data elements are those that are currently within the EHR. So if that is something this community would like to see move forward in either version two or version three, which will open for public comment later this year, there's an opportunity to do so. Part of the issue with what gets incorporated into the USCDI is the maturity level of a given standard, and there being a broad need for a particular use case, right? Because USCDI gets incorporated as a requirement for developers. So ONC tries to make sure that both the need and the technology and standard are at a point where it makes sense to put that requirement in place. Real quick, then: in terms of incorporating it, could a group like the Sync for Genes Project propose it?
Well, the Sync for Genes Project is an ONC project, so it'd be a little awkward for us to propose a data class or data element to ourselves, but certainly we create a community, and folks who participate are more than welcome to suggest that through the public comment process. And I believe NIH is also setting up an internal NIH process by which the Institutes and Centers will provide formal comments to ONC on a go-forward basis. Great. I don't see any other questions coming in. I have just one question. I was wondering if you could highlight or discuss the importance, in terms of consuming electronic health record data for research, of common data models, and the challenge that institutions face in transforming their EHR data into a common data model. Certainly. I think that's an issue that we recognize at ONC, as do our colleagues at NIH. I had an opportunity to work at NIH on a detail for a year and was coordinating efforts to advance the use of FHIR in research, and something that we're looking at very closely is how the FHIR specification aligns, or doesn't, with common data models used in research, and what tools might be needed to enable consumption of the data and maybe some transformations to facilitate that. Fantastic. That would be great. Are there any more questions? I'll give it a second here to see if people are typing. Does your brief include engaging patients, engaging individuals who actually might be the recipients of genomic information, on how that should be delivered, whether through their smartphones or through access to their electronic records? How do you not only provide access but also provide sort of an interpretive commentary along with the access to data? It's hard to think of data just being dumped on the participant.
Certainly. And to take a little step back with regard to your question, I think that individuals should have access to their clinical data, period, but the value to be derived is in what you do with the data. Yeah. What you do with the data and how you leverage that information. With regard to Sync for Genes specifically, that's something we are looking at now, and certainly through prior phases what we found is that we need additional resources for patients and caregivers as well as clinicians when genomic information is shared. If there are no other questions, I think we'll move on. The next speaker is Subha Madhavan, who will talk about genomics-based clinical informatics resources to support precision oncology and evidence curation. Subha is the chief data scientist at Georgetown and the founding director of the Georgetown Innovation Center for Biomedical Informatics. She's done an extensive amount of work, particularly in the oncology space, with Georgetown and with MedStar, their health care partner. Take it away, Subha. Thank you so much, Dan. Can everyone hear me okay? Perfect. Mark and Ken, thank you so much for inviting me to present as part of this wonderful workshop. I really enjoyed yesterday's presentations as well. As Dan said, I will be talking about genomics-based clinical informatics resources to support precision oncology and evidence curation. These are my quick disclosures; I'll let you look at them for one second. In the next few minutes, I wanted to share with you all two stories. The first story is around a virtual tumor board, and here the focus is really on facilitators and barriers to deploying genomics-based clinical informatics resources in the clinic, in cancer care.
And then the second story, which is much more on the informatics research side, is about identifying research needs: how do we deploy genomics-based clinical decision support in the clinic, and what research needs to happen to make those clinical decision support systems work really well. So the first case study is about a 62-year-old male. He was diagnosed with stage IV pancreatic adenocarcinoma and was being treated at the Lombardi Comprehensive Cancer Center at Georgetown University. His disease was progressing very aggressively after standard chemotherapy, which was, you know, typically gemcitabine and nab-paclitaxel, which is given to patients with pancreatic adenocarcinoma. The team then conducted multi-panel genomic testing, and the report sort of said, you know, "thank you for telling me it's cancer": it came up with KRAS and TP53, et cetera. But a deeper clinical informatics analysis revealed a mutation in a VEGF receptor, which resulted in selection of a clinical trial that extended this gentleman's life by a matter of a few months. People who know pancreatic adenocarcinoma know that the probability of survival on average is about 12 months, and any improvement on that is considered a big benefit. So, in this case, the patient had a VEGF receptor mutation, and the team identified a ligand-independent constitutive phosphorylation, which led us to select a TKI, a very specific tyrosine kinase inhibitor, that targets that particular pathway. He was able to be enrolled onto an NIH-sponsored sorafenib trial, and this was an email that our patient coordinators received from him; he was almost ending up in hospice, but without this genomic analysis and the clinical informatics behind it, he would not have benefited from this clinical trial. So, the question was, how do we scale this to, you know, 300 hospitals across the country and over 1,500 patients?
These were all patients that were part of the Know Your Tumor program within the Pancreatic Cancer Action Network, known as PanCAN. They work with centers across the country, and the question for the clinical informatics team was how to scale the experience that I just showed you to all the patients who come to PanCAN. The solution that the team developed was an artificial intelligence-based virtual molecular tumor board; I have the references right here below on my slides, and you can read about them in more detail. But very briefly, the AI-based virtual tumor board involved development of a virtual knowledge base and the collection of molecular profiling data from patients, as well as clinical information from the EHR, their clinical history, other comorbidities, demographic data, et cetera, combining that information and presenting it to a medical review panel. The virtual tumor board facilitated the workflow of this medical review panel, which then developed individualized reports for each patient. The patients were later followed with a real-world outcome analysis over a period of six months, and that data got incorporated back into the knowledge base to help other patients who came into the PanCAN Know Your Tumor program.
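The feedback loop just described, match a molecular profile against a knowledge base, send candidates to human reviewers, then fold observed outcomes back in, can be sketched in a few lines. Everything here (the variants, trial names, and rule shapes) is hypothetical, not PanCAN's actual data or logic:

```python
# Hypothetical knowledge base: (gene, variant) -> candidate options.
knowledge_base = {
    ("KRAS", "G12C"): ["trial NCT-hypothetical-1"],
}

def draft_report(patient_variants: list[tuple[str, str]]) -> list[str]:
    """Candidate options for the medical review panel to consider;
    a human panel, not this lookup, produces the final report."""
    options = []
    for variant in patient_variants:
        options.extend(knowledge_base.get(variant, []))
    return options

def record_outcome(variant: tuple[str, str], therapy: str,
                   benefited: bool) -> None:
    """Six-month follow-up data feeds the knowledge base, so later
    patients with the same variant see the updated evidence."""
    if benefited and therapy not in knowledge_base.get(variant, []):
        knowledge_base.setdefault(variant, []).append(therapy)

report = draft_report([("KRAS", "G12C"), ("TP53", "R175H")])
record_outcome(("TP53", "R175H"), "trial NCT-hypothetical-2", True)
```

The essential design point is the last step: outcomes flow back into the same store that generates future drafts, which is what makes it a learning system rather than a static lookup.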
So here, very quickly, is the information engineering that was used for the Know Your Tumor program. Everything that you see in yellow is publicly available information, so this is your ClinicalTrials.gov, published literature, guidelines from NCCN or the FDA, and published data about somatic variants and the evidence associating those variants with various targeted therapies, and so on. What you see in blue is the information collected from the patient, so the molecular profile and the demographic data, et cetera. This therapeutic engine integrated the public information with the patient-specific information and applied a number of rules to derive treatment implications and recommendations for the medical review panel, which were reviewed by humans, the doctors, and entered into this particular report. So now the group has started to think about how we could incorporate this concept of a virtual tumor board into every medical center. We know that the majority of cancer care actually occurs in community hospitals, and not all patients go to academic tertiary care facilities. And so the question is how we can leverage the expertise in specific clinical genomics areas that is present in these academic medical centers or other tertiary care facilities and provide that precision care to patients who go to community hospitals. This is a workflow, a concept that we wrote up in JCO CCI last year, but, you know, slowly components of this are beginning to be implemented within community centers. So for example, I'm part of the MedStar Health clinical system, which is 10 hospitals and 240 care facilities in the greater Washington, DC area. We have one academic medical center, which is Georgetown University Hospital (you can see it behind me; that's Healy Hall on the Georgetown campus), but the rest of the hospitals are mostly community based, in Washington, DC and in Maryland.
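The "therapeutic engine" idea, public evidence (the yellow sources) joined with patient-specific data (the blue sources) and filtered by rules before human review, reduces to a join-and-filter. A minimal sketch with entirely hypothetical evidence rows and a made-up contraindication rule:

```python
# Public (yellow) evidence: gene -> candidate therapy, with its source.
EVIDENCE = [
    {"gene": "VEGFR2", "therapy": "TKI-X", "source": "published literature"},
    {"gene": "KRAS",   "therapy": "Drug-Y", "source": "guideline"},
]

def recommend(patient: dict) -> list[dict]:
    """Join public evidence with patient-specific (blue) data, then
    apply a simple eligibility rule; output goes to the review panel."""
    hits = [e for e in EVIDENCE if e["gene"] in patient["variants"]]
    # Patient-specific rule: drop therapies a comorbidity rules out.
    return [e for e in hits
            if e["therapy"] not in patient.get("contraindicated", set())]

patient = {"variants": {"VEGFR2", "KRAS"}, "contraindicated": {"Drug-Y"}}
candidates = recommend(patient)
```

A production engine layers on many more rule types (trial eligibility, line of therapy, evidence tier), but the integration pattern is the same: public knowledge keyed by variant, intersected with one patient's record.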
And the question is how we take what we've learned at Georgetown and implement it writ large at these community hospitals. Here is where our work, which is funded by NHGRI, comes into play: the ClinGen and GA4GH VICC communities, where VICC stands for the Variant Interpretation for Cancer Consortium. These are communities of experts who come together to create evidence items for somatic variants, associating them with various therapies and assigning the variants to different tiers: a tier one variant, where there is a direct association with therapies; a tier two variant; a tier three variant; and so on. That information can then be made available to the treating clinician who's taking care of the patient at the community hospital. So what are some of the barriers? This project has been ongoing for almost four years now. It started out with the Know Your Tumor program and is really expanding out to other centers, both in the United States and across the world. Some of the major bottlenecks that we see are, first, that there's a lot of manual curation. As you can imagine from the information engineering slide that I showed you, there's a ton of manual curation and data integration, so we need armies of people who are reading literature, pulling data from databases, and integrating them into this knowledge base. There's also a clear need for consensus in variant interpretation, and I don't think this is news to anyone here. There are all kinds of databases for variant interpretation, and people have specific expertise in different domain areas, so there's a clear need for evidence adjudication. And, you know, the machine can only go so far. I kind of say that it can get us 80% of the way, but real human input is needed when clinical interpretation is required to treat a patient. Too many knowledge bases.
This is extremely onerous for clinicians and phase 1 clinical trialists who have to adjudicate all this information; it's very hard on them to be mining these multiple knowledge bases when they have to develop a treatment plan or think about which clinical trial might be best suited for a patient. There's also a lack of resources for curators, who end up creating one-off solutions. We have a number of curators who are part of the ClinGen Consortium, for example, curating evidence for pathogenic variants and assigning the different evidence tiers, and there's a lack of easy-to-use, user-friendly interfaces for them to do their job. That is definitely a rate-limiting factor in constructing this knowledge base to support clinical decision-making. And there's a need for more uptake among clinical researchers. Eric yesterday talked about the need for diversification and training, and clearly training is needed with... This is your one-minute warning. Oh, one minute. Okay. Thank you. So, more uptake among clinical researchers is needed. It takes an army to actually get this work done. This slide shows face-to-face meetings at our ClinGen Consortium meetings, where we come together and really create these knowledge bases for the evidence that I talked about. Here is our first publication. We call it MVLD, Minimal Variant Level Data. This is a standardized format for representing evidence for somatic variants, and there was very big uptake of this as soon as it got published. So I think I misunderstood the amount of time that was allocated, so what I'm going to do is skip the second story, show one picture, and then stop there. So in terms of the research focus: where do we really need to put our focus to enable this clinical decision-making? Here is an example of a proof of concept that we're developing at Georgetown.
The font size is small, but at a high level this is an NLP pipeline, a natural language processing pipeline, that uses the concept of learning from less data, because in most cases we don't have millions of records to learn from. When you are looking at one particular disease, for example pancreatic adenocarcinoma, and very specifically the VEGF receptor, you don't have that many articles to learn from. So the question is, can we learn from less data? We all talk about learning from big data, but can we learn from less data? There is some very interesting research coming out on learning from less data to enhance clinical decision-making, and you can learn more about that at the website I've shown here. I just wanted to say thank you to all the folks who contributed to this; this is the team that worked on the Mace2KSL virtual tumor board project. Thank you very much. Fantastic talk. I'm always in awe of your work. You have several questions. Mark actually has a three-part question. Do you want to ask it directly? Would that be easier? Well, actually, Subha addressed it right after I hit submit, in the next slide. But I think the question that remains would be: you referenced the manual processes, both in the patient data and in the literature review, and you referenced a little bit about learning from less data, but what would you highlight as the key areas for research to reduce reliance on these manual processes? Yeah, that's a great question, Mark. Some of the things that we are looking into involve the concept of semantic abstraction. There may be a lot of associations present in these knowledge bases or in the literature, but we need common data models to understand that two sentences may be talking about the same thing.
So for example, the association of a VEGF receptor with pancreatic adenocarcinoma may be stated somewhat differently in one knowledge base versus another. Are there common frameworks for semantic abstraction that we could learn from, so that when we feed this into the machine, the machine understands that two different sentences describing the exact same association in two different wordings mean the same thing? So I think how we standardize this interpretation of variants is one big area we are focused on, and a number of people, and I see colleagues from GA4GH VICC participating here as well, are all thinking about how to standardize this framework for evidence adjudication and standardization of this evidence. Fantastic. And there's another question. Does the virtual molecular tumor board have the opportunity to capture outcomes of the recommendations made, such that the community learns and optimizes recommendations? Yeah, that's a great question. One of the things that the Know Your Tumor program within PanCAN is attempting to do is, after the reports are delivered, to collect data back from either the care team or the family over a period of six months. So essentially there is a little app that was used for the patients to enter data. One thing to keep in mind is that patients with advanced metastatic cancers are very sick, so it's very hard to collect outcomes from these patients directly. And so the app was essentially made available either to the care team or their families for them to enter information about these patients, and that gets incorporated back into the knowledge base. We had about a 30 to 40% completion rate for the forms and surveys that were sent out, so we don't have outcomes on all patients, but we do have a lot of information coming back from the community.
And whatever outcomes came in go back into the learning process, into the machine learning process, for the future iteration of the knowledge base. Fantastic. I think we'll take just one more question. How parallel are the data models between germline and somatic variant curation? Yeah, that again is a fantastic question. The majority of the work within NHGRI-supported ClinGen is germline focused. So what we're doing is taking the frameworks, the ACMG frameworks and the actionability frameworks that have been developed over the years for germline, and making modifications to suit somatic interpretation. It doesn't fit exactly; there's no one-to-one mapping, of course, but there's a lot to be learned from germline interpretation in making that adaptation. And why is this important? This is important because in the cancer world we're not dealing just with the somatic mutations. We're also dealing with germline, and there's constant germline filtering that needs to happen even when you're looking at tumor biopsies. So if the frameworks match, it's easier for us to look not only at the somatic variants but also at the germline background, which is then represented in the same framework. Fantastic. Fantastic talk. Great discussion. Looking forward to the last speaker. So with that, I think, Dan, we should transition and introduce our last speaker, and then we'll open it up for a broad discussion with everybody. Okay. Our last speaker is Nephi Walton, who is going to speak to us about the integration of genomic data into electronic health records: current state and future directions. Nephi has led several genomics and informatics initiatives at Geisinger and now at Intermountain, and he's going to talk to us about HerediGene, which plans to sequence half a million patients. All right.
So my talk is actually going to be more about integration of genomic data into the health record, based on work I've done at Geisinger and now with Intermountain and HerediGene. As my disclosures: I do serve as Associate Medical Director of Intermountain's Precision Genomics sequencing laboratory, which does offer commercial sequencing services. I do not have any affiliation with or financial interest in any EHR vendors. In fact, there are times when they would pay me not to speak; Mark will attest to that. Opinions and viewpoints are my own and do not represent those of any EHR vendor, so anything I say about an EHR vendor is not necessarily the perspective or stance of that vendor. As historical background, I started at an institution that used Allscripts, moved to an institution that used Epic, where I implemented genomics, and I've now moved to an institution that uses Cerner, where I'm implementing genomics again. The vast majority of genetic data remains in PDFs today. At Geisinger we completed a genomic data integration pilot: we imported about 1,200 patients' genetic data, including tier 1 variants and pharmacogenomic variants, and we built clinical decision support and patient- and provider-facing information for all of those variants. We learned a lot in the process. There are a number of challenges that we faced, and that I think more research is needed around, and I'm going to discuss those quickly. I won't be able to go over all of them, so I'll put the reference to the paper at the end, because there's a lot to go over. One of the first challenges we hit was the difference in the needs of clinical geneticists versus primary care providers. As a clinical geneticist myself, I was looking for a lot of things that weren't there in our initial implementation, but the implementation was suited pretty well for population genomics, which worked well for eMERGE and the MyCode program.
What we need to do in this area is define separate but compatible workflows for different genomics use cases. If we look at the EHR vendors, the different approaches they take are really interesting. Epic has a system in production that is really geared towards population health. Cerner has now developed a system, in the alpha phase, that I've seen, which was built by clinical geneticists and is therefore very geared towards clinical genetics. Those are very different workflows, and I think it's important that both of them are developing in different ways; they could actually learn a lot from each other. So it's really interesting that they took three different approaches, which are all valuable and frankly all important. And if I could name one thing that is really needed in this area, it would be implementation science: looking at different ways to implement genomics for different providers, including primary care, clinical genetics, and other subspecialists. The major challenge that I think the EHR companies see as a barrier is the interface between the laboratory and the electronic health record. Right now most systems require that you manually enter genetic test reports into the system, and that's not a viable solution. Someone on my team suggested hiring genetic secretaries, since we have genetic counselors and genetic counseling assistants. This is a bad idea; I think he was joking. What we really need to do is work on interfaces and standards. So again, here we have three different vendors, and we have two different standards in this case. Epic uses HL7 V2, which is kind of a legacy standard. Cerner is using FHIR, and Allscripts is also using FHIR.
And one of the arguments is that it doesn't matter what you use, because laboratories don't support either, and actually the laboratories were saying the same thing about the vendors. So we want to use these standards, but each side says the other doesn't use them. I think this is starting to change, and I'm hoping that everybody moves towards FHIR, and that will overcome a major barrier. But even if we use those standards, there's another thing that's a major barrier to implementing this in clinical practice, and that is the standard definition of genetic phenotypes. So when you get a genetic test report, you get all kinds of names; for example, if you have a BRCA1 result or a BRCA2 result, they could be named different ways. And so that comes across as a genetic phenotype. What we need to look at a genetic phenotype as is a point of decision support. So you're saying, based on this genetic phenotype, this is the decision support you're gonna offer. This is the information you're gonna deliver to the patient. Now that can differ by the gene or the variant. And so having some type of ontology or model built as a standard for genetic phenotypes is really important. So we built this out for one laboratory, but most institutions actually have at least five or six different laboratories that are returning genetic results, and they're all using different genetic phenotypes. So we need to rethink this area and what the point of decision support is for different diseases. Challenge three is the maintenance of clinical information and clinical decision support. This is a huge area of need. There are not a lot of publicly available resources. And one of the big problems is that currently maintenance of these types of systems requires technical skills. You actually have to bring in the EHR team to help develop and maintain these resources, instead of people who are experts in genetics or experts in pharmacogenomics.
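The "genetic phenotype as a point of decision support" idea can be sketched in a few lines: lab-specific result strings are normalized to one canonical phenotype key, and the decision support attaches to the key rather than to each lab's wording. All lab names, result strings, and mappings below are invented for illustration.

```python
# Hypothetical mapping from (lab, result text) to a canonical genetic phenotype.
LAB_TO_PHENOTYPE = {
    ("LabA", "BRCA1 c.68_69delAG (185delAG)"): "BRCA1_pathogenic",
    ("LabB", "BRCA1 185delAG pathogenic variant"): "BRCA1_pathogenic",
    ("LabC", "CYP2C19 *2/*2"): "CYP2C19_poor_metabolizer",
}

# Decision support is defined once per phenotype, not once per lab.
CDS_FOR_PHENOTYPE = {
    "BRCA1_pathogenic": "Offer high-risk breast/ovarian surveillance referral",
    "CYP2C19_poor_metabolizer": "Alert: consider alternative to clopidogrel",
}

def decision_support(lab, result_text):
    phenotype = LAB_TO_PHENOTYPE.get((lab, result_text))
    if phenotype is None:
        return "No structured phenotype mapping; route for manual curation"
    return CDS_FOR_PHENOTYPE[phenotype]

# Two labs, different wording, same decision support:
print(decision_support("LabA", "BRCA1 c.68_69delAG (185delAG)"))
print(decision_support("LabB", "BRCA1 185delAG pathogenic variant"))
```

The hard part, as the talk notes, is not the lookup but agreeing on and curating the phenotype keys themselves across five or six laboratories.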
And as many of you know, those resources from the EHR development team are very rare and hard to get. So one of the ways that we can help this is to build open APIs, so that people can build external applications where it's easier to maintain and curate those things. Challenge four is classification and reclassification of variants. This is a huge problem from a number of different perspectives. One of the major challenges here is that there's a lack of communication between the lab and the clinician. The reason why this is important is that often a laboratory will give a report that says this is a variant of unknown significance. So the clinician, in light of having the patient in front of them, can often make a determination that yes, this actually is a pathogenic variant. Well, what happens then is they might reclassify it and say the patient has a disorder. That information never goes back to the laboratory, and it's actually possible that they could have two discordant classifications in ClinVar for the exact same patient. Likewise, there's a potential for discordant results between a PDF and structured data in the electronic health record. A PDF comes from the lab, and then the variant gets reclassified by someone, say a clinician, and put into the EHR; you'd have discordant results too, which can be confusing. And further, that gets complicated as you have patients that have the same variant: you might reclassify the variant based on one patient's presentation, and you'll have two or three other patients in the system that will have a different classification, which opens you up to all kinds of challenges around liability and treating patients in different ways. Another area that I wanted to touch on was moving beyond the initial implementation and looking at the real-time use of genetic data.
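The within-institution discordance problem described here is easy to detect once classifications are structured: group records by variant and flag any variant whose carriers have conflicting classifications. The records below are invented for illustration.

```python
from collections import defaultdict

# Hypothetical per-patient classification records as they might sit in an EHR.
records = [
    {"patient": "P1", "variant": "MYH7 c.1208G>A", "classification": "Pathogenic"},
    {"patient": "P2", "variant": "MYH7 c.1208G>A", "classification": "VUS"},
    {"patient": "P3", "variant": "KCNQ1 c.573_577del", "classification": "Pathogenic"},
]

def discordant_variants(records):
    """Return variants recorded with more than one distinct classification."""
    by_variant = defaultdict(set)
    for rec in records:
        by_variant[rec["variant"]].add(rec["classification"])
    return {v: sorted(c) for v, c in by_variant.items() if len(c) > 1}

print(discordant_variants(records))
# {'MYH7 c.1208G>A': ['Pathogenic', 'VUS']}
```

A periodic check like this is only a first step; resolving the discordance still requires the lab-clinician feedback loop the talk says is missing.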
So this is something that we have a grant submitted around, and also something I'm working on as an initiative at Intermountain, where I'm trying to really interface the laboratory into the clinical space, where I have that opportunity. And some of the challenges that we face in this, one that has been touched on before, excuse me, is accurately capturing phenotype data in a structured manner. Because to be able to do this, you have to have a lot of computable data, and if it's not structured, it's not computable. There are also a lot of challenges of using genomic data with machine learning and artificial intelligence. This is something that, as chair of the genomics and translational bioinformatics group at AMIA, I'm working on, working on a paper to discuss this. Genomic data actually introduces a lot of challenges. First of all, you have this huge data space that you're working in. And machine learning, in some ways, is good in settings where there's a lot of data. But the problem is we don't have a lot of knowledge surrounding that data. Another thing that was touched on earlier is the potential racial and ethnic biases. I think genetic data is probably the most dangerous thing in that area, because there are a lot of genetic similarities between people in different racial and ethnic groups, and you'll actually be making decisions based on that genetic data. And that has an inherent bias, and we have to be careful that it is used in an equitable and ethical way. The other is the regulatory nightmare surrounding the use of the entire sequence. So we have this huge trove of information about a patient: what drugs they're gonna react to, how we can use real-time genomics in their diagnosis. We can use it to essentially personalize care. But the problem with that is that everything right now is based on a CLIA report that comes from a laboratory, and in order to use that other data, you have to go back through that process.
So, we need to look at some ways to streamline the use of genetic data outside of that clinical report. And I actually went through that faster than I thought I would. I don't know how much time I have left. I kept going slightly over when I timed myself, so I must have beat my record. So anyway, I have a reference at the bottom about the paper we did regarding our implementation at Geisinger, and I'll open it up to questions. We have a question regarding the genetic phenotypes. Is it possible to define a genetic phenotype from a variant alone? Or does this require information about the patient's phenotype? If the latter, do labs have enough information to make a call? No, I mean, I think that we need to start... So, when we look at a genetic phenotype, and this is the point I'm trying to make, this is a point of clinical decision support. This is a point where you're saying, based on this genetic phenotype, we're going to care for this patient differently, right? And so, for the most part, patients with a BRCA1 variant are gonna be treated pretty similarly. Now, for some disorders, there are modifier variants that will modify their risk. So, they might not require certain types of surveillance, or they might require more surveillance, depending on what that risk is. And so, it is something that needs to be really well-defined and curated, in my opinion, because for any given disease, there are thousands of potential variants. And so, we need to say, through research, this variant confers a different risk, and we need to be able to have some sort of database or ontology to go to, to say, okay, this needs to be a separate genetic phenotype. Does that make sense? Does to me. Last question for you, and then I think we'll open up the questions to all the panelists after this one. What would you advise as the most valuable next steps to address the regulatory nightmare you referenced?
I think to tease out pathways for use of clinical genetic data, and potentially even a risk analysis, I guess, and maybe just identify: here's how we'd like to use it, and here are the barriers. I haven't thought of good ways around the CLIA report yet. I mean, if you open it up and you have no regulation, there's a lot of misuse, and I think that's why FDA has taken on pharmacogenomics. But then I would argue, if the government is going to regulate it, they need to provide some really clear guidelines and even offer a database that you can access to say, this is what you can and cannot report and the extent to which you can go. Because even in pharmacogenomics, I used to think, that's an area we've got really well defined, we've got good clinical decision support guidelines, you've got good information. But even in that area, now that I've become the associate medical director of the laboratory and we're doing pharmacogenomic testing, you realize it's still not clear, and even FDA regulation and how all that's going, that's still not clear. There's a lot of gray area and a lot of things that need to be clarified there. So I think at this point in time, we will open this up for all the panelists to answer questions and to partake. I wanna go back to one of the questions that was asked earlier. Are there initiatives similar to CPIC in the cancer world? Is anyone working on updatable peer-reviewed guidelines that match somatic cancer variation with anti-cancer drugs? And that's for all panelists, that's for all three of you. Yeah, I can start and maybe others can join. So there are a number of groups that are looking at sort of associating cancer variants to various therapies, right? My Cancer Genome is one at Vanderbilt; there are a number of such organizational databases.
There's OncoKB, which is led out of Cornell, and the big organization that I mentioned briefly, the Variant Interpretation for Cancer Consortium (VICC), which is actually a driver project within GA4GH, the Global Alliance for Genomics and Health, where a number of these databases are actually being integrated. So the VICC group is creating standardized data models and APIs that connect across these databases. The barrier that I mentioned is that each organization that's doing this curation kind of focuses on their areas of expertise, right? So in our case at Lombardi, we primarily focus on breast and GI cancers. So all of the curation that we do is pretty much focused on those, and you kind of see those silos. And hopefully the APIs and the standardized models that VICC is developing can interconnect these, and we'll have a good representation across these different domain areas. And I'm curious, yesterday we heard a lot about issues with algorithms lacking the ability to handle diversity, and biases and so forth. So for your project, Know Your Tumor, what's the distribution of the different ethnicities represented within that algorithm and that AI machine learning platform that you mentioned, and what did you guys do to ensure that those types of biases were not inherent within this platform that you've established? Yeah, that's a great question. I'd have to go back and look at the numbers, but I think it was sort of just the national representation of pancreatic cancers in the United States; those were the percentages. The beauty of the study was that it was all comers. So there wasn't a design that kind of made sure that there was a balance across different ethnicities. It's all comers, patients with advanced metastatic cancers who needed a precision-based approach for therapy, but there was a good distribution as far as I recall.
I'd have to go back to see the exact numbers, but I think after the fact, after the information was collected, we did some subset analyses to see whether there were more actionable markers that benefited patients of certain ethnicities and race groups, et cetera, which now is being fed back into the knowledge base to support other people who are coming into the KYT study. Fantastic. We've had several questions come in, and I'm gonna just try to summarize some of the discussion points. I hear a common thread here, and I'm gonna call it a common data model thread, although we're not using that word often at all during this session. But to me, when I think about, you know, the first speaker Teresa and your work, and some of the issues that we're hearing from Nephi, to me this sounds like a common data model issue, and we're lacking common data models. We're lacking the ability to structure our data in a format that can be shared across institutions, or within an institution, to drive research. In order to drive research, we need to get the data out of the EHR into a standardized format. So I'm just wondering, across all the speakers in this session, you know, again, we didn't talk much about common data models, but what's the role of common data models? How are we gonna solve this problem of creating this sort of structured system to be able to handle this type of work to drive research? So I think that we do need that, and there are a lot of data models and standards out there, and some of them do a pretty good job. And I think a lot of it is gonna be people working together. And part of the challenge is that this has only recently really entered the clinical space. It was not long ago that everybody had only a PDF in terms of genetic data. So it's only recently come in structured format within the EHR. So it's young, and I think that it will mature.
I think the biggest challenge is to get everybody to adopt the same data model and agree on what a genetic phenotype is. I think that's a really challenging concept for people to understand and get a handle on. But it really needs to be there; I mean, the reason that you would have a different genetic phenotype, in my opinion, is because you're gonna make a difference in care. Now, for research that's different, because one of the things we also need to look at is whether there are differences in care, right? So we need to start looking at people with different variants, or variants in certain domains, variants in certain genes, to see if there are different ways that we should care for them. That's one of the reasons I'm a big proponent of more aggressive genetic testing. A lot of people say, well, it won't change care. Well, we'll never know if it's gonna change care unless we start looking at these cohorts of patients with these changes to see if it will. So I don't know that there's a great answer to that. I think that things are improving over time. I think the standards through HL7 have changed dramatically, and they're much better than they were even a short time ago. So I think things are getting better. And to build on Nephi's comments, I did wanna clarify that once USCDI is implemented across health IT systems, researchers will at least be able to bank on the data classes and elements required under USCDI being standardized, right? Those are underpinned by controlled vocabularies, and those are specified in the standards. And as it evolves, there's an opportunity for it to either add layers of specificity or add data classes or data elements that are needed broadly for clinical care and that would also benefit research. Yeah, and with that said, there's a difference between a data model and the knowledge or content that we put into it.
And that's an area that I think needs a lot of development too, to have sort of public domain resources for agreed-upon knowledge. And I know this is an evolving space, but somewhere there needs to be kind of a consensus, especially if there are gonna be attempts to regulate it. Just quickly to add to what Nephi and Teresa said: while it's important to have these standardized data models, it's also important to have apps, or applications, that use them, right? I mean, that's how the end users and the clinical decision support actually can happen. So that's one point. And the second one is, and I think Nephi was trying to say this, that the data models need to serve a lot of the data. I mean, I think the Holy Grail for machine learning is really labeled data. And unless these data models serve a lot of information or data, they end up not being used. So those two would be critical for a successful data model to kind of emerge as the one that people take up and use. The one to rule them all. We have a question actually, oh, I'm sorry, Mark, we have a hand up from an attendee, but do you wanna go first, Mark? Yeah, the challenge, of course, with this discussion is always to try and contextualize it for the purpose of the meeting, which is: we all recognize the need for data model standards and infrastructure to support both clinical care and research in this space. But the question for this meeting, as I see it, is how can this be framed into a research question or a research agenda? In other words, how can we fund research to move us forward, and what would be the metric? So I'd like to hear some comments from the speakers on that. I think, Mark, there's a lot of implementation science here, because I think that's how you discover and find out what's needed. I think my view of what we needed and how to move forward was a lot different before we started the implementation at Geisinger than it was afterwards.
And the only way that you find out what's needed is to actually do it, to do pilots, to do research studies. And so I think there needs to be a big focus on implementation science so that we can discover what's needed and run into the challenges. Otherwise, we're never gonna see them. So putting on my human factors engineering hat, even a step before that is really identifying the user base and their needs, right? So when you start to pilot something, you have a tool, a system, whatever it is, an artifact to pilot test. But NHGRI needs to think about what the institute sees as its stakeholder group, who it's trying to serve, whether it's the research community, the genomic research community specifically, or participants, or broader than that, and start identifying the needs of those communities to inform an agenda around developing and pilot testing some of these tools. And then, related to that, we've talked about standards and common data models. Part of the issue we ran into during Sync for Genes was some issues of lack of semantic consistency or clarity. So understanding where there's consensus and where there isn't, and where additional research or work might be needed to arrive at some of that consensus, to inform the next iteration of standards. I think, Mark, if I have to bet on something in the clinical informatics realm for genomics, I would say it's explainable AI. Every time we build these machine learning models or artificial intelligence models and present them to clinicians or end users, the first question is: what are the features that led to this precision, or this recall, or this F1 score, or this sensitivity or specificity? The better we as a community are able to explain the features that are actually resulting in certain predictions, the better. I think that there is critical research needed in that space.
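A minimal illustration of the kind of explainability being asked for: a linear (logistic-style) model whose per-feature contributions can be shown to the clinician alongside the score. The weights and feature names here are invented; a real model would be trained on data, and genuinely black-box deep models would need dedicated attribution tools rather than this direct decomposition.

```python
import math

# Hypothetical hand-set weights for an interpretable risk score.
WEIGHTS = {"pathogenic_variant_present": 2.1,
           "family_history": 0.9,
           "age_over_50": 0.4}
BIAS = -2.0

def predict_with_explanation(features):
    """Return a probability plus each feature's contribution to the score."""
    contributions = {name: WEIGHTS[name] * features.get(name, 0.0)
                     for name in WEIGHTS}
    logit = BIAS + sum(contributions.values())
    probability = 1.0 / (1.0 + math.exp(-logit))
    # Rank so the clinician sees which features drove the prediction.
    ranked = sorted(contributions.items(), key=lambda kv: -abs(kv[1]))
    return probability, ranked

prob, ranked = predict_with_explanation(
    {"pathogenic_variant_present": 1, "family_history": 1, "age_over_50": 0})
print(round(prob, 2), ranked[0][0])
```

The decomposition is what lets the end user ask "which features led to this score?" instead of taking the number on faith.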
How do we create these explainable AIs, which are not black-box deep learning models, that end users can understand and use within their workflows? No, I agree with that. I totally agree with that. And I think the better that AI becomes, the more dangerous it becomes, because we start relying on it more without using our own brains. And we have to remember that AI is based on what it's seen. When you run into situations that the machine hasn't seen, which happens often in medicine, and you trust it, you run into problems. Great, I'd really like to transition a little bit because we have an attendee who's had their hand up for quite some time who would like to ask a question. So Suzanne, thank you for being so patient. If you'd like to take yourself off of mute, because I can't actually see your question. I just see that you have your hand raised. Would you like to ask your question now? Yes, thanks. I'm sorry, I'm just joining you today. And my question's in a little bit different area, but I think it was opened up a bit by some of Teresa's comments for the fourth stage of Sync for Genes, and also a comment that Dan made as well. I'm particularly interested in more of the consumer-facing angles, and particularly in thinking about returning reports to patients, research participants, and caregivers. Certainly individual institutions such as Geisinger do a good job on this; both in eMERGE and in All of Us, there are experts pulling together to create, or I would say attempt to create, culturally congruent, health-literate messages as well as visualizations related to risks and other aspects of these kinds of results. But as a kind of a research agenda, I think we really need to be moving more towards a computable visualization and message toolbox that a variety of sites would be able to use, with specific infographic components, to build some tailored messages.
So I just wonder if there are any reactions to that kind of notion and where it fits within a clinical informatics research agenda. Is that...? I would just add one thing to that as well, for the panelists to think about: should we have some of our community members attend these types of forums, and have them represented and doing presentations on the consumer end, so we make sure that their viewpoints are incorporated in our research agendas? So with that, I'll open up Suzanne's question and those follow-up comments to the panelists. So this is an area of a lot of interest to me. We deliver results to patients through the HerediGene project, and I also was at Geisinger, where we were delivering them through the patient portal. And that's one of the major problems. If you're an institution and you're trying to implement genomics, one of the first things they're going to ask is: who's going to own this and who's going to curate this? And most institutions don't have the resources or the people who can do that, especially as you expand it to hundreds or thousands of genes. And that's where, you know, I had the point in my slide that there are not a lot of public resources. And I think it would be great to have databases that have that information for patients and, frankly, providers. Providers are also not real well informed. There are a lot of primary care providers that manage some of these patients, particularly if they're not close to one of the main medical centers. And so having some type of resource built that stores and provides that information for hospitals would be great, and I think that'd be a great area to focus on. I mean, the cancer world has been doing this for many years, right? So the patient advocates are constantly part of many research groups. If not on a regular basis, at least there are quarterly meetings, et cetera, where we meet with patient advocates.
One thing that the Know Your Tumor program did was, while there was the 65-page clinical genomic report that was delivered to physicians, there was a three-page summary that was presented to patients as well. Sort of a sixth-grade reading level; layperson language was incorporated. I'm seeing some new innovation that's coming out in the virtual tumor board space where, you know, now some of these apps allow for patients to be part of the discussion, at least for some portion of it. And I think there are pros and cons to that. But some apps are starting to think about, you know, if my case is being discussed, should the patient be part of that discussion when the surgeon and the oncologist and the multidisciplinary team are talking about it? I think there are lots of other legal and liability issues that need to be considered. But I think we can definitely move in the direction of incorporating the patient more in these discussions. And so then based on, go ahead. Go ahead. Sorry, Erin. So we actually have patient representatives on the technical expert panel for Sync for Genes phase four currently, and we sought out individuals who truly represented patient views, particularly within the genomic space, given the nature of the project. And it's something that we're doing more systematically across our work. We have other projects that are very much patient-focused, developing platforms for apps and tools to share information with them. And those are going out into communities, actually pretty diverse communities, to ensure that we build tools that work for a lot of people and not just a select few, to speak to your comment. I think there's definitely an opportunity for an informatics research agenda specific to how to build and scale similar tools for patients and caregivers. I can't speak to, you know, whether a toolbox or something similar is the right approach. But I think it's a core question worth asking.
So as we build more clinical decision support tools, particularly proactive ones, how do we develop a responsibility model to decide who is responsible for ensuring information gets to patients? So, I mean, I'm not sure I completely understand that question. I think that one of the big needs now, especially if there's regulation like I mentioned, is having that information out there, having somebody curate this information, so that even smaller hospitals and primary care providers can have access to it. And, you know, maybe that becomes a commercial venture, but it would be nice to have some resource out there. I know at Geisinger and at Intermountain, we're looking at several different methods to deliver that to patients. We have the patient portal, but not everybody accesses that. We mail documents. We have a chatbot platform and texts. And part of that is, you know, you have to work through your regulatory processes as well. Yeah, so, I mean, patients have information now, right? They're getting their own sequence already. They're going to the private sector. So I think the issue is building tools to help them interpret information within the context of clinical care, and ensuring the healthcare system shares results in a timely way, with the necessary tools to help in interpretation and to have a conversation with your clinician. Great, is there an admin on here that could enable Heidi's audio? She's the one who asked this question, and she'd like to clarify it. She's coming through the chat box and the Q&A. I don't know how to enable Heidi Rehm's, R-E-H-M's, audio. Is there somebody on the technical side that could help me do that? Yes, I just enabled her audio. Thank you, Pam. You're welcome. You're welcome. Sorry, can you hear me? Yes, we can, Heidi.
Sorry, the question was more around, you know, we have this issue where an actual patient's report gets updated from the issuing lab, a very direct feed that needs to go to that patient, but the ordering physician has retired, left, or isn't caring for the patient anymore. How do we think about a responsibility model across the EHR so that patients will ultimately get the information they need? You know, in these settings where we're thinking about proactive, preventive notifications that have to go to a very specific patient from a very specific source. I'm not talking about vetting knowledge sources. I'm just talking about this direct feed. Right, I think that's a major challenge, and that was addressed by the ACMG attorneys; I'm sure you were at the meeting. You know, where does a clinician's responsibility end in terms of caring for a patient and returning results? If something needs to be updated, and the patient falls outside of that physician's care, what happens then? I know there's been some talk, and I won't name the vendor, about having that direct connection with the patient so that they could deliver any updates to their genetic test report. I don't know that I have a good answer for that through the healthcare system, but there definitely should be a way to notify patients, and perhaps what we need to look at is other areas in the EHR where that, or something similar, is happening, and look at some models that exist. Off the top of my head, I do not have a good answer for that one. I don't know if anybody else does. And I've always thought about it in the sense that, you know, if an alert goes out and nobody responds to it in the EHR, there is some time period after which the patient gets pinged and told: you should contact your physician; there's information that might be relevant to you.
You know, something generic, but it triggers the patient to ultimately have to come back in. That's always the way I've thought about it. But I feel like there needs to be some regular, you know, governance model for how we think about these types of interactions. Right, and the challenge with even that is that for a lot of the patients, for example at Geisinger, their regular healthcare doesn't happen through Geisinger. You know, they might have come in to see a specialist, but they actually have a primary care provider somewhere else, on a different EHR, that manages that. And I think, you know, ultimately it would be nice to have the patient have a portal with their own data, and have some standards there, so that whatever lab could send updates to the patient about their genetic data that came up. But then again, you have all the regulatory concerns, and then the concerns about follow-up and whether they'll interpret it right. There are just a lot of challenges around that. Yep, I agree there are challenges, but I think we all need to figure that out. The point being, this is an area that needs a lot of research. Yes, that was my ultimate point. And we have somebody chatting who would like to, based on Heidi's point, take that one step further. What if a result in the exome is relevant for a disease that's different from the clinical area for which the original test was ordered? We need a model where the result goes to a provider that can implement care for that patient. Yeah, so this is something that I am working on at Intermountain. And again, this is not entirely straightforward, I will say, because there's a lot of information and some of it won't be relevant.
So we return all of the ACMG reportable conditions and a few others, but there will be other information that becomes valuable over time, and there will be information that is valuable when the patient exhibits a certain phenotype but might not be valuable without that phenotype; epilepsy is a good example. So having something work that way requires a very tight collaboration between the laboratory and the clinical space. Again, that's something I'm working on, but it's not at all straightforward, as I said. So that is an area also where a lot of research is needed. I'd like to then transition to Ben. Ben's had a couple of questions coming in. Ben, would you like to unmute yourself at this time and ask your questions directly? Sure, can you hear me? Yes, we can. Great, thanks so much, and thanks for the chat back and forth, Mark. So I guess my question is about artificial intelligence and machine learning. It's very exciting to see what can be done with data sets where the answers are known, but also with zero-shot learning, with applying these, as we discussed, to things that the computer could infer that it hasn't already seen, whether it has to do with natural language processing or machine learning. There's a lot of interesting stuff on paper suggesting this could work pretty well, but in reality, Mark, as you mentioned, do you see this as something that could be tried, should be tried, or is too fraught with issues? I was just wondering, Subha, especially in the work that you were doing with cancer genetics, if you see this, are you already doing this, or is this too challenging for lots of reasons? And thanks, and sorry, I know I kind of articulated this pretty badly; I know there's a lot to unpack here. Yeah, no, I completely agree, and I think it's a great point. I mean, I think we kind of need both approaches.
One is sort of the supervised learning, where we know the classes and we are predicting classes that are already known, and then the second category, where you're actually letting the data tell you what the associations might be. And I think large companies like Google and Microsoft, et cetera, are building these knowledge graphs just as an exploratory tool, based on the searches that people are doing and the other data that they're collecting. So I think we need both of those approaches; for the second one, we of course need very large amounts of data to really infer those relationships and those networks. And at least in my world, we start out with these petabytes and exabytes of data, but by the time you arrive at a data set that you can work with in your Python program, you have, I don't know, fewer than 500 cases, and I'll be happy if I've got a very good labeled data set with 500 cases. So yeah, I think we definitely need both, and I would even hazard a bet and say that the second area, which is the unsupervised exploration and discovery, needs more research, right? Because we definitely need to explore that and see what the data can tell us. Maybe those are not the associations we're looking for right off the bat. These are not the droids we're looking for either.

So I typed a response to Ben, but wanted to kind of weigh in that I think the place where I would be very excited about some research is actually at the interface of humans and AI, because a lot of the studies that have come out have, you know, a headline that says AI outperforms physicians at X task or Y task or Z task, which is true, but the thing that's missed is that in most of those studies, when you combine the AI with the physician, it outperforms both the physician and the AI. And so to me, what that's saying is that the AI approach and what we do in human cognition are actually complementary. And I think that that's been very under-researched and is a nice opportunity.
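[Editor's note: the two approaches contrasted above can be illustrated with a minimal pure-Python sketch. The data, labels, and class names below are invented for illustration; the "supervised" path uses the known labels to classify a new point by nearest class centroid, while the "unsupervised" path ignores the labels and lets a simple 2-means clustering discover the same structure.]

```python
# Supervised vs. unsupervised views of the same (invented) 2-D data.
import math

points = [(1.0, 1.2), (0.8, 1.0), (1.1, 0.9),   # one cluster
          (5.0, 5.1), (5.2, 4.9), (4.8, 5.0)]   # another cluster
labels = ["benign", "benign", "benign",
          "pathogenic", "pathogenic", "pathogenic"]  # hypothetical labels

def dist(p, q):
    return math.hypot(p[0] - q[0], p[1] - q[1])

def centroid(pts):
    n = len(pts)
    return (sum(p[0] for p in pts) / n, sum(p[1] for p in pts) / n)

# Supervised: the labels define the classes; predict the class of a new
# point from the nearest labeled-class centroid.
def classify(new_point):
    by_class = {}
    for p, lab in zip(points, labels):
        by_class.setdefault(lab, []).append(p)
    cents = {lab: centroid(pts) for lab, pts in by_class.items()}
    return min(cents, key=lambda lab: dist(new_point, cents[lab]))

# Unsupervised: ignore the labels entirely and let 2-means discover
# whatever grouping the data itself suggests.
def two_means(pts, iters=10):
    c = [pts[0], pts[-1]]  # naive initialization
    for _ in range(iters):
        groups = [[], []]
        for p in pts:
            groups[0 if dist(p, c[0]) <= dist(p, c[1]) else 1].append(p)
        c = [centroid(g) if g else c[i] for i, g in enumerate(groups)]
    return c

print(classify((1.0, 1.0)))  # supervised prediction for a new point
print(two_means(points))     # cluster centers found without labels
```

The point of the toy example is the speaker's distinction: the first function can only predict classes that were already named, while the second can surface structure nobody asked about in advance.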
And I mentioned in the response that Nephi and I had worked on a project where we were looking at a diagnostic decision support tool for rare neurogenetic disease, where you would enter information and the system would prioritize a differential diagnosis, which is nothing new. But the other thing it would do is say, you know, this information would have the biggest impact on this differential diagnosis: if you could answer this question for me, or these five questions for me, I could give you a very informed answer, but right now that data is missing. And so anticipating what the data needs are, which could then prompt a clinician to say, oh, I should either order that test or add this information because I didn't think it was important, illustrates how that interface could potentially be very productive.

I totally agree. And I just want to add that the important thing there, too, is that with artificial intelligence, the provider or the clinician needs some information about how that decision was reached and why the machine is making that recommendation. Otherwise, they're not going to blindly change care if they see something that flies in the face of their own judgment. So they need to be able to have, and I know it's complicated and often a black box, some understanding there. And again, I think someone mentioned that earlier: we have to be able to explain these models and why they do what they do.

So I think we're pretty much at time for this discussion. I want to thank the presenters in this session for their very interesting and provocative presentations. Thank you to our co-moderators, Dr. Kraugie and Rodin. Mostly Dr. Kraugie, huh? Mostly Dr. Kraugie. Yes, well, you know, I'm just trying to be polite, Dan. Those of us who work with you on a day-to-day basis, we all know, but, you know, we're trying to maintain the illusion for the rest of the audience.
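[Editor's note: the "which missing question matters most" behavior described above is commonly framed as expected information gain: rank each unanswered question by how much, on average, its answer would reduce uncertainty over the differential. The diagnoses, questions, and likelihood numbers below are invented for illustration, not from the project the speaker describes.]

```python
# Rank unanswered yes/no questions by expected reduction in the
# entropy of a (hypothetical) differential-diagnosis distribution.
import math

priors = {"dx_A": 0.5, "dx_B": 0.3, "dx_C": 0.2}  # current differential

# P(answer = yes | diagnosis) for each candidate question (invented).
likelihoods = {
    "seizures?":       {"dx_A": 0.9, "dx_B": 0.1, "dx_C": 0.5},
    "family history?": {"dx_A": 0.5, "dx_B": 0.5, "dx_C": 0.5},  # uninformative
}

def entropy(dist):
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)

def posterior(p_yes, answered_yes):
    # Bayes update of the differential given one yes/no answer.
    unnorm = {dx: priors[dx] * (p_yes[dx] if answered_yes else 1 - p_yes[dx])
              for dx in priors}
    z = sum(unnorm.values())
    return {dx: v / z for dx, v in unnorm.items()}

def expected_gain(question):
    p_yes = likelihoods[question]
    prob_yes = sum(priors[dx] * p_yes[dx] for dx in priors)
    h_yes = entropy(posterior(p_yes, True))
    h_no = entropy(posterior(p_yes, False))
    return entropy(priors) - (prob_yes * h_yes + (1 - prob_yes) * h_no)

# The tool would surface the question whose answer most sharpens the
# differential; a question whose answer is equally likely under every
# diagnosis gains nothing.
best = max(likelihoods, key=expected_gain)
print(best)
```

A question with identical answer probabilities under every diagnosis leaves the posterior equal to the prior, so its expected gain is zero, which is exactly why such a tool would not bother prompting for it.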
In any case, we're going to now take a 10-minute break for people to be able to get up and stretch. We'll be back at 1:35 Eastern time with session five. So thank you very much, go get the blood circulating, and don't throw an embolus.