 substantial component of electronic phenotyping, as well as consent and community consultation and a focus on data privacy and re-identification risk, because these were largely community-based repositories. In the second phase, as we mentioned, we expanded into pediatrics and added a large pharmacogenomics component that I'll tell you about. And in implementing that clinically, we also needed to do a fair amount of work in clinician and patient education, and also in clinical decision support. So the two phases have emerged. The first one, the first four years, really was focusing on how repositories linked to electronic medical records can be used in genomic research. And that included these five components that I told you about a moment ago. In the second phase, we recognized we needed to expand into the childhood age ranges, which had been a sensitive topic earlier, and we've been successful in doing that. We also added the pharmacogenomics and the clinical implementation. These are the biorepositories involved in the project. You can see the pediatric sites here, with a total of 105,000 genotyped samples down here, and about 329,000 electronic medical records. And the question came up earlier where the 3,000 number came from, and it was basically the size of the smallest of these biorepositories, so that they all can contribute to the electronic phenotyping and genome-wide studies. So these are the two pediatric biorepositories I mentioned. And building on this infrastructure, as I mentioned earlier, we can follow the eMERGE three-pronged strategy of discovering clinically relevant variants, assessing the impact of large-scale implementation on the cost and quality of care, and enabling discovery and implementation research in other biorepositories. 
One of the things that we made sure that eMERGE did from the beginning was to make its tools and best practices and guidelines available widely to the community so that others could do this kind of work. This is just a listing of some of the primary phenotype-gene associations that were found in eMERGE I, in response to Tony's question earlier, and you can see the FOXE1 association with hypothyroidism that I mentioned, but also a number of others that were first reported here. And just to give you a feel for where this fits in our genomic medicine research portfolio, these are the four components of eMERGE as they were funded over these four years. The original program was $26 million, and then a year later we added in two, three pediatric biorepositories. In that same year we added in a large pharmacogenetics component, and last year the Office of the Director wanted to add a substantial component in ethics and consent, and eMERGE is a very good place to do that. So the total funding for this is $45 million over the course of fiscal 11-14, or about 12 million a year. And then the other programs that we have in genomic medicine, I'll talk about in a little more depth, particularly those that touch very closely on eMERGE, but they're shown here. CSER, about the same size now, sorry, considerably larger size; ClinGen, our database of consensus information on genetic variants that can be used in clinical care; IGNITE, which is more along the lines of implementing in places that had not previously had any kind of genomic medicine implementation, such as family practice clinics and some of the military settings and that sort of thing; NSIGHT, our newborn sequencing program; and of course the Undiagnosed Diseases Program. So back in 2007 when eMERGE began, this is what most medical records rooms in most hospitals and physicians' offices looked like. A lot has changed in that time. 
We're looking really at electronic medical records in a very large proportion of these settings. And these are the kinds of records that I look at when I practice over at Walter Reed, or that other physicians are pulling up to be able to work with their patients. Lots of information in those. These are data from the CDC and the Office of the National Coordinator for Health Information Technology. And it was estimated in 2008 that about 9% of hospitals were achieving standards for health IT incentives versus more than 80% last year, and about 17% of physicians were using electronic medical records versus about 50% in 2013. So really tremendous growth there. And the anticipation is that more and more types of data sets will be added to these. This is a somewhat fanciful figure from a friend in Nature Biotech suggesting that a medical record could involve lots and lots of different kinds of genomic data sets, and they warn that the future primary care physician may need to cope with a staggering array of integrated patient data including genome sequences and biological networks. So a lot of opportunity in the electronic medical record area. Some of the needs are really quite critical in genomic medicine as opposed to other forms of medicine. Because genomic data are so vast, sharing them among providers and across time for clinical care is a real challenge. Any of you who've had an MRI know that you get your MRI on a disc and you take it to some other radiologist and they may or may not have the software to be able to read it; that is going to be compounded dramatically, probably, with sequence data and other kinds of genomic data. Updating the findings as knowledge accrues. So obviously, once one finds a bony abnormality in the brain, there's not a whole lot more that tends to be learned about it, though some things are. 
Obviously with genomics, there's a tremendous amount that is learned about these variants as time goes by, and somehow that information needs to be updated and provided to patients and their physicians. Providing clinical decision support: no physician can remember all of the 2,000 sets of guidelines they're supposed to know outside of genomics, and so decision support has been developed to address that, who needs a vaccine, who should have a pap smear, that sort of thing. But this will be needed considerably more, we suspect, in genomics, because there won't be that many people and it won't be so obvious who has a variant unless you're actually able to query their genome. There's a fair amount of quality improvement research that could be done using this kind of decision support or querying the medical records, including reducing incorrect or redundant ordering. Intermountain Healthcare has some data showing that the same genetic test is being ordered more than once, sometimes very expensive genetic tests, rather than being sure that you can find the results and carry them forward for other physicians. And basically implementing what have been called rapid learning health care systems, where a system is able to query in near real time the care that is being provided based on a variety of metrics, such as, if you have such and such variant, do you avoid such and such drug as is recommended, or is that even checked, that sort of thing. And that allows rapid feedback that can actually improve health care in real time. These are also good tools for patient education and self-management, especially to the degree that they involve patient portals and patient-accessible software. Potentially also for the identification of at-risk family members. It's unusual in this country to have family members' medical records linked per se, whereas in other countries that is not uncommon. 
But it is possible, potentially through social networking or other kinds of approaches, to at least alert family members that there might be something they should be discussing. So when we were trying to determine, as we often do, what the future of this program should be, we held a workshop, as I mentioned earlier, in January, asking for future directions. And we really asked the workshop participants to try to balance this question of where we should be in terms of discovery and implementation, recognizing that, given our strategic plan, we sort of felt that we were shifting eMERGE, at least in phase two, more into the implementation space, and so we expected that in phase three it would be almost entirely, perhaps, shifted into that space. As it turned out, the workshop participants very strongly felt that we should continue both discovery and implementation in eMERGE, as we discussed earlier, that in some ways the two go hand in hand and that it would be silly to try to segment them off. And that we should indeed conduct research on implementation, that that is a place where eMERGE is really quite well positioned because of its focus on electronic medical records and the fact that those really integrate into a whole variety of different things. So we were advised that eMERGE discovery research should leverage the rich EMR phenotyping that has been a hallmark of eMERGE and that in fact it helped to develop. We should use state-of-the-art genomic techniques, so they urged us to move away from genotype arrays as being earlier technology, and said that sequencing was really the way to go and that's where we should focus. Assess phenotypes of rare variant carriers, so when you have 100,000 people or more than that, you can find folks who have rare variants that result in loss of function, and I'll show you some data on that from our pharmacogenetic program. 
You also are at a great advantage when they have agreed to be in studies and to come back for iterative phenotyping, in that you can bring them back and actually ask, you know, what are the phenotypes associated with those. And then in sort of a new twist, we were encouraged to include an aspect of basically bedside-back-to-bench research, which we have not done well, and our genomic medicine working group is recognizing this and kind of urging us in this area, that potentially we could be searching functional databases or even generating functional data, although that may be something that we'd probably want to do in collaboration with a variety of other NHGRI resources, leveraging those such as GTEx or ENCODE, but it's something that we were encouraged to consider. On the implementation side, we were encouraged to examine rare but collectively common variants, so variants that are grouped either in a gene or in a pathway, etc., to inform a potential treatment or diagnosis. We were strongly urged to explore differences in implementation across diverse subgroups. eMERGE currently contains some biorepositories that are quite diverse and others that are not so, but the mix of them actually gives us a reasonable distribution of about 25% African-American, which we are happy with, but we would like diversity to be a little bit higher, particularly in Hispanics. Develop and test approaches to re-annotation as knowledge accrues, as we talked about, and to dissemination of that information and of those approaches. And generate data on efficiency, cost effectiveness, and ease of implementation, and on cost of care. 
And then in terms of the, you know, the unique strength really of where eMERGE comes together on discovery and implementation, really this was something that seemed to fit eMERGE better than any of our other programs and really is not being filled by any other group, which is to take recommendations for clinically actionable genes that are now out in the public domain and try to figure out, you know, which of those actually is actionable and which is not, because they were identified in basically selected people who were at very high risk. When you apply these in unselected people, how often do you come across variants? Well, it turns out very often. You've probably seen the papers from the exome data and others that show a high frequency, two to four percent, of significant variants in those. So we were urged to consider these variants in the ACMG list, and that need not be the only list, and it certainly won't be the only list a year from now when we go forward with this, but they include hereditary cancer syndromes, some sudden death syndromes, a variety of other things that are both uncommon and quite serious and actionable. And to develop approaches for dealing with them, like this sort of pop-up box, an example of clinical decision support that gives an alert that a patient has renal dysfunction, for example, and is being prescribed a drug that they probably shouldn't have. Similarly, a pop-up could come up that says, you know, they have a variant that makes them allergic to abacavir and they shouldn't receive that. That kind of decision support is currently in place for a rare set of genes, but that needs to expand. And then the integrated ELSI infrastructure that began with phase one of eMERGE and continued in phase two should look at local differences across IRBs in genomic expertise and promote IRB education. 
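The pop-up style of decision support described above can be sketched roughly as a rule lookup that fires when a drug order matches a stored genotype. This is a minimal illustration only: the `Rule` class, the `RULES` table, the `check_order` function, and the alert wording are all hypothetical and are not taken from any eMERGE site's system; the abacavir and HLA-B*57:01 pairing is used because abacavir is the example mentioned in the talk.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """One hypothetical drug-gene decision support rule."""
    drug: str
    gene: str
    risk_allele: str
    message: str


# Illustrative rule table (assumption: a real system would hold many more
# rules, curated from published guidelines).
RULES = [
    Rule("abacavir", "HLA-B", "*57:01",
         "Risk of hypersensitivity reaction; consider an alternative."),
]


def check_order(drug, genotypes):
    """Return alert strings for a drug order.

    genotypes maps a gene name to the set of alleles the patient carries,
    as it might be stored in the medical record.
    """
    alerts = []
    for rule in RULES:
        carried = genotypes.get(rule.gene, set())
        if rule.drug == drug.lower() and rule.risk_allele in carried:
            alerts.append(
                f"ALERT: {rule.gene}{rule.risk_allele} carrier, "
                f"order for {rule.drug}. {rule.message}"
            )
    return alerts


if __name__ == "__main__":
    patient = {"HLA-B": {"*57:01", "*07:02"}}
    for alert in check_order("Abacavir", patient):
        print(alert)
```

The point of the sketch is that the alert depends on being able to query a structured genotype at the moment of ordering, which is why a PDF report in the record is not enough.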
One of the challenges that we've had in implementation in phase two of eMERGE is that institutions approach this very, very differently. Some are willing to be very broad in terms of what is implemented, and others are quite concerned about either the medical-legal implications of that or other concerns in their populations. That's an area we're encouraged to continue to work on, also continuing to work on risks of re-identification, and talking to patients about what kinds of information they would like to have in their records and how it should be displayed. And then looking long-term, and long-term will vary obviously, for this program it would be over the four-year period of the grant, at what happens after returning results, such as potentially behavior change, change in medications, other kinds of things. I thought it would be worth just commenting on the fact that defining phenotypes from electronic medical record data is not simply extracting ICD diagnosis codes. These are very extensive and involved algorithms, and I'll go into a little more detail on some of our findings a little bit later. But there has been a really strong replication of genotype-phenotype associations based on the medical record, looking at associations previously published in other genome-wide association studies. And then that's actually been sort of turned on its head in the phenome-wide association study, where one variant or strong variant has been looked at in a variety of medical record phenotypes to determine what else it's associated with, and could those be hints then in terms of other diseases or other functional pathways? In addition, this paper in Nature Biotechnology in 2013 showed a systematic comparison of phenome-wide association study data and genome-wide association data, basically comparing the hundreds-by-hundreds matrix and identifying a number of new associations that I can show you a little bit later. 
And this was an interesting approach on mechanistic phenotypes that was looking specifically at this: if one looks outside of electronic medical record phenotypes into other ontologies and other functional kinds of annotations of genes that have been associated with various phenotypes, what hints could you get in terms of disease mechanisms? And this is just to give you a hint of what some of these algorithms look like. Many of them go on for pages and pages of code and they are iteratively developed. They are then validated against a clinician's diagnosis, and then there's a back and forth across multiple sites. One of the unique things about these is that they actually are designed to be transportable across different medical record systems. And this is just the phenotype workflow. I won't go into it in a great deal of detail other than to say that there's creation, validation and sharing of these. These are the various tools that are used in doing so, and then publication. And this just gives you an idea of one very powerful way that one can use longitudinal data. Shown here are data from 25,000 African Americans for up to 16 years. And this is estimated glomerular filtration rate, which is probably the best measure, the most commonly used measure, of renal function. And what you see is the level below which one is viewed as being abnormal or having chronic kidney disease, and then the normal range. What Erwin Bottinger and his group at Mount Sinai did was to basically take these data and cluster them, and they came up with nine clusters. I'm only showing you four of them here. But they were clustered using the same kinds of algorithms that we use for genotyping data or sequencing data, et cetera. And they found four main groups that they then related to numbers of risk alleles and other diseases. There we go. And then the physician-diagnosed chronic kidney disease and acute myocardial infarction as a complication of renal disease. 
And as you can see, the group of normals here tends to be normal on these and have low risk of complicating diseases. A group that could be identified as rapidly progressive with chronic kidney disease had much higher rates of APOL1, as did kidney transplantation patients, as did those with end-stage kidney disease. So this is just an example of what can be done with longitudinal phenotypes in these kinds of programs. And these are tools that eMERGE has developed and made available. PheKB includes the phenotyping algorithms and instructions for use and also encourages deposition of others' phenotyping algorithms into it for further sharing. This phenotype harmonization tool actually allows mapping across multiple ontologies for other investigators and other kinds of fields, the NCI Thesaurus, for example, is there, SNOMED CT, et cetera, so it doesn't have to be an eMERGE phenotype in order to be useful. And there's been a large effort on physician and patient education. This is the myresults.org portal, which is designed mainly for patients but also used by physicians. You can click on "Your Results" and get actually a little bit of a tutorial in this kind of work, as well as other resources, recommended websites, and actually some nice instructional videos that were developed by the eMERGE investigators, and the number of those is growing. And then these are papers that have been published on consent, privacy and stakeholder concerns from the Consent and Community Consultation Group, as well as now the Consent and Regulatory Concerns Group. One of the seminal ones was actually this "Glad You Asked" paper, participants' opinions of reconsent for data deposition into dbGaP. So early on with dbGaP, we made the assumption that people didn't mind that we were using their data, that it was okay and we didn't have to really consent them. And what they did was to go back and ask. 
And basically almost all participants said, sure, you know, I'm happy for my data to be used in this way; however, I really want you to ask me before you do that. There has been work on anonymization. On returning results, it was actually a joint paper between eMERGE and the CSER actionable results group, and there has been work on stakeholder engagement, trying to look at other groups that might be involved in this as well. These are the site-specific genomic medicine implementation pilots. So we asked the groups to propose, when they came in for eMERGE phase two, what kinds of implementation pilots they would do. They were pilots because they needed to be relatively small to put together all of the added infrastructure that's needed for doing this, for consenting people, for counseling them on their results and following those up and educating, et cetera. So those are those programs. What I really wanted to do was to give you a feel for the eMERGE-PGX project. We've talked with you about this briefly as well. The Pharmacogenomics Research Network provided to us an array, a state-of-the-art pharmacogenetics array, as well as a number of other areas of expertise. And then eMERGE provided sort of a large population, less focused pharmacogenetic labs, and the electronic phenotyping, as well as work on the privacy concerns. The idea was to deploy this particular array of 84 known pharmacogenes, recruit patients and obtain the data, then selectively implement genotypes in the electronic medical record as the institutions became comfortable with doing that and could be convinced to do it. And then to develop a repository, the SPHINX repository, for these data that the PGRN group could then go back and analyze. So there were 84 VIP genes, as they were named. 
They were sequenced with all of the coding sequence plus a NimbleGen capture of the introns so that you could get the splice sites, plus CYP2D6, which metabolizes up to 25% of prescribed drugs, was captured in its entirety with the introns. There's upstream and downstream sequence, and some probes for intron and non-coding sites. The average read depth, as we were talking about earlier, is higher. It's about 400, almost 500X for this kind of an array, and it's highly concordant with existing HapMap data. And then the implementation was based on a series of guidelines published by the Clinical Pharmacogenetics Implementation Consortium, or CPIC. There are many, many more of these even than shown here, but the ones that were sort of picked initially were the widely recognized and accepted clopidogrel, simvastatin, and warfarin, in mainly the adult sites, as you can see, because that's where they tend to be used. And then the pediatric sites used a smaller version, or some that were unique to pediatrics, as you can see here. And our hope is that as each of them implements these and develops the education material, the clinical decision support, et cetera, they share it with each other, and that is happening, so other sites are bringing other drugs online, and eventually we hope that they will all be doing more than they're currently doing. I wanted to share some initial results on this from the first 2,000 patients that have been sequenced in eMERGE-PGX, on 2 genes that are associated with sudden death and so were of considerable concern among the eMERGE investigators, as well as potentially among their clinicians and patients. There were 83 rare variants, that is, with a minor allele frequency less than 1%, identified in SCN5A, a sodium channel gene, and 45 in KCNH2. That's not unexpected, since one gene is about twice as big as the other. 121 of those had a minor allele frequency of less than 0.5%, and actually 92 of them were singletons. 
Three labs were then asked to assess their known or likely pathogenicity, and one lab called, of the 128, 16 of them as being known or likely pathogenic. The second lab, a little bit higher yield, 24 out of 128. The third lab, back down a little bit closer to lab number one. But what was most worrisome, of course, was that only four were called by all three labs in the same way. So what we did then was to take the 40 variants that at least one lab had called as being pathogenic or likely pathogenic. 48 people were carrying those variants, and their electronic medical records were reviewed. Five of them had cardiac conduction defects that are quite common in the population and are not really strongly known to be associated with defects in these genes, but they may well be. None of them had a history of what's called long QT syndrome, which is what these genes are associated with, or of cardiac arrest. No family history of cardiac arrest. But really the best phenotype here is the QT interval, shown here. It's prolonged in people who have these syndromes, Romano-Ward syndrome and Brugada syndrome specifically. And of these 48 people, only one of them had one measured QT interval that was abnormal, and that was during a time when she was metabolically abnormal, and that could have prolonged her QT. Interestingly, the variant in this patient was annotated by the three labs. One of them called it pathogenic, one called it benign, and one called it of unknown significance. So you can see the problem that we're facing when finding these, and I think the experience in CSER and other programs that are beginning to do sequencing is coming out the same way. Another challenge with this is that 12 of these 48 people had no EKG with a QT interval in their electronic medical record, and they weren't all children. Should they be called back to have their EKGs run? These are important questions. 
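The lab comparison described above comes down to set arithmetic on each lab's pathogenic calls: variants called by at least one lab versus variants called by all labs. The following is a small sketch of that bookkeeping. The function name `concordance` and the variant IDs `v1` through `v5` are made-up toy values for illustration; they are not the eMERGE-PGX calls, whose per-variant details are not given in the talk.

```python
def concordance(calls_by_lab):
    """Summarize agreement between labs.

    calls_by_lab maps a lab name to the set of variant IDs that lab
    called known or likely pathogenic. At least one lab is required.
    """
    call_sets = list(calls_by_lab.values())
    any_lab = set.union(*call_sets)          # called by at least one lab
    all_labs = set.intersection(*call_sets)  # called by every lab
    return {
        "any_lab": any_lab,
        "all_labs": all_labs,
        "discordant": any_lab - all_labs,    # at least one lab disagreed
    }


if __name__ == "__main__":
    # Toy calls only (hypothetical variant IDs).
    toy_calls = {
        "lab1": {"v1", "v2", "v3"},
        "lab2": {"v2", "v3", "v4", "v5"},
        "lab3": {"v3", "v5"},
    }
    summary = concordance(toy_calls)
    print(len(summary["any_lab"]), "called by at least one lab")
    print(len(summary["all_labs"]), "called by all labs")
```

In the talk's terms, the "at least one lab" set is the one used to decide whose records to review, and the "all labs" set is the much smaller group on which the labs agreed.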
So the clinical implications of sequence variation are really quite vast, and variants with presumed detrimental impact on gene function are often found. The phenotypic and clinical implications of these in people who are unselected for disease or positive family history are largely unknown. The burden of reporting these is about 2% in two genes for 2,000 people in this first pass in eMERGE. So when one gets up to 56 genes, nearly everybody is going to have something if you don't have some way of culling those down. And so reliable information is really needed on the phenotypic manifestations, and you need large numbers of people to do that because these variants are rare. And it is also clear that if you know you have a variant and it's in a heavily loaded family, that certainly helps to point you in the direction that it might be pathogenic. So getting then to what we're proposing in phase three, we want to continue the genomic medicine discovery and implementation research using large biorepositories with electronic medical records to identify rare variants with presumed significant impact on function, these loss-of-function variants, in what we estimate to be about 100 clinically relevant genes. There are 56 in one group's list currently. Others, Jim has a longer list, University of Washington has a slightly longer list, etc. And this will evolve over time, but we're thinking about 100 seems to be a reasonable number. Assess their phenotypic implications, and then with appropriate consent and education, report what actionable variants there are, and there will be debate and decision making about that, to patients, potentially to their families, and that's a researchable question as to how one goes about doing that, and to their clinicians. And then also assess the impact on patients, clinicians and institutions. 
eMERGE III would continue previous aims of eMERGE, including expanding and enhancing electronic phenotyping, providing clinical decision support, integrating genomic findings into the medical records for ongoing clinical care and research, engaging and educating IRBs and health system leaders as well as EMR vendors, and disseminating methods, tools and best practices. eMERGE is well positioned to have sort of a collective voice to get EMR vendors to really engage in genomics. They have not so far, even though there have been a lot of promises made about implementing family history or implementing some candidate gene information. And the proposed scope as described in the concept is 8 to 12 clinical sites, plus a coordinating center, plus genome sequencing and, we said, genotyping facilities. We don't anticipate a huge amount of genotyping being done, but we want to have the capability, and we don't think that would be a problem for anybody that can do sequencing. We propose 2 to 3,000 DNA samples per site that would be sequenced for 100 target genes within a CLIA environment, so that requires a lot of sample tracking and consent and validation, etc. And we anticipate that the protocol would be sort of laid down in the first year with review by our external scientific panel and could evolve as the technology evolves. We'd also want to explore potential bedside-to-bench functional assessments, leveraging existing data resources and perhaps generating others. And then also to expand the phenotyping library. We actually proposed to double it, and in reviewing with council reviewers, they suggested that might be a little bit too ambitious, and so maybe in the 60 to 80 range. And then the criteria for site selection, again, listed there, include population diversity and availability of high quality GWAS data, and that is one that we'll discuss, obviously. Availability of patients for CLIA sequencing and return of results. 
Completeness of the EMR data in the biorepository that's being proposed. And the ability to implement existing eMERGE phenotypes. And then also a broad range of disciplines and expertise. We were encouraged to sort of look at new applicants who might have more strength in population diversity or key areas of expertise, while smaller ones may not be able to implement everything that's needed in eMERGE in order to be able to participate very actively. So they could conceivably partner with other sites, as has been done in phase two. And then continue to evaluate existing sites on their ongoing productivity and their collaboration in eMERGE II. So just to come back then to this question very briefly of how this all fits. We have really sort of a spectrum of genomic medicine implementations. We talked a little bit earlier about how you can look at this on kind of two axes, the depth of patient characterization versus the breadth of implementation. And those are in dynamic tension. It's hard to do both of them to the max. And so we probably need a mix of programs to do this. The Undiagnosed Diseases Program and NSIGHT, the newborn sequencing program, both are using wide-scale sequencing as well as very in-depth patient characterization. In the UDN, the patients are admitted for a week, and so there's a lot of phenotyping that goes on with them. CSER is perhaps a little bit less deep on that but still very much sort of a dive into an individual patient. And they're focusing really quite heavily on individual patients, testing multiple models for doing this across different sites. Whereas eMERGE and IGNITE are on the larger scale; they are focusing more on evidence generation and system-wide impact, as well as dissemination in diverse settings. So that's how we see them kind of lining up as a whole. Looking at the two programs that eMERGE is most closely related to, you can see the size and scope of CSER here. 
eMERGE is much larger, 100,000 patients, also 10 settings. But in CSER there are diverse clinical scenarios that are really quite focused; in eMERGE, they tend to be much more network-wide phenotypes and shared network-wide genotyping. In CSER the focus is the individual patient clinical encounter, where here the focus is much more system-wide implementation. In CSER there's a focus on individual phenotypes versus a broad range of phenotypes. As it's the individual, sort of on a case-by-case basis, and you'll hear more about this from Jim, it's much more phenotype to genotype: people have given characteristics, and you try to find out the genetics that are associated with that. Where in eMERGE we have the opportunity to take the genotype and look forward to what phenotypes it's associated with. There's exome and genome sequencing in CSER; to date eMERGE has done genotyping and targeted sequencing, and again, that could evolve over time. CSER is focusing very heavily on standardizing sequencing reports as they come from labs so that we don't just get PDFs being put into medical records, but actually something that is computable and usable in the future. eMERGE is focusing more on standardizing the electronic phenotypes. And then these are areas where they touch and are working together, in terms of some aspects of medical record implementation, clinical impact, pediatrics and data sharing. And then lastly, IGNITE and eMERGE are shown here. IGNITE is up in the size range of eMERGE, 50,000 patients, five projects currently. It's in very diverse clinical settings, not expert or leading genomics expertise groups. And the focus is really on real-world application, as opposed to more on the evidence generation and developing approaches that could then be implemented in the real world. eMERGE tests novel implementation models versus disseminating current implementation models. There's the GWAS genotyping and targeted sequencing mentioned here. 
And Ignite has actually a very broad range, so some groups are doing family history, some are measuring a single variant, some are beginning to do sequencing. Developing and assessing CDS tools, or clinical decision support, is a major focus in Emerge, whereas in Ignite it's much more deploying these in diverse settings. And then Emerge is contributing to the evidence base, particularly on the pathogenesis of, sorry, the penetrance of pathogenic variants, whereas the evidence base in Ignite will be the effectiveness of implementation methods. So, and then these are the areas in which they touch. So I think I'll stop there and thank the Emerge investigators and particularly Rongling and Jackie and my staff. Simona and Ken are also working on the pharmacogenomics component. And stop there, I can go into David's questions or maybe we can ask, I think some of these may have been addressed or you may want to have other comments, but perhaps before we do that, it might be best to call on council reviewers. So Howard chairs the ESP for Emerge, might be good to start with you, Howard, and then we can kind of go around. Okay, thanks, Terry, for that update. There's a couple of things you didn't mention that, because we've reviewed this further recently as the external scientific panel, there were 332 publications that have come from this network across the spectrum of areas that you highlighted. So it's been a very productive group. Another thing that I don't think came out quite in the slides that were presented is that this group really in my mind more than most took on what do we really do? It was like applied ELSI. Like what do we actually do? Not what do we theoretically do? What might we do? What would we actually do? And the implications of that have been that hospitals and health systems that could care less about research have been able to use that information to make much more objective and reasoned approaches to how they implement genetics. 
Whether they should implement it or not is a different story, but they're doing it. And so I think that element there. And I'm not a huge ELSI fan, but I've been really impressed by what's come out of this effort because of the way it was done. The science of implementation seems like an oxymoron. I now understand that it's a real thing, so I appreciate that. The discovery part also, you had a limited amount of time, but there's been some really fundamental discoveries that have been made about mechanism of disease, not just an individual disease, but a class of diseases from an individual organ. So you weren't able to show it, but the results that hypothyroidism was associated with FOXE1, but that a bunch of other thyroid diseases also came from that same, I guess I should have, you anticipated my comment. So that discovery is nice, but then I don't know if you have the phenotype. So yeah, so they're basically going phenome-wide. Basically by looking phenome-wide, now there's a lot of thyroid diseases that have been implicated to this one. So there's now a new way of approaching the entire organ that was not there before Emerge started. And that's something that nobody could have done otherwise or didn't try to do, I guess, maybe should say. And so I guess that part, I think, has been quite exciting. It's still not clear to me exactly where this is gonna end up in terms of this balance between research and application. And I initially thought that was a problem in terms of even having an Emerge 3, but I've come to the conclusion that the fact that I don't know which way it's gonna go is why we should have it. Because if you look at a lot of other disciplines that have been left to market forces for want of a better term, some of them have done very well and some of them, we're in a bunch of crap because of where things went. Water flows downward. Doesn't go up most times. And so I am supportive of this because of that. 
That's not so much a question for you, but some comments. Okay, great. Oh, thank you. So maybe we can just go alphabetically, Eric. I think you were a reviewer as well. So I'm supportive of the program and also what I did is I took it upon myself to go to dbGaP and try to understand with Rongling's help what data's being made public. I thought that was an important part. And I have to say I was pleasantly surprised about, it's not something that the program, I think, has trumpeted to the community, the data that's being put into dbGaP. So I think a tick mark in the project's favor is the availability of data to the community. So others, not just Emerge investigators, can begin to mine these data and look for the relationships that Howard alluded to. By the way, as Howard's counsel, I like to say he's highly committed to ELSI issues in research. He would like to modify, I think, some of the research methods, but he's committed to ELSI. I think the other point is clearly that this is a direction of the future. So I guess I come down on the side of the implementation part. I think there are other projects that are doing discovery. But where that balance is, let's just let it be what it's gonna be and let the data show us over time and not try to second guess. My only concern is, I think as NHGRI, we need to be promoting genomics in science and in practice and in society. And I think limiting it to a select handful of candidate genes I don't think is forward looking enough. I know there are cost concerns. I know there are issues about returning of results. But I don't think the solution to those issues is to run from it. I think the solution is really to push genomics into the project and to deal with the issues. And we're all anticipating in the coming, I'll say year, continued cost reduction in genomes and exomes. And I think Emerge should leverage that and really push it into the project. Great, thank you, Jim. 
The main thing that I would point out, I think that Emerge has a lot to offer and where it has the most to offer, I think, is in this interface with the electronic medical record. We are seeing, as two of the slides you showed, a tremendous increase in the EMR. That's a huge headache for many of us in practice. But in the end, if it is done right, there are unprecedented opportunities and Emerge is really, really well positioned to mine the kind of data, the kind of information that we need to mine. Because that information is being collected all the time. The problem is it's being collected in extraordinarily haphazard and sporadic ways in clinical encounters. But with the EMR now, and especially if vendors are engaged, there are tremendous opportunities for beginning to associate genotypes and phenotypes. And, as I'll mention in my talk, I'm getting to some of the things that remain a fundamental mystery, which is surprising in some ways, like penetrance of very well-studied genes. And I think that both Emerge and, as I'll mention, CSER are well positioned to answer some of these questions. So to me, I would hit very hard on the interface with the medical record. As I'll mention, Eric, I think your point is a good one that yes, we need to do whole genome sequencing on lots and lots of people. I think it remains to be seen whether we need to do that in a medical context. And I'm not really convinced of that. In a research context, we do. And I think we have to hit the right blend. So are you saying Emerge is not a research project? No, no, it is very much. But I think we have to figure out where we... That's right, that's right. I think we have to figure out where we put resources in the sense of whole genome sequencing and where perhaps there's more bang for the buck in more targeted approaches. Especially if you're using them clinically, you want to be sure you've sequenced them. Well, yeah, yeah, that's right. 
And then we have Shanita and Lucila, so Shanita, please. So I'll just start by saying if Emerge has convinced Howard to have greater respect for ELSI issues, I'm all behind it, even more so. Um, but... Greater respect, the respect... Okay. That's right. Um, but I would just echo the positive aspects that have been already stated. I, um, during sort of our previous conversations, had questions about sort of the balance between discovery and implementation. And I think that your presentation addressed those very nicely. I would encourage sort of greater focus on implementation because I think that there's a huge amount of information, knowledge that's been generated. And I think this group is ideally poised to address how you sort of translate that into patient care and sort of understand the consequences, both positive and negative, of that. I think the other thing that's a strength about Emerge is that it's... The consortium addresses a diversity of clinical issues, medical issues that are rare and sort of those that are more prevalent in the population. And I think that's a really important addition that needs to be continued. And I guess the other thing, as you were talking, one of the things that struck me was, you know, a question around how do you resolve... And this seems to be the $64 million question. I'm trying to remember that show, but I think I've... It was 1,000 back then. Okay. Yeah. Yeah. Yeah. Um, you know, but an issue that cuts across, I think, all or many of the consortia that NHGRI has established is, well, how do you resolve the discrepancy between the interpretation of variants across laboratories? And to me, that's, you know, that's an issue that all of the consortia are facing. And I don't think I've heard any conversation about that. And there's a... I think that's something important that needs to be addressed. Agreed, yeah. And I would hope that Emerge and CSER will both be addressing that. 
You heard at the CSER meetings last week, you know, a tremendous debate and discussion about how one goes about doing that. One way to do it is to look in the medical records of, you know, hundreds of people and say, how many of them actually, you know, have the phenotype, the feared phenotype with which this is associated? How many have positive family histories of it? You know, how many have absolutely nothing? So that's what that's... Yeah, I would actually argue that the example you gave kind of may have answered that question for many of those variants. And that is when you look in lots and lots of medical records for variants and you find no hint of a phenotype, it's telling you something, right? So I think the negative information can be very... Useful, yeah. Great, and then we'll have Lucila last and then Joe. All right. So the good thing about being last is I don't want to repeat all the good things. And I do agree with the uniqueness being in the electronic medical record connection to the genetic findings. I also think that perhaps we're underappreciating the difficulty of doing phenotyping by itself, because when you say look at the medical records, these are clinical notes in narrative format that you have to extract the information from. So it's not that easy, or even with the structured data, to find all patients with diabetes type two should be a simple click, but we know it isn't. So all these recipes for phenotyping, I think, are a great contribution, as were the natural language processing items that were not mentioned, but I know from the publications. So I also agree the implementation is critical and that's why it's at our face in the electronic medical records. So it's a great program. Great, thank you. Joe? Yeah, so I think a question that was raised earlier but that didn't get addressed here in this very comprehensive presentation that you just gave, Terry. 
Is this question again of 100 or so targeted genes versus doing say whole exome sequencing? And I probably should have these numbers at my fingertips, but I don't. I'm not sure what exactly is the difference in cost in doing those two different approaches, but it just, it seems to me that we're talking about low penetrance variants in any case, rare variants, and it seems even more reasonable that the variant may or may not have a phenotypic effect depending on some other variant. Now whole genome sequencing at the moment, of course, is much more expensive but you're gonna have to do some kind of hybrid capture or something like that anyway. And I just wonder if you really wanna get the full power to consider doing the whole, at least the whole exome maybe if you need to reduce the number of centers or something like that. I'm not sure what the trade-offs are Terry, but I just think that that's gonna be a really important piece here. I mean, environmental things and so on and so forth. Maybe you'll get that out of the medical records or not. And I think the point Lucila raised is absolutely right that the whole ontological NLP issue around these medical records could kill you, might not, but the people I know who have done some very interesting work in pharmacovigilance, for example, have spent huge amounts of their time doing that kind of study of the records themselves and trying to get them in much better shape for data mining and other things. So you raise a couple of points, do you wanna address? I was just gonna say, I mean, I think you're right. The issue is where council and where the NHGRI ultimately comes down on this question of implementation versus discovery. Because if you're really talking all about discovery, then it makes abundant sense to cast a genome-wide net. On the other hand, if you're talking about implementation, there frankly aren't that many genes in the genome that matter to people's health that we're aware of. 
So if you're talking about implementation, you can get a whole lot more bang for the buck, I think, by confining yourselves. And I'm not saying you should do one or the other, but it really does come down to where we land on this issue of discovery versus implementation. Oh, no. Click, click, that'll commercial, they're both right. I mean, I understand what you're saying, but I also say, Joe and I probably illicitly were having a little side conversation here. Depending on cost, does it make sense not to do this broader coverage and then to bring somebody back for some very expensive downstream testing or even be invasive? I mean, I don't know. And I'm not coming down one way or another because again, I don't know what the economics of the situation are exactly, so. I can shed a little bit of light on that. So it's about a three-fold difference based on the targeted genotyping that we're doing for Emerge PGX, which is 84 genes plus, as I showed you, the introns and then a couple of genes done completely. So about $600 versus $1,800 today, obviously those costs are gonna come down. Whereas the whole genome, the cost we're having in our other programs is about 4,000 or so. We're not talking about whole genome, we're talking about whole exome. Whole exome is 1,800. And you're talking about 100 plus, right? Right, so 1,800, but remember these have to be in a CLIA environment and validated, and that's what more than doubles the cost or even a little bit more than that. This is a really tough question and I welcome others' thoughts on it. I think what we may need to do is to write a solicitation in such a way that we are open to a variety of approaches that need to be justified by the investigator. Rudy is nodding, oh, excellent. I've learned from you, you see. So that may be... I think the other thing coming on Lucila's comment, the discovery aspects of Emerge and the effort put into discovery in Emerge in large part are due to the electronic mining. 
Developing these phenotypes, each one of them takes months and months and months and lots of people involved and several iterations across multiple medical record systems so that they're interchangeable so you can use them in multiple different places. And those are all preparing for discovery, if you will, but that's the effort that goes into doing the discovery in Emerge. Other questions? Getting back to your comment about implementation, those records and the ontological and natural language processing issues around those records are just as important for implementation as they are for discovery. Right there, but in my mind, I see it in kind of two phases: we do know a handful of genes that actually do seem to be important in clinical medicine, and cutting our teeth on those, which we need to continue to do, might make sense, and then as costs drop, casting a wider net. But again, you're right. Both of those views are right, but we are gonna have to make some hard choices because it's a zero-sum game in the end. I'll step into this moment of silence. We do need a vote for a concept clearance, and I guess I'll ask to accept the document that Terry made available. I've heard a couple of comments along the lines of giving some flexibility to allow the applicants to propose their own balance of discovery versus implementation. And in terms of the breadth of sequencing that would be done, are you comfortable allowing that latitude rather than trying to pin that down here? Of course. Before you go there, Terry, you had teed up some slides. I think you attributed the questions to David. A series of questions there. Were you gonna go over those also? I can if you. I mean, I'm not trying to cause you trouble, but on the other hand. Well, Rudy was telling me, I'm already way over time. Never mind. How about those Nationals? Yeah, I don't know. David, if we need to go through all of them, or maybe you can... 
I guess I was assuming that since you posted them, you were gonna address them. Sure, sure. And I was hoping... Rather than my needing to repeat them. No problem. I was hoping that some of them were addressed so you were gonna tell me which ones you were happy with. But in terms of why NHGRI needs to stimulate this, or 10 years from now, if we didn't do it, would anybody notice. I think the key question that Emerge can address here is what are the phenotypic implications of these rare variants that are being used clinically, that are coming up constantly, and nobody knows how to use them, or what to report or what to implement on. So that is kind of the overriding issue. In addition, the electronic phenotyping and clinical decision support being done as a network so that it is transportable and usable across multiple systems is something that isn't being done elsewhere. I guess at these council meetings, we're constantly... The trick with considering any given concept clearance is in life we're always considering alternatives. The problem is when we debate the concept clearance, we take this question in isolation. Somebody has to speak for the unborn. That is, what is the unnamed alternative? I think that with every grant or with every concept, we have to ask ourselves and really rigorously frame the subtraction question, which is, if you really look out across the fullness of time, and of course we can't know, we don't have crystal balls, but is this a space where NHGRI has unique strengths and can make a truly unique contribution that is unlikely to be substantially filled by other parties? And earlier today, I think there was at least some acknowledgement around the table that there is a good bit of activity in related spaces, maybe not exactly in this space, but in related spaces with large private investments. If it's appropriate to talk about that when we talk about DNA sequencing technology, surely it is appropriate to talk about it in this context. 
And so I would say that from my point of view, it's not obvious that NHGRI has a unique compelling contribution to make in this space. So I'd love to hear some debate on that question. I don't think it's a trivial question. I think it's because of course, by committing a large body of funds in this direction we are implicitly, not explicitly, but implicitly directing away from other things that NHGRI might fund. But I mean, one question I'd have is, how fair is it to ask the question what has the Emerge effort to date brought us? And was there a similar amount of similar work being generated with non-ENCODE funds in the same area? And I think the overwhelm, and now obviously you don't know what the next five years will be, you can only say what the last five, but there's been an overwhelmingly productive amount of publications and major advances by ENCODE-funded work, and I'm not sure that's even been matched by non-ENCODE, I mean, Emerge, Emerge-funded work, and I don't see it being matched in any way by non-Emerge-funded work in that area. So one yardstick to do this is to look back a little. I mean, I think it's pretty impressive. Maybe one way to answer that question is if the private efforts are working on how this relates to their institution, how translatable is that across the many other hospitals that don't have the resources to do that? Emerge, if it's successful, and I think it's on the way to that, should provide that translation, as you mentioned. That's the expectation. Joe. So Rudy, you said, I forget if it was Rudy or Terry, you said something about allowing more latitude in terms of whether an applicant said they were gonna stick to this hundred-specific list of genes or whether they were gonna do exome sequencing. Obviously, that would affect the size of their proposals. 
And so doesn't that mean that you're also gonna have to give some latitude in terms of the size of proposals, which will then affect the number of centers you'll be able to award and so on. These things are not independent. And I just wanna make sure that if we're really saying, yes, there's latitude about what approach you're gonna use in terms of are you gonna do whole exome, are you gonna do 100 targeted genes? Yes, that will be peer reviewed, but then the whole exome might need x amount of money to get the same power, the 100 genes might need. I mean, I'm just trying to understand, I tend to be very practical, how this is gonna work in terms of the whole program from a practical point of view, or are we really saying, no, if you do a whole exome, then you have to do many less patients or you have to do this or you have to do that. I just wanna understand. So I don't have those answers for Joe. I'm not sure that we can have them in a week. I was responding to something Eric said, this is how you drive a field forward. You put a challenge out there and some are gonna send in very safe applications, narrowly focused on 60, 70 genes, and other people are gonna find more creative or more effective or more efficient ways to do things, which will put their application a little bit at risk in terms of the peer review, and then we'll be back to council for your guidance when it comes time for a funding plan. That's what I was envisioning here. Jim, did you wanna comment? Let's ask a simple question. Did the patient consent preclude going back to samples you collected? These would almost certainly, yeah, they would almost certainly need to be reconsented for this kind of work. Yeah, but to address Joe's question, Joe, to have sort of a level playing field, and we need to be careful not to be writing the RFA in this, in council. No, I understand, but we need to have a sense of how that RFA is gonna play out to approve or not approve the concept clearance. 
So one way to think about it is that just as an investigator needs to make tough choices on cost and scope, so does the institute, so does the broader world. And it seems as though somebody who's proposing to do a much broader exome or genome project is going to have to focus, they're gonna have to compensate by having fewer patients and then make the case that they are actually doing something that is useful in clinical implementation as well as in discovery. What we would anticipate is at the end of the day everybody would do the same thing unless there's a really good reason for it to be done differently, because as we find in Emerge, it's really the power of having the large numbers that's the strength. So CSER is a program that is really focused on multiple different models. Let's try the different ways, different sizes, et cetera; in Emerge it tends to be much more of a network-wide, system-wide approach. Back to these questions, okay. So the question about requiring existing GWAS data, and again this is something that we're struggling with and would welcome your advice. When we invite other groups to participate in Emerge who are not funded as part of Emerge, we expect them to participate by contributing data and being involved in the various discovery aspects and also taking Emerge phenotypes and applying them to their data. Well, the only way you can really tell whether they work is if they get similar kinds of association. So that's the reason for having that. We could conceivably make the numbers less or whatever, but we'd welcome your input on whether to keep that. I would suggest decreasing the numbers at least, because to have more proposals by different institutions in addressing perhaps some phenotypes that are less common you would need to decrease the numbers. Although your power really is going to drop off if you're trying to validate a phenotype in an existing, so we can't get too small but maybe we can play with the number a bit. 
Maybe we phrase that to make it clear what you want, because if somebody has 10,000 CLIA-level whole exomes worth of patients and they think, oh, I don't have GWAS data though, it's whole exome data, oh well, I guess I won't apply. You know what I mean, nobody would do that, but just be clear what you want because you don't want GWAS. You want data on which to make discovery and do implementation. Oh, that's a great idea. Should there be some floor or just you make the case? You, investigator. I can't imagine making a solid case for small numbers. I mean, how would one do that? So nor can I justify 3,000 as a magic number. So I. You know people are going to call and they'll call my colleagues and say I have 873, is that enough? And we could. Say no. We could. They need just the power calculator. Yeah. I mean peer review. Okay, we can throw it on peer review and that would be fine. All right. Okay. Let's see, significant achievements. We have done the best job we could in describing what those were. I have other slides that describe others and I might note that we have external scientific panels for all of these programs. Howard chairs the one for Emerge. They look over the productivity, the papers that are coming out. They suggest directions that they should go in. They've been reasonably happy with Emerge so far even though Howard doesn't like ELSI, but no he hasn't. Anyway, one of the interesting things about Emerge is that the ELSI component is really integrated throughout. It's not sort of a separate thing. We insisted from the beginning that ELSI be part and parcel of the work that's being done. And I think that's been one approach that we found successful in this program. Okay. Distinguishing features, breadth. And again, yes, yes, that's right. How to judge the success or failure. I think one of the reasons that we have broad programs is that we are trying to make them applicable and relevant to lots of different places. 
And so we expect them to disseminate their tools, their approaches, and for those to be taken up and used. So PheKB is being used quite a bit, as is eleMAP. The Emerge Consent Forum I believe is also widely used and some of the results, MyResults.org, et cetera, just started so we can't really judge them. And then, yes. One of the things, because this has come up multiple times, not just today, I think it may be worth putting in the document somewhere that the group should define what the metrics are for success, because it is hard, but it's not impossible. I mean, you could define it and it could be that by defining what metrics you care about, it makes it clear what your application is really trying to achieve. I do think that the group, certainly in the past two phases, the group has had to define milestones and when they didn't meet them, the ESP came down hard both financially and verbally, and then other times, they met them, cleared them quite nicely. So I do think having, you know, there's an opportunity again for the marketplace to decide what the metrics are or at least contribute to that discussion. Good point. So I think the reason I raised that question was in a sense when I read the concept clearance document, it is, I mean, the goals of Emerge one, two and three are very lofty, they're very broad, they're very ambitious, and in some sense, I'm trying to ask the question, let's say, we go forward with this and four or five years from now, when one looks back upon what will have been by then, it's already seven years in the making, right? We're already seven years into Emerge. I mean, if we're, you know, after 12 years of Emerge, how are we gonna know that this actually made a difference? And in what of the many domains that are touched upon will it be easiest to point to the success or assess the failure of the program? 
I believe, I do not believe that the number of publications is an at all useful measure of the, I mean, if you give people money, they're gonna publish papers and those papers are gonna attract citations. So I don't think that's, for me, that's actually the opposite of an answer to the question of how we'll judge the value of a program. And so I, and also it's a way of, I think, as Howard is suggesting, of focusing the mind, right? As we talk about it as a concept clearance now is, you know, if it goes forward in thinking about how applicants think about it, as we will talk about this with other groups of scientists and physicians, you know, I'm almost looking for what's that, how will we know that in the fullness of time that this thing has really succeeded? One of the measures, and I agree with you, publications aren't the only yardstick, of course it's better than the opposite. I think if the consortium has all this money and they're not publishing at all, I might be a little suspicious. One of the things, and again, it's totally anecdotal, but maybe a lot of things, is how much is a consortium like Emerge being discussed in other venues and other contexts by other people, including outside the genomics community. And anecdotally, I hear about this all the time. I mean, it was interesting, Terry went over it very quickly, but all of a sudden, you know, the Office of the Director wanted to do a study, needed to do a bioethics study, and we weren't campaigning for more money, in fact, I'm not even sure we wanted it. We're happy to have it. All of a sudden, this became the venue for doing something that needed to be done. Discussions just around genomics, the implementation of electronic health records, it just seems like lots of people who have no vested interest are bringing this up, that it's contributing, that it's the go-to place for some of these questions, and certainly for the expertise. 
It's not, it's just yet another measure, and it's anecdotal, but I hear it all the time. Or it being relevant, and that Emerge is a relevant part of the conversation for some hard problems around implementing genomics in medical care. May I suggest also if there could be a baseline of how many institutions currently have clinical decision support on genomic data, and where we'll be four years from now, that would be a measure, that perspective we can get. And it's just like there were 8% of institutions with electronic health records and now there are 80. So if you find the same thing, there were 4% and now there's 40, it's probably at least an objective measure of how the field in general is doing. It might not have been because of Emerge, but you can probably trace some of those aspects back to it. In a sense, it's actually quite related to my last question, which is at least in the language of the concept clearance as we saw it, the goals were framed in terms of health impacts, which I interpret to mean health outcomes for individuals and cost effectiveness. And so I'm not hearing that, in the answer to the last question, I'm not hearing any of that as part of the metric by which the success or failure of this program will be judged. So I don't know whether that's, whether those words in the description of the concept clearance, whether there's meat on those bones, or whether. I was just gonna say, some of this conversation might bleed over into a post-CSER conversation too, because I actually have several slides on outcomes, et cetera. And one of the things I think council will need to do that's really hard is to figure out how these different efforts mesh. And so I'm just throwing that out there that maybe there'll be an opportunity to discuss the same issue in a few minutes. No, I think that's right. And in addition, I think a lot of this is in the hypothetical, David, we're just beginning in some ways to find, you have sequence variations. 
So we have these 44 or 48 people or so. All of them have these variants that at least one lab thinks are important. Do they keel over dead? Did anybody in their family keel over dead? Do they have the phenotype if we stress them a bit, give them drugs that prolong the QT, do they respond in that way? All of those are health impact. What's the cost effectiveness of actually investigating those people, if you prevent one death, is that a good thing, a bad thing, et cetera? What's the cost of doing this kind of testing? What's involved in implementing it in the medical record? Those are the kinds of things that we would do with these actionable variants that would get at that health impact. Changes in medical decisions and adherence to guidelines, those kinds of outcome inputs are attainable. Number of babies cured is probably not. Not. Or at least it'll be a difficult number to come by. For instance, in this particular case, if I could just comment, the one that I described for you, it turned out that that patient had had a long QT, her clinician had not been aware of it, and when made aware of it because of this genetic finding, said, you know, I probably should watch what I prescribe this person. Now, probably they should have been doing that anyway, but that's something that probably is changing her care. Although just on that point, didn't she say they had a metabolic derangement? At the time. Yeah, so I'm not so sure about that conclusion, but anyway. Well, you know, at least watch. Let's watch whatever. Yeah. What I just wanted to say was, you know, David's point is a really, really good one. And at the end of the day, it's about outcomes, and it's really true, and it's something I hadn't really picked up in going over that, it might be worth mentioning that in the RFA, right? That if people have facile ways of beginning to look at outcomes, I mean, because that is really the only thing that matters in the end.
Right, and I would say that in the Emerge PGx program, our RESP has looked at, you know, process outcomes as well as, you know, sort of health outcomes or clinical behavior outcomes, those kinds of things. And so those are definable and can be included. Other questions or comments? I mean, I think we've had a good, healthy discussion about this concept. So I think it's time for a vote. Can I have a show of hands of those who are approving the concept? If you would like to suggest a caveat, I would, well, I had started us down that path, but I'm not sure that I captured it sufficiently. I was saying that the RFA would be more open to, as this concept is stated, the idea of whole exome or whole genome sequencing isn't there. But we would... No, I'm sorry, it is. I mean, it says targeted exome or genome. Okay. You know, and that would be decided, you know, closer in time based on a variety of factors. Okay. So maybe an amendment is not necessary then. It says whether targeted exome or genome. So that's saying targeted exome and exome. Correct. I'm sorry, targeted exome comma. Exome comma, you know, parenthesis. Oh yeah, that's right, yeah. Eats, shoots and leaves, yeah. We're not voting on that today, so. All right, so perhaps no amendment is necessary then, it is built in. Yeah. Okay, make sure the commas are there. We got that. So let's try the vote again. Those in favor? Any opposed? So noted. Thank you. Oh, I'm sorry, any abstaining? Thank you, Jill. And thank you, Terry. Sure. All right, Jim, we're gonna try to sneak you in here.