Okay. Well, Steve, if you're there, we'll segue immediately to you.

I am here. Can everybody hear me?

Yes, we can.

Good. If we go to the next slide, please. In retrospect, if I had used a smaller font like Howard did, we could have gotten these all on one slide and looked at them at the same time, but they're broken up into two slides. On this first one, you should have the sense by now, from both Marilyn's and Debbie's presentations, that the consensus of our group was that discovery research should remain a high priority for the future. Working with the phenotyping group, one step is obviously to decide which traits or phenotypes are high priority for the next phase of the network. The topic for discussion, and some of this has been occurring in the online chat, is whether to go with existing data and work on improving the analytical tools and methodology to use it, or whether there should also be some effort toward denser data generation, whether by next-gen sequencing or exon arrays. As a non-genomics person pursuing this more from a pharmacogenomics perspective, I would add that existing data could also include the longitudinal phenotypes several people have alluded to, used in the context of contribution to disease progression. This also lends itself to gene-environment interactions, as well as to the impact of therapeutic interventions on the trajectory of progression.

The second point that came out of our discussions was the issue of not throwing the baby out with the bathwater and recognizing the importance of rare variants. Up for discussion would be the most appropriate platforms to capture them, whether genotyping or sequencing, but also the resources that may need to go into developing appropriate tools to detect their effects. Next slide.

This next point gets at one of Debbie's last points: considering study designs other than a straight GWAS format for discovery purposes. For example, the design she described, looking at extreme discordant phenotypes, at least for continuous variables, and coupling them with your platform of interest; I put whole-genome sequencing here. This particular approach has the potential to be more efficient at identifying causal variants, especially rare causal variants.

The last point that our group would propose to the larger group as a whole, again something mentioned by both Marilyn and Debbie, is looking at other sources of genomic material: RNA, or going back into the DNA and looking at methylation, for example, for these genomic analyses. And on the EMR side of things, looking at how additional data can be captured or parsed for environmental factors and comorbidities, for gene-by-environment interactions, for example.

So those are the four issues raised by our group to pose to the rest of the group for thoughts and comments. I'll toss it back to the chair.

Okay, the floor is open for comments or questions on EMR and genomic discovery.
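To make the extreme discordant phenotype design above concrete, here is a minimal sketch of tail selection for a continuous trait as a way to pick sequencing candidates. The file name, the `ldl` column, and the 5% cutoff are illustrative assumptions, not eMERGE specifics.

```python
# Minimal sketch of an extreme discordant phenotype design: pick the tails of a
# continuous trait distribution as sequencing candidates.
import pandas as pd

def select_extremes(pheno: pd.DataFrame, trait: str, tail: float = 0.05) -> pd.DataFrame:
    """Return subjects in the lower and upper `tail` quantiles of `trait`."""
    lo, hi = pheno[trait].quantile([tail, 1.0 - tail])
    extremes = pheno[(pheno[trait] <= lo) | (pheno[trait] >= hi)].copy()
    extremes["arm"] = (extremes[trait] >= hi).map({True: "high", False: "low"})
    return extremes

# Hypothetical input: one row per subject with a continuous 'ldl' measurement.
subjects = pd.read_csv("subjects.csv")
to_sequence = select_extremes(subjects, trait="ldl", tail=0.05)
print(to_sequence["arm"].value_counts())
```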
Mark Williams here. A thought occurred to me as Debbie was talking, again trying to bridge the tension we have between discovery and implementation, in the context of the rare variants. I think one of the issues we're all going to be dealing with, as we receive secondary findings from the genomes, exomes, and high-density chips we're thinking about returning clinically, is the lack of information we have on the clinical impact of some of these rare variants, even in genes we know quite well. One of the things we'll be doing is trying to use our traditional methods of contextualizing that data, using family history and other sorts of things, to understand the potential impact. To me, that lends itself to the idea that if we had a rare variant focus, we could study how to use electronic health record mining to contextualize rare variant information and add information for clinical return and implementation. So that could be a potential study topic for eMERGE III that would bridge, again, this discovery and implementation chasm.

This is Dan Roden. I have to say Roden now because there are other Dans on the phone. I agree with Mark, but at a practical level, I think you have to make some attempt to limit the minor allele frequencies down to which you're willing to go. If you find a rare variant that is one in a million or one in a hundred thousand, it's going to be very, very tough, unless you know something about the biology, to assign any kind of phenotype to it. So I think the sweet spot for us is probably minor allele frequencies around 0.1%, plus variants in disease genes that have already been implicated. As Zach said a couple of hours ago, variants that have been implicated as causes of hypertrophic cardiomyopathy or channelopathies turn out to be much more common than you would give them credit for when you start to look across very large populations, and we're finding that along the way. So one thought is exactly which rare variants we want to focus on, and I think the variant of uncertain significance at one in ten thousand or one in a thousand is something eMERGE is really, really well suited to attack.

With sequence data, it seems to me that we should report all variants, even if we only see one once in 100,000 people, and put them in a database, because other people are going to be putting that data forward and annotating those variants. Even if we can't determine on our own whether they're pathogenic, it's going to be really important going forward.

I totally agree with that, and that's what we're going to be doing in eMERGE-PGx. As data accumulates worldwide, you can start to make some sense of it, but I think over the next five years, a one-in-a-million variant, unless there's some biology around it, is going to be hard to make sense of. But yes, I totally agree that we have to figure out a way of archiving this worldwide.
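A minimal sketch of the triage Dan describes, assuming a simple variant table: keep variants in roughly the 1-in-10,000 to 1-in-100 frequency window, or in already-implicated disease genes, and archive the rest for shared annotation. The file names, columns, and exact cutoffs are hypothetical.

```python
import pandas as pd

MAF_MIN, MAF_MAX = 1e-4, 1e-2           # roughly 1 in 10,000 to 1 in 100 (assumed window)
disease_genes = set(pd.read_csv("disease_genes.txt", header=None)[0])

variants = pd.read_csv("variants.csv")  # columns: variant_id, gene, maf (hypothetical)
in_sweet_spot = variants["maf"].between(MAF_MIN, MAF_MAX)
in_disease_gene = variants["gene"].isin(disease_genes)

analyze_now = variants[in_sweet_spot | in_disease_gene]
# One-in-a-million singletons go to the shared archive rather than analysis.
archive_only = variants[~(in_sweet_spot | in_disease_gene)]
print(f"analyze now: {len(analyze_now)}, archive for later: {len(archive_only)}")
```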
So this is Hakon here. As you know, the new Illumina X Ten platform is currently tailored to whole genomes, but it's very likely going to be adapted to exomes, even though that will probably take some time; an exome could probably be sequenced for about a hundred dollars, say a year or a year and a half from now. In the interim, a strategy would be to customize a chip with this rare variant content, in particular content with potential or putative damaging impact, loss-of-function variants and so forth. That can be typed now extremely cost-efficiently across thousands of samples for a relatively low amount of money, even though it will cost some money. So in the interim that would potentially be a very, very powerful strategy across all the sites, because it would open up the rare variant content for all the phenotypes that we have, and we don't have that today.

This is John Harley, Cincinnati. I'd just ask: when we concentrate on rare variants and we don't have all of our samples genotyped, we rely on imputation, and as the frequency of the variant drops, the accuracy of the imputation is disastrous. We aren't able to take advantage of our huge numbers because the error introduced by imputation is so big. Does anybody have a solution to this problem?

You need to sequence.

So this is Rex. I'd like to weigh in. I'd really like to endorse the idea of thinking about environmental factors. We've played around a little with GIS tools, and one of the things we could do very well, which isn't done well in most cases, given the longitudinal nature of the people we're following, is to think about some of these environmental factors. I think there are going to be increased opportunities to capture some of these bits of data; Marilyn talked a little about Environmental Protection Agency measurements that are being made. So to be able to start to tackle gene-environment interactions using GIS approaches and some of these environmental measures is also something that would uniquely be possible in an eMERGE III.

Well, this is Chris, and while I find that idea elegant, I want to make sure we're somewhat cautious and thoughtful about this. For some populations, and your Chicago population, Rex, might be a superb candidate for this, it works; but other populations are not always, as we say, population-based, and hence the density of sampled cases in any environmental geocode is low. You run into power problems very quickly with environmental association, particularly when you're treating the exposure as a covariate and a substrate.

Chris, this is Marilyn Ritchie. One of the other things folks could think about, and this is something Marshfield has done in eMERGE 2, is to use the PhenX Toolkit as a mechanism to collect environmental data. We were awarded a supplement as part of the PhenX RISING program by NHGRI, and some of the PhenX Toolkit measures were sent out to the eMERGE participants. We've actually started mining that data, and we're finding really interesting gene-environment results for type 2 diabetes and some for cataracts, and we only implemented a few of the PhenX Toolkit measures. That's something other sites could do, either electronically or on paper forms. You could port it to an iPad that people could use in clinics, or put it on the web so people could do it through their My Health portal at Geisinger or Vanderbilt or what have you. That's another way that, even without relying on population-based environmental data, you could collect it on the participants in the biobanks.

I certainly agree that would be hugely more efficient and wouldn't suffer the broad association problem you have with geocoding. I actually think the PhenX Toolkit would be the appropriate choice for collecting that kind of data, so I agree with that.
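As a sketch of the gene-environment analysis being discussed, assuming a flat table with case status, genotype dosage, and a GIS-linked exposure, the standard approach is a regression with a product term; the power concern Chris raises lands on that interaction coefficient. Variable names and the input file are illustrative.

```python
# Minimal gene-by-environment interaction test: logistic regression of case
# status on genotype, exposure, and their product term.
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical columns: case (0/1), dosage (0-2), exposure (e.g., a GIS-linked
# pollutant level joined to each subject's geocode).
df = pd.read_csv("gxe_input.csv")

# "dosage * exposure" expands to both main effects plus the interaction term.
model = smf.logit("case ~ dosage * exposure", data=df).fit()
print(model.summary())
# The coefficient on dosage:exposure is the GxE effect; as noted above, power
# for this term falls off quickly when exposed cases are sparse in any geocode.
```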
That might be another agenda item to put on the discussion when large health system providers are talking with the vendors of health records, because the patient portal is eventually going to become a mandated part of the electronic medical record, and as they're building them, it would be nice to have patients uploading various lifestyle measures that can be merged with their electronic medical records.

One question is: are there other things the Coordinating Center should be working on in the future? They did an awful lot of work with data cleaning and then the imputation, but are there other things that would make the dataset more effective for other analyses? That would be a good focus for eMERGE III.

So one focus there, this is Hakon, is on copy number variation analysis, because that's another whole dimension. Focusing there from the rare variant standpoint, since most of the data is typed on Illumina arrays, can open up a very fruitful discovery focus across all the phenotypes, again from a data mining standpoint. And we have algorithms that can be applied to these data at the individual sites or jointly, and then the whole thing gets meta-analysed together.

This is Terry. I did want to ask about the issue of sequencing. When we've approached large-scale sequencing centers, the question they often ask is: how many cases of a given disease do you have? They're very interested in looking at thousands or tens of thousands of cases of disease X, and that has not been something eMERGE has really focused on, because we're phenomic, as it were. So how do we address that question, other than to say, gee, we've got so many wonderful phenotypes, isn't this just as good or better? So, Gail, go ahead.

So I think we don't need to have a disease focus. I'd be really excited to sequence the 56 ACMG genes. We know what those genes do, but we don't know what the variants in those genes do. We could look for variant annotations, pathogenic and, importantly, not pathogenic; everybody is going to have sequence variants in those genes, and the question is what they really do. And then we could also look for pleiotropic effects of those same genes. So there's a discovery possibility there, too. And then there are lots of implementation questions. Many in my health system are very concerned about those 56 genes now because of the ACMG recommendations. So how do you implement that? How do you get decision support in place? How do you educate providers? What do patients want to know? I think it really hits all the things that we can do really well, and having those phenotypes in such depth gives us a unique resource for that kind of annotation. And I think even at the pediatric sites there is really important work; 49, I think, out of the 56 have pediatric phenotypes. Plus, the pediatric sites could really look into this idea of mandatory return of adult-onset findings in children, which has been a hugely controversial recommendation. They could really ask their families what they want and ask their providers what they want. I think that is a space where there's a lot of controversy, a lot of interest for the health system, and we have a really unique capability. And I don't think it has to stop at the 56.
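A minimal sketch of the annotation problem Gail outlines, assuming a ClinVar-style lookup table: in the ACMG genes, the flagged pathogenic variants are the easy part, and the bulk of what turns up is unannotated and needs the contextualizing discussed earlier. All file names, columns, and significance labels are hypothetical.

```python
import pandas as pd

acmg_genes = set(pd.read_csv("acmg56_genes.txt", header=None)[0])
annotations = pd.read_csv("clinvar_like.csv")   # columns: variant_id, clinical_significance
variants = pd.read_csv("subject_variants.csv")  # columns: variant_id, gene

merged = variants.merge(annotations, on="variant_id", how="left")
reportable = merged[
    merged["gene"].isin(acmg_genes)
    & merged["clinical_significance"].isin(["Pathogenic", "Likely pathogenic"])
]
# Everything else in an ACMG gene is the hard part: variants with no
# annotation, or benign ones, that still need contextualizing before return.
uncertain = merged[merged["gene"].isin(acmg_genes) & ~merged.index.isin(reportable.index)]
print(len(reportable), "reportable;", len(uncertain), "in ACMG genes but unresolved")
```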
I would add a couple more, by the way. I think that's doable, and I agree with Gail. I think one could add to such a panel, something that can be done uniquely in eMERGE and would be extremely meaningful, things like a list of the highly penetrant monogenic forms of diabetes and others. It would be very helpful to understand, among common complex diseases, which forms are diagnosable on a molecular level and how frequent that is.

So I'd be interested in Steve Leeder's and Debbie Nickerson's comments on that. Let me just ask: both of you had pointed toward non-coding variation, and here we're talking about really focusing on genes, even though there are some non-coding regions included, obviously, in the introns. So Steve or Debbie, any thoughts?

I think it's great to look broadly at genes, but different platforms have different outcomes in terms of what you can look at. Many people are sequencing whole genomes, but they end up looking at only the coding regions and the few percent that are well annotated by ENCODE as being highly functional. But broadly, whole genome is an important route to go, because you can look at variants that are otherwise difficult to see, like indels and CNVs, by sheer coverage.

Thanks. Steve, what do you think?

Well, for me, non-coding really points to regulatory regions as being of interest. Just to explain, I'm coming at this from a pediatric perspective as well, in that when we are looking at things in kids, there is so much change going on between birth and adulthood that you have to look somewhere besides the coding region of a gene for what's changing as kids grow and develop. And to some extent, we know very little about how this really works in senescing adults as we move toward a geriatric population. So for me, the non-coding stuff is really about important regulatory regions and being able to identify and characterize them.

Steve, this is Dick Weinshilboum. In all of our studies of variation in cancer drug response, the majority of the hits that are functionally important regulate transcription; they're in non-coding regions. So this is very relevant. I think it's clear that from an economic standpoint we can't do whole-genome sequencing of 50,000 people, but we could look at a smaller number of genes. And since these genes have been implicated in human disease, they're reportable, they're actionable, and we can look at both exons and introns. If we focused on a subset of genes, it would be a paradigm for looking at the whole genome; it would be scalable. There are the 50-some genes here, and maybe all of us have some other favorite genes. If we had 100 genes, it's kind of catchy: instead of 1000 genomes, we have 100 genes looked at across a large number of people. And again, this would be a paradigm for what we are going to do when we have a large number of whole-genome sequences. This really cuts across all of that.

I would point out that's very reminiscent of the decision early on in the ENCODE project to tackle 1%. I mean, I don't know if I like this idea or not; that's a separate issue. But it is reminiscent: the same rationale went in. We're never going to interpret the whole human genome, so there was a whole process to pick the 1%, which was kind of complicated, but we got there. Everybody studied the 1% until you felt comfortable enough to scale to the whole genome. So this would be a similar circumstance, however many genes you pick.
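Rough arithmetic behind the scalability claim, under an assumed average end-to-end gene span of 100 kb (a stand-in figure, not from the discussion):

```python
# Back-of-the-envelope footprint of a 100-gene end-to-end panel vs the genome.
GENOME_BP = 3.2e9
N_GENES = 100
MEAN_GENE_SPAN_BP = 100_000       # exons + introns + flanks, assumed average

panel_bp = N_GENES * MEAN_GENE_SPAN_BP
print(f"panel footprint: {panel_bp/1e6:.0f} Mb "
      f"= {100 * panel_bp / GENOME_BP:.2f}% of the genome")
# ~10 Mb, about 0.3% of the genome: per-sample sequencing effort drops by
# roughly 300x versus whole genome, which is what makes 50,000 subjects thinkable.
```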
I think we could also exercise diversity here; you'd have a specific set for pediatric care. I think we could do really well on that avenue.

So is the idea to do the non-coding as well, right, of those genes?

Yes, I think so, if you can find it.

Yeah. If you know the regulatory regions, I'm sure it's going to be a good plan.

I was actually going to clarify it further. I mean, you would take a gene and just go end to end, maybe X number of bases upstream and downstream, and do the whole segment, as opposed to known functional non-coding regions, which is what Terry was implying. I'm just trying to stimulate the conversation, maybe toward exome versus targeted panel. So what's the difference there, if you get exome? These are actually asking very different questions. If you're only going to do exomes, you're making the assumption that the coding sequence is where you're going to find things.

I thought the idea was that you have these sets of genes that are of interest, including the non-coding, and you want to get a complete inventory, deeply, in lots of people.

That's what I wanted you to say, Eric.

I don't know what I'm saying; I was just trying to re-articulate what I thought I heard. But what I also heard was a variant approach: you take the X number of genes and you take all the exons.

And the introns. And any regulatory regions you know of, even ones that are elsewhere. I mean, there are some genes that have regulatory regions identified on other chromosomes, so maybe look at those as they become added in.

But does it take two years, Debbie, or more, to develop a targeted platform like this?

No, I think it's much easier now than it was; we have a lot more experience. And I think the PGRN has great data with the PGx sequencing panel; they can look at these questions.

Debbie, what do you think about molecular inversion probes, from the PGx program?

You know, I think it's a matter of cost and ease of implementation. Some can be cheap, but whether they scale accurately to many genes is not known.

If we put this into an RFA, I would suggest that we could be agnostic as to the technique and let the people putting in proposals discuss how they would do it. There are a number of us who are going to be generating large numbers of exomes and genomes, and so that would also allow for a methodologic comparison of what's the best way to actually do it.

Okay, and we're going to need to wrap up the discussion.

Just a very good point: there are these commercial entities that are trying to make panels of two or three thousand genes, so that if you have a patient with Marfan or a patient with hypertrophic cardiomyopathy, you just order that set and then pick and choose what to analyze. Because what's happening is that we're realizing there are many variants that may cause, for example, an aneurysm, and so I end up ordering a panel of 15 candidate genes, which costs something like five thousand dollars, and I may still not get the information, because they may only test certain variants. So that's another option: it's not the whole exome, but what are the thousand or fifteen hundred genes that are most often used in the clinical setting, and perhaps go with those. And it would also go back to Debbie's point that if you're using some of these in the clinical setting, you would have a familiar structure to interpret the variants much more efficiently.
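Returning to the end-to-end targeting idea raised above, here is a minimal sketch: one padded interval per gene, written as a BED file. The pad size, file names, and columns are assumed for illustration; known distal regulatory regions would be appended as extra intervals.

```python
import pandas as pd

PAD_BP = 10_000  # "X number of bases" upstream and downstream - an assumed value

# Hypothetical gene table: gene, chrom, start, end (0-based coordinates).
genes = pd.read_csv("gene_coords.csv")

targets = pd.DataFrame({
    "chrom": genes["chrom"],
    "start": (genes["start"] - PAD_BP).clip(lower=0),  # don't pad past position 0
    "end":   genes["end"] + PAD_BP,
    "name":  genes["gene"],
})
# Standard 4-column BED: tab-separated, no header, one target region per gene.
targets.to_csv("panel_targets.bed", sep="\t", header=False, index=False)
```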
Okay, so I'd like to thank all of the participants in all of our panels this morning for a very rich and thoughtful discussion of future directions for eMERGE. And I guess we're down to about a 20-minute break for lunch.

Careful, because you're not going to get all these people down to the cafeteria and back.

We'll do our best. So we'll plan to start somewhere around half past the hour; the idea is that these folks should get their lunch and get back. Yeah, everybody out there on the call should go run for lunch and bring it back. The cafeteria is on the first floor here; then come back up.