 I could just ask for some comments first, and then we could ask you all to identify what priorities you see in both the epidemiology and genomics fields for future research. So who would like to go first? Go ahead. Well, I'd like to continue a discussion I started with people in the audience earlier. Many geneticists are planning studies that will use opportunistic controls. So basically, you know, given X number of dollars, you could do a case-control study of a certain size, or you could double your case sample and pick controls off a website. And I think epidemiologists have a lot to offer to geneticists planning those kinds of things, and I'd love to see more discussion of that, more written in genetics journals about these kinds of strategies. And I'm just curious, you know, sort of to get a take from the audience on those kinds of approaches and when they're beneficial and when they're just dead wrong. So it may be a little late in the day, but Mike is going to rise to the challenge. Thanks. Well, just a quick comment as one of the few epidemiologists on the AMD paper, the Klein AMD paper, which you just nicely showed. I think part of the reason that was a successful paper was that, well, first of all, the phenotype was highly specified, but the control group was also very carefully defined, taking advantage of the fact, of course, that it all came out of a randomized trial database. So I do think there is a lot to be gained, as it were, by thinking through controls. If you don't have access to these 3,000 common controls, which certainly seem to work in the recent papers, but you're starting a new project, then thinking about defining control groups more carefully, as well as defining phenotypes more carefully, I think, can have a huge impact on power in these studies. Can I just argue against that? 
So the AMD association is so strong that I don't think it was seen at the level it was in that first study entirely because of a very well curated set of cases and appropriately matched controls, particularly not with that sample size. So I just want to propose that, in fact, for many of the phenotypes where we now have hits, the strength of the genetic association is actually going to tell us a lot about the quality of the phenotype, and that the genetic association, if it's seen at the same level as the moving average meta-analysis, may be more informative about the quality of the phenotype than what we used to think was informative, which was, you know, was it measured in a clinic by a specialist, et cetera, et cetera. And so this might be of enormous value in epidemiology in terms of really distinguishing what it takes to phenotype certain outcomes, particularly diseases: whether you really need so-called gold standard, clinic-based, invasive-testing-based phenotypes, or whether what are considered lesser degrees of phenotyping, physician diagnosis or, you know, not having all of the tests, may be appropriate. And for some phenotypes this may show that you really do need that; for others, where there is a kind of hierarchy of phenotypes, it seems to work okay. So in the diabetes world, for instance, if you look across the quality of the phenotypes in the studies so far, you know, most studies are getting most hits, and the case groups have been assembled based on everything from registry-based information all the way to an OGTT administered in a single institution. Yeah, well, because I would argue that one of the reasons the odds ratio was so high in the AMD study was that the phenotype was so well characterized. And I don't think things like doctor's diagnosis are actually a very good way of defining a phenotype or severe disease. 
I mean, severe asthma still is a very heterogeneous mix of different diseases, even though it's severe and doctor-diagnosed. So I think we need to go into the phenotype question from a biological perspective rather than a clinical one, if I could make that distinction. But I don't think, I actually don't think we're disagreeing. No, it doesn't sound like it. And in fact, the idea that we might be able to use genetic associations to, if not refine phenotypes, at least identify the phenotypes that are more genetic-y might be, you know, a nifty thing. And it's kind of a novel way of thinking about them. But you're right. I mean, epidemiology is all about methods. And it seems that, you know, the geneticists have just sort of discovered phenotypes just recently. And it's a challenge, I think, to be able to address. Can I follow up on that? Please, and then we'll have Vasan. So there are two groups here that are phenotyped, right? You have the phenotyping of the cases. And then you have the phenotyping of the controls. And as a geneticist, as somebody who wants to find genes, right, what I want, or in theory what I want potentially, is a very clean control group. So in diabetes, in the FUSION study, we didn't just take everybody who was non-diabetic. We took people who had normal glucose tolerance, who truly had much less risk of developing diabetes. So that, in theory, is going to increase our odds ratios, because we've cut out the people in the middle, right? So it might look like our phenotyping was better when, in fact, we have a different set of controls. It also means that if we go to combine data across studies, we really don't have a good population-based estimate of the risk of getting type 2 diabetes. What we have is a study constructed to get the maximal risk. 
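[Editorial illustration, not part of the spoken discussion.] The point about cutting out "the people in the middle" can be sketched numerically. The following is a minimal sketch with made-up allele frequencies: every number and every name in it (`P_DIABETIC`, `P_IMPAIRED`, `P_NORMAL`, `FRACTION_IMPAIRED`) is a hypothetical assumption chosen for illustration, not a value from FUSION or any other study. It compares the allelic odds ratio obtained against controls restricted to normal glucose tolerance with the one obtained against all non-diabetics.

```python
def odds(p):
    """Convert a frequency (0 < p < 1) to odds."""
    return p / (1.0 - p)


def allelic_odds_ratio(p_cases, p_controls):
    """Allelic odds ratio from risk-allele frequencies in cases and controls."""
    return odds(p_cases) / odds(p_controls)


# Hypothetical risk-allele frequencies (assumptions, not study data).
P_DIABETIC = 0.34   # cases
P_IMPAIRED = 0.32   # non-diabetic but impaired glucose tolerance ("the middle")
P_NORMAL = 0.30     # normal glucose tolerance

# Hypothetical share of non-diabetics who have impaired glucose tolerance.
FRACTION_IMPAIRED = 0.25

# Allele frequency among all non-diabetics is a mixture of the two groups.
p_all_nondiabetic = (FRACTION_IMPAIRED * P_IMPAIRED
                     + (1.0 - FRACTION_IMPAIRED) * P_NORMAL)

or_screened = allelic_odds_ratio(P_DIABETIC, P_NORMAL)
or_unscreened = allelic_odds_ratio(P_DIABETIC, p_all_nondiabetic)

print(f"OR vs. normal-glucose-tolerance controls: {or_screened:.3f}")
print(f"OR vs. all non-diabetic controls:         {or_unscreened:.3f}")
```

Under these assumptions the screened-control odds ratio is the larger of the two even though the cases are identical, which is the sense in which such a design estimates a maximal contrast rather than a population-based risk.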
And I think that's a place, as you had said, Terry, where epidemiology can really contribute, because a lot of the studies that have been designed to be genetic studies have been designed for just that. And there's a very big place for folks who have good population-based studies, and who know how to get a population-based estimate if you don't have a population-based study; that will be very useful in terms of what the effect on disease actually is. Can I? That's confusing two issues, because even in a very well-designed case-control study, you're really not getting proper population-based estimates. Your attributable fractions are always overestimated, because you're actually using odds ratios rather than relative risks. So I think we need to be careful not to mix apples and oranges here. Sure. And one way to answer many of these questions would be a database way, where you could actually look at complement factor H polymorphisms across the whole spectrum of basically a population-based sample and ask the question: if you pulled out, or if you just compared, a cohort and a case-cohort design, would you see the same kinds of associations? Given how strong this is, I bet you probably would. But you might not have picked it up with 100 cases and 50 controls. Vasan. This is probably a futuristic comment largely, because what we have heard today is about associations. And as cardiologists and epidemiologists, we have learned the hard way that the highest level in the hierarchy of epidemiological evidence is a randomized controlled clinical trial. So where this field is headed is that you have to demonstrate, using a randomized controlled clinical trial, that a strategy or management based on using a set of biomarkers or genes is superior either to conventional management or to, you know, we already have what I would call a million-SNP scan. It's a simple question: do you have a family history of cardiovascular disease? 
So you ultimately end up with a situation where you have very many genes with small and modest effects. So it's a great opportunity to learn, to understand the biology, to identify molecular targets. At the end of the day, downstream, you'd have to have evidence from a randomized controlled clinical trial that, if you show an association, a strategy based on treating that association is actually superior to conventional strategy. So can I take the contrary view to that? Please do. So we've discussed this a lot as it relates to coronary artery calcium. And there are clearly strong opinions in both directions. But I would make the contrary argument that if we can show that the evidence is very strong that you can identify people who are at greater risk, in this example for coronary artery disease, we have proven interventions that reduce risk. We all know that statins reduce risk, aspirin in selected people, clearly lifestyle modification. And by identifying people at greater risk we can implement proven strategies. And doing randomized controlled trials, sometimes it's not gonna be feasible, because you may not be able to randomize people for ethical reasons. And the costs are just absolutely prohibitive, as is the amount of time it takes to get the results. By the time you get the results, the technology is gonna be out of date. So I would say that we need to be careful, since we're sort of on the record with what we're saying here, about saying that that's an absolute. It's something that still needs to be figured out, and it's clearly not the case that there's only one answer to that question. I actually agree with that. I think the truth lies somewhere in between. I was just taking a position which was on one end of the spectrum. The reality is we have learned time and again that, for example, with asymptomatic arrhythmias, it's intuitive that they are bad, but intervention is worse. So time and again epidemiology has taught us that our first rule is often: first, do no harm. 
And while we can predict disease that does not necessarily mean the prediction tools end up in better management. One hopes to do so but probably I think it's worthwhile to have an element of caution in terms of all these associations which are being described. Hopefully they'll result in personalized medicine but that has to stand up to the scientific evidence based on randomized trials at least in some ways as feasible. So one thing we haven't really discussed in this forum much is translation and I was gonna bring that up when you called on me if you did. So I think it would be helpful for us to also talk about, now that we're getting this wave of data what we're gonna do with all these data. We talked about whether we're gonna give it back to individual research subjects but what are we gonna do now as a clinician scientist to try to translate this work into clinical practice. We have funding from the Reynolds Foundation and their strong mandate is for us to do research that's gonna be easily translatable in just a few years and we kind of all went whoa, how are we gonna do that? And I think it's our challenge to try to figure out is one to get the data out to the scientific community as rapidly as possible but I think it's also our obligation to get the information in a way that it can help our patients as quickly as possible and maybe we should have another session to discuss how to do that. Now that's a good point. Next question. On the same lines I suppose. In humans, in any organisms as a matter of fact, these are hierarchically organized. There are individuals, there are families, there are populations and then we know very well disease is individual based and then there are in families but unfortunately all the GWAS studies and all the association studies and most epidemiological studies are concerned with population average, taking average and then. 
So now the question is with all the GWAS studies and all the insight that we have obtained so long, how can we glean into say families and individuals? How can we decouple these things from the information that we have obtained so far? And I'm sorry, how can we decouple? Yeah, for individual level. For example, say in medicine, individual is the unit of treatment. Similarly, when we look at from population genetics perspective, individual is the unit of selection, meaning individual is the one who gets the disease and he probably, he or she dies. So from all of this enormous information that's available, how can we boil down to take care of one individual and two families because that is where the disease is, the individual and family because diseases run in families. Any quantitative character that is meaningful runs in families. Right, well I think that was what Wendy was alluding to that the translation of this is going to be very difficult and is probably years worth of discussion and work in itself. I think though what you may be alluding to is the fact that genome-wide association now allows us to look at individuals where it wasn't so easy to do that in the past. Although we were doing it in family-based studies, identifying genes and then looking at those in individuals and unfortunately those didn't hold up quite as much as we might have expected. Maybe Nancy or Laura or Wendy would want to come. I was just, so I think he's more addressing the point that we get these averages across populations and then how do we apply that in individuals for something like personalized medicine. But I would argue that we've been doing that for years. If you think more of the risk factor paradigms from cardiovascular disease, we learn about the risk of high cholesterol in the Framingham Heart Study and translate that to recommendations we make to individuals. And I think doctors have internalized those kinds of models pretty well. 
The population at large understands cholesterol, for example triglycerides, understands pretty well. If LDL's too high or HDL's too low, triglycerides are too high. It's a risk factor. They try to modify lifestyle diet, maybe take medications to lower those. And I think the translation of the genetic risk factors that are identified in these kinds of studies. Remember, so the outlier so far is really TCF7L2 with a 1.4. All the rest of them, we're talking about mostly 1.2 and less. So these are modest risk factors that collectively will improve our ability to identify risk and characterize it for subjects and maybe help classify people who are gonna respond differently to medications or so forth. But I think it's gonna be very much within the paradigms that have already been set up. I think. Yeah, I would agree with Nancy. As she said, the TCF7L2 is really the, looks like the tip top of the. For diabetes. Association in diabetes. 1.4, I mean that. And if you put together the kind of top nine or 10 genes that appeared to be replicated and associated with diabetes and you try to look at your ability to predict people who are going to get diabetes, you can get some stratification. But in fact, if you stick BMI into the model and you stick the genetic risk factors in at the same time, you really haven't changed how well you can predict because the genetic factors that we've identified, at least for diabetes, don't account for very much of the risk. And so I think that the interesting things are going to come in much larger scale studies and starting to look for interactions and starting to look for combinations of genes. And at that point, perhaps you'll get to a place where you can actually use these in prediction. But I think in the state that they're in right now, as marginal effects, they're really not of much use to clinicians. Yeah, I'm talking about what is the amount of variance that is explained by all of these genetic factors. 
Take the best scenario or give me the amount of genetic variance explained by many of these so-called risk factors in any disease. It's not even 10% or 15% or 20%. Oh no, it's not even 2%. No, it's much less. No, when you put all of these things together. No, that's right. So when it is 2%, you know, consider it. That's right. But it may never, at the last be true that any one of them could be quite informative in therapeutic decisions, for example. 2% with 2%. Look at the oversight. Because it may be a different kind of specificity. I mean, I think those clinical trials have yet to be done. And also on the other hand, what is the, when you look at the genetic effects, so you said marginal effects and then interaction, so far, you know, one sort of, you know, genetic effects, is it additive or dominant or epistatic or additive, additive, additive, dominant, additive, epistatic, what is the most overriding genetic effect that you, that so far, people have explained, have discovered. So when we looked at the data in our study, almost all of the effects that we saw were consistent with a multiplicative model. I think there are a couple that looked otherwise. Certainly if you got bigger sample sizes, you might then. Meaning, I would try to. I'm sorry. We need to allow other questioners to explain things. Let's let Laura finish her comments. Were you finished, Laura? Sure. So most of them, when we looked for interact, so the loci themselves looked like they had multiplicative effects, most of them, not all. When we looked for interactions, there were very few interactions that we could detect statistically. So I don't even know whether I would hazard a guess as to what those interactions looked like, if I believe them. Great. Thank you. Betsy. My comment is from an epidemiologic perspective here. So I'm listening and I'm thinking, well, genome-wide associations, we're discovering associations with SNPs. 
So we actually don't know the gene, the gene function, anything like that at all. And soon, I mean, there's gonna be this great wave, and the wave is gonna crash. And then the question is what we're gonna be able to do next. And so I think that what's coming next is really being able to use these genome-wide associations to improve our epidemiologic study designs, to go after candidate genes and environmental interactions, based on what we can do in the future. Everybody's talked about very, very small effects, and for small effects you need huge sample sizes. From an epidemiologic perspective, that means we're pretty far away on the causal pathway from the outcome. That's the way we think. The closer you are on the causal pathway, the stronger the effect is gonna be. So in fact, if we really knew what that gene was and we knew what its function was in an individual, if it was doing what it was supposed to do, the probability in that individual would be close to one. So it would be a very large effect. So it seems to me that the promise of genome-wide associations is sort of giving us the cartography of our genetics, so that we know where the peaks and the valleys are and can better design studies. We won't need enormous studies; we will need well-designed studies that take into account these things that we know. Just like now, you go for high-risk populations because you have a high probability of an outcome. Well, if you knew what that high-risk group was, then you could take them, they would be homogeneous with respect to these groups of genes, and then you could look for the environmental effects, if you can get there. So that's sort of my take, and I'd appreciate the correction if I'm wrong. Well, I might like to comment, and then perhaps David or others would as well. I think we recognize that genes of small effect are sobering, to say the least: how much can they explain on a population basis? 
Let's not forget that we've found many things that are really quite rare that have had tremendous therapeutic implications, one of the best examples being the LDL receptor variants, which were found in a very small proportion of the population. I think you misunderstood me; that's not what I was saying. I was saying that we are detecting genome-wide association SNPs. So in fact, we don't know the function of those individual genes, which means that you basically have a marker for a causal pathway, and if we could get closer to what that really was, then we would have a much stronger effect. Well, there are a couple of reasons you could have a small effect. One is that you're far away from it and there are lots of other things contributing, and another is that it only acts in a small proportion of your population. That's true, but my point was that if we knew that, we could use genome-wide association stuff to enrich our samples to address the second one, or, post-wave, to get closer to what the effects were, what the genes were, so either way we could increase the power to detect an effect if it really is there. Marta, would you like to comment? Yeah, let me comment on that as well. I think I understand what you're saying, maybe I do, and I think it's actually quite similar to a point that Bob Hoover made in his remarks, and that is that really what we're doing with genome-wide association studies is looking for candidate genes. That's the purpose. You just said you pick as opposite a control group as you can find because you wanna find the genes. So the purpose of the genome-wide association study is to find the genes. To measure the contribution of particular variants within particular genes, within particular populations, interacting with particular environmental factors, it takes a different kind of study. 
In fact, it takes a whole series; there's a whole research area there that's in between, and that's really what I meant to say, although I didn't say it very well, by showing that slide from the DOE: gene chips show susceptibility. No, the gene chip doesn't show anything except the gene chip. There's a whole body of research that extends from that finding to something that can be used in public health. So I think that it's a wonderful tool. It's fantastic, as you can see just from those little histograms that describe the natural history of association studies in those two conditions, Crohn's disease and macular degeneration. I mean, these genome-wide association studies have triggered a whole flourishing of additional studies to zero in on the potential pathogenesis, the pathways, the intervening environmental factors, and so forth, and that's really where their value lies. And I wanna get back to Nancy's original comment also, her question, which was: what about the control populations? Well, it depends on the purpose of the study. What is the research objective? If the research objective is to find SNPs, find genes, then you want the groups as far apart as you can get, which was what Laura was saying. If your research objective is to measure the population attributable risk of something, then you need a population-based study. And actually both the Crohn's disease and the macular degeneration examples are interesting not only because they've been replicated numerous times in genetic association studies, but because they appear to have substantial population attributable risk. With macular degeneration, I mean, it's something like, I think in the study that looked at AMD and smoking, it was about 70%. So that's a lot. So I think everybody wants to jump straight from the gene chip right to the bedside. You cannot do that without the necessary intervening research, and a lot of that makes use of epi methods. I have one quick question for Laura. Back to this control issue. 
FUSION was one of the first large-scale studies with the Illumina product. You were gonna have to have controls. So yes, if you have to type controls, you want to get the best that you have. But if you knew that there was a bank of people from Finland, cord bloods from Finland, so largely matched to your set, and you could either do twice as many cases and draw from that free set, or do the number of cases you had with your well-phenotyped controls, what do you think would have been better? I mean, so this is, you know. Right, so this is Epi 101. This is the question of ultimately trying to figure out, from a genetic point of view, how many of the controls in that random set would have become diabetic. And then using that information, I guess my intuition is that if it was twice as large, I probably would have gone for the twice-as-large sample. There's some point at which I wouldn't have, and I don't know exactly where that is, but yes. Maybe in the last 10 minutes, we'll let you ask your question. We may finish about 4:30 or so, and we'll ask each of the speakers, maybe starting with short responses from Marta and then just moving down, what they see as being the most important major step we could take in bringing genome-wide association into epidemiology and really clarifying which associations are real and are important for translation. So while they're thinking about that, your comments, sir. A short question to Jim Ostell. Jim, is there, or will there be, a function to look for a specific SNP, let's say in dbGaP, let's say in Framingham, with several phenotypes, to see whether a SNP is associated not with one phenotype but with some subset of phenotypes? 
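[Editorial illustration, not part of the spoken discussion.] Returning to the control question put to Laura above: a rough way to reason about "twice as many cases with free, unscreened controls" versus "the original cases with well-phenotyped controls" is to compare the expected test statistic under each design. This is a minimal sketch under stated assumptions, not anyone's actual analysis. The allele frequency, the true odds ratio of 1.2, the sample sizes, the size of the free control bank, and the 10% of unscreened controls assumed to be future cases are all hypothetical, and the standard error uses Woolf's large-sample formula on expected allele counts.

```python
import math


def case_freq_from_or(p_noncase, odds_ratio):
    """Risk-allele frequency in cases implied by a non-case frequency and an allelic OR."""
    odds_case = odds_ratio * p_noncase / (1.0 - p_noncase)
    return odds_case / (1.0 + odds_case)


def expected_or_and_z(p_case, n_cases, p_ctrl, n_ctrls):
    """Expected allelic OR and z-score (log OR / Woolf SE) from expected allele counts."""
    a = 2 * n_cases * p_case            # risk alleles in cases
    b = 2 * n_cases * (1.0 - p_case)    # other alleles in cases
    c = 2 * n_ctrls * p_ctrl            # risk alleles in controls
    d = 2 * n_ctrls * (1.0 - p_ctrl)    # other alleles in controls
    log_or = math.log((a * d) / (b * c))
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return math.exp(log_or), log_or / se


# All values below are hypothetical assumptions for illustration.
P_NONCASE = 0.30        # risk-allele frequency in people who never develop disease
TRUE_OR = 1.2           # true allelic odds ratio
N = 1000                # cases affordable alongside screened controls
FREE_CONTROLS = 3000    # size of the free, unscreened control bank
LATENT_FRACTION = 0.10  # unscreened controls who would go on to become cases

p_case = case_freq_from_or(P_NONCASE, TRUE_OR)

# Design A: N cases and N well-phenotyped (screened) controls.
or_a, z_a = expected_or_and_z(p_case, N, P_NONCASE, N)

# Design B: 2N cases and free controls whose allele frequency is a mixture.
p_unscreened = (1.0 - LATENT_FRACTION) * P_NONCASE + LATENT_FRACTION * p_case
or_b, z_b = expected_or_and_z(p_case, 2 * N, p_unscreened, FREE_CONTROLS)

print(f"Design A: expected OR {or_a:.3f}, expected z {z_a:.2f}")
print(f"Design B: expected OR {or_b:.3f}, expected z {z_b:.2f}")
```

With these particular assumptions the unscreened design shows an attenuated odds ratio but a larger expected z-score, which matches the intuition voiced above. Raising `LATENT_FRACTION` or shrinking `FREE_CONTROLS` moves the balance the other way, so the crossover point depends entirely on the inputs.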
And the larger question to the panel is whether this knowledge will allow us to provide epidemiology with some new clever uses of GWAS, such as redefinition of phenotypes, or finding some new associations between phenotypes that we would not expect, or redefining some more biological phenotype. Yes. It depends on how many associations are deposited, because that's obviously the key between the SNP and the phenotype. Do others want to comment on that? Marta, maybe you could make a suggestion as to what you'd see as the most important thing that we could do, be it synthesizing knowledge, meta-analyses, sharing data, whatever, in order to use epidemiology to really bring forward the identification of genes for health and disease. Well, I think I already had my chance, but I'll take it again. And that is that the value of collaboration has been the theme of everyone's comments, but there's a cost to collaboration as well, and that's also been discussed in some depth. And I think that funding agencies, in particular the federal government, which I'm also a representative of, have a unique responsibility, because I think we're the ones who can actually provide the support for activities that often are not reimbursed very well, like, for example, meta-analysis. People don't get grants to do a meta-analysis. But also to build the infrastructure that's needed to support collaborations, because especially when you're talking about a very complex field like this that's really technology-intensive, collaboration does take resources: the money to bring people together, the money to have consensus conferences, and so forth. So that's government's role. Great, thank you. Okay, Laura. 
So I think what epidemiologists bring to the table, perhaps as opposed to at least some of the people who do genetics, is a real interest in health and figuring out what are the risk factors that make people well or sick for particular diseases of which they have a pretty good subject matter knowledge. And so I think where epidemiologists are gonna be able to play a very strong role is in investigating and being just curious about what are the interacting factors, what happens when you have multiple SNPs, how do multiple SNPs taken from all different diseases inflect, affect overall mortality rates? How does all this combined knowledge really play into what people die of? Great, thank you. David. So I think as epidemiologists we sort of started with an interest in health disease and studies. And if we conclude on the basis of the story so far that we have been quite successful utilizing prevalent epidemiologic resources for the major causes of morbidity and mortality, we should also recognize that there's plenty of diseases and phenotypes that are important that actually we don't have adequate size sample sets for. So in cancer, once you get much beyond about the fifth or sixth most incident cancer, the entire world's database, if everybody collaborated and everybody shared openly, there might only be four or 5,000 cases collected and ready to go, which would barely support the sort of analyses that we're talking about. And once you get beyond heart disease into subtypes, once you get beyond rheumatoid arthritis into scleroderma, we actually are very underinvested in putting these sample sets together. So I think we've really gotta show the strengths and limits of the technology and then really circle back to the less common diseases and work out how we can collaboratively address these using these new technologies. We also, I guess, waved hands today at how this works, particularly for people of African ancestry. 
And again, when you look around, what sample sets are available once you get beyond prostate cancer and maybe a couple of the most common cancers, we don't have enough samples. So how are we gonna address that in a meaningful manner to assess both generalizability of results derived from European population, but also discover if there are private polymorphisms in other ethnicities that are important for those other ethnicities? Thank you, Wendy. So I'm thinking about a practical aspect which is funding. So when someone comes to you and says, well, we found this result and we'd like to collaborate with you and see if you see it in your study. Okay, well, I'm gonna write an R01 and that'll take me, as long as it takes me to write an R01, submit it, get it rejected the first time, submit it again, fund it, and that, I mean, I don't have to explain to the audience about that. So that's clearly not feasible. So usually we have some funds we can try to use, but that's really not a very practical way to do science. And so it seems like there should be some mechanism whereby you can apply for grants to have available money to do these replications, validations, when they come up without having to say in the grant what it is exactly you're gonna do. So I'm thinking of that as sort of a maybe practical thing that NIH might be able to help us with. Great, I have to say money too. I was appalled to learn recently that most of the clinical trials being run now, they don't even collect blood samples. And there's a big pushback when, you know, geneticist goes to them and said, oh please, please, please, just collect blood samples. You don't have to extract DNA, you know, we'll take care of everything else, just get the blood samples. There's a cost to that. And they've got no extra money. And if they just collect the blood samples, they'll have to reduce the size of their clinical trial, which they can't do because they have to be powered to obtain a certain level effect size. 
So clearly to do this in any way, there has to be more investment in some sample collections. And I appreciate the pie is not growing, but it ought to given where we are with the, you know, with the genomics infrastructure. If we want to take advantage of all of the information that we've spent all these years investing in, more money has to be put into it, especially for sample collections that are already occurring. I mean, these data are being collected. It's a trivial additional investment to collect the blood samples that will enable all of these downstream things to be done. It's criminal not to do it. I agree. Good job. I have to really support that. I was going to say something else, but that's certainly the most important thing is that if you don't have the sample, just forget the rest of it and we don't need to talk any further. But if you do have the sample and you get the data, I guess the one thing that I would comment on is as a positive side, it's not, there's upsides to sharing data too. And the one thing I was going to say was over the 20 years that I've been at NCBI and sort of taking different data sets from different communities as they have set up resources like the genome resources, even with excellent groups, almost 100% of the time that we transformed the data from the way the PI had it to the way we were going to store it, we found problems with it. And we were able to work sort of quietly with the PI and sort it out. And it's a simple effect of the more eyes, look at something the more different ways, the more things you see. And so it's actually benefited everyone to do that. Great. Yeah, so having more than one set of eyes is really, really very important and I think it's been very useful in many data sets. I think I would obviously agree with everything that's been said here. How we make this happen is sometimes a little more challenging than coming up with them, but we have to come up with the good ideas before. 
We can actually move forward. I think if I would suggest one thing, it might be finding some way of identifying, we heard earlier that there are many, many efforts going on in the sequencing area. And sequencing costs a lot of money still, even though we are moving toward $1,000 to sequence an entire person's genome, but we're not there yet. And yet there are lots of groups that are going into specific areas of the genome, candidate regions, et cetera, sequencing them in various kinds of individuals with such and such disease or not. And probably many groups sequencing the same areas over and over again. And there must be some way that we could identify what those efforts are and annotate somewhere. We could have a database of sequence efforts or something that basically said, yes, this group did 100 people with these characteristics. Because when you look at anyone, almost any one of these genome-wide association studies that many of them will say, oh, and then we went back to the lab and we sequenced these people over this region and we found these 14 additional SNPs. And so to try to keep others from overlapping that and then we might have the money to do the blood collection that Nancy's referring to and the support for infrastructure that others have commented on and the support for young investigators. So that would be my suggestion.