So, okay, suppose by the end of the day we've convinced you that in order to do responsible GWAS studies, you need very large numbers and you need very large numbers of studies, both to have the power to detect real differences and then the replication effort to sort out the false positives from the real positives. Then the dilemma becomes, well, how do you do this, since no one individual and very few institutions could come near to assembling the resources that are needed. The obvious implication is that you need to establish large collaborative consortial networks to accomplish this task. And that's another dilemma, because the studies that you would most like to incorporate are led by people who oftentimes have been your most intense competitors. And so the challenge is how do you convince your own colleagues as well as your competitors of the value of forming these collaborative and consortial arrangements. I'm going to talk briefly about six different ways of doing this. I've ordered them in my order of descending priority, and you can permute many of them, but you can't permute the top one; that's why it's in CAPS. The major reason for forming these consortia, the major reason that I know of that people have consented to form and join them, is that they recognize it's the best and likely the only way of meeting the requirements for doing good science. And I haven't met an investigator yet who isn't interested in doing good science. The rest, the extent of collaboration, leadership roles, resource enhancement, training opportunities and institutional incentives, are all icing on the cake once you make the case for meeting the requirements of doing good science. So what is good science? My bias is up here, but that's doing very large studies. It's doing replication, replication, replication, all planned and coordinated.
It's conducting these with rigorous, high-quality design, conduct and analysis, genomically, epidemiologically, statistically, and in an informatic sense. And this really can only be accomplished through consortia or collaborative means. So how do you convince your colleagues and your competitors that this is true? Fortunately, it's getting a lot easier as time goes on to convince people of this, both because of practical experience and theoretical considerations. In terms of practical experience, those of us who have engaged in doing genetic association studies for candidate genes for the last 15 to 20 years, when we're honest with ourselves, I think we have to rate our performance as dismal. This was a summary done by Joel Hirschhorn several years ago, but it's basically true today as well: for the most part, we have interrogated a very small portion of the human genome with respect to any particular disease entity. Often the so-called hits that have been reported haven't been followed up at all. And for the ones that are followed up responsibly, the validation rate, the replication rate, is pitifully small, ending up with very few valid variants identified through this process and a tsunami of false positives. And this is just for main effects. When you move to the area that everybody's so enthusiastic about, gene-environment interaction, gene-gene interaction, pathway sorts of things, the performance goes from dismal to cataclysmic. If this is what we've been able to do with genes chosen a priori on biologic and other grounds, and with the numbers that we look at, it's sort of predictable that when you move to genome-wide association studies, looking in an agnostic sense at 100,000, 500,000 or a million SNPs, we could be in big trouble if we didn't change the way we conduct business.
This is basically confirmed by one of the earliest studies in Parkinson's disease, a two-tier study with a relatively small tier one of 198,000 SNPs and a smaller tier two with 1,800 of those SNPs brought forward. This investigator found 11 SNPs that were associated with a P less than .01 in both tier one and tier two. About a year later, four different investigative teams investigated those 11 SNPs and found none of them to replicate. So I think the practical experience is very telling. It's backed up by theory as well. As Debbie mentioned this morning, if you're really interested in interactions, you don't need hundreds. You need at least thousands, if not tens of thousands. This is just showing you the kind of sample size requirements to have 80% power to detect a very strong interaction, at least in cancer, a two-fold interaction, depending on the prevalence of the exposure and the sensitivity of your ability to measure that exposure. And it's basically in the thousands just to have that sort of power. Once you set an alpha level to achieve that kind of power, then you're confronted with the other thing that was discussed by Wendy and others, the real tsunami of false positives. I like this particular way of illustrating it, at least for an epidemiologist. I think it's very graphic. This is an example of a variant with a minor allele frequency of 10% and an odds ratio of 1.4, and you'd like to have decent power in a study of 1,200 cases and 1,200 controls, testing 500,000 SNPs. That line shows you that to achieve 75% to 80% power, the corollary is that you're going to be bringing forward somewhere between 20,000 and 30,000 false positives. So if you believe that for any one disease there may be somewhere between 10 and 30 or 40 variants that do have main effects that you want to find, you're confronted with the daunting task of identifying those 10 to 40 out of the 25,010 or 25,040, and that means massive replication.
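To make the arithmetic behind those numbers concrete, here is a minimal sketch of the expected counts. The per-SNP threshold `ALPHA = 0.05` is an assumption: it was not stated in the talk and is chosen only because it gives roughly the 25,000 false positives described. The function names are hypothetical helpers for illustration, not part of any GWAS package.

```python
def expected_false_positives(n_snps, n_true, alpha):
    """Expected count of null SNPs that pass a per-SNP threshold by chance."""
    return (n_snps - n_true) * alpha


def expected_true_positives(n_true, power):
    """Expected count of real variants detected at the stated power."""
    return n_true * power


N_SNPS = 500_000  # SNPs tested, from the talk
POWER = 0.80      # upper end of the 75% to 80% range in the talk
ALPHA = 0.05      # ASSUMPTION: not stated in the talk; picked because it
                  # gives roughly 25,000 false positives

for n_true in (10, 40):
    fp = expected_false_positives(N_SNPS, n_true, ALPHA)
    tp = expected_true_positives(n_true, POWER)
    share_real = tp / (tp + fp)  # fraction of all hits that are real
    print(f"{n_true} real variants: ~{fp:,.0f} false positives, "
          f"~{tp:.0f} true positives, {share_real:.2%} of hits are real")

# Parkinson's example: 1,800 SNPs carried into tier two, threshold P < .01.
# If none of them were truly associated, this many would still pass tier two
# by chance alone.
TIER_TWO_SNPS = 1_800
chance_hits = TIER_TWO_SNPS * 0.01
print(f"Tier-two hits expected by chance alone: ~{chance_hits:.0f}")
```

Under these assumptions, well under 1% of the hits brought forward are real, and the 11 SNPs reported in the Parkinson's study are no more than you would expect from chance in tier two, which is consistent with none of them replicating.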
Massive replication, I think, has been demonstrated to be the only respectable way to do this. I won't go into quality right here. I have a slightly different view than David's, although I think we're about 80% in agreement. But we can come back to that later if you want. Okay, so let's say you've made your case: you've convinced your colleagues and your competitors and others that this is absolutely necessary to do good science. You're 80% to 90% of the way home. The rest of these, I think, are the ones that hopefully would seal the deal, providing that last 10 to 20% of incentive. One is the area of extent of collaboration. Here you have a lot of examples of different ways of doing this. And the fact that there are so many different ways, I think, is the key here: you can appeal to a whole range of investigators, depending on what their particular wishes and desires are. These collaborations can be either ad hoc or ongoing. Perhaps one of the best ad hoc examples I know of is the Breast Cancer Association Consortium, and you saw some data from David on that. This has about 20 studies with about 30,000 cases and 30,000 controls of breast cancer. It was set up on the cheap and on the simple side, first to assess the a priori genes for breast cancer, and this was published this past spring in Nature Genetics. Basically they took the 20 best candidate SNPs from the literature and from other places, vetted them in the consortium, and you see basically two survived, which is another commentary on our ability to pick good candidates a priori. This same group was used to vet the first phase of the GWAS study done by the Cambridge group. David talked about that, the one that discovered five novel areas that have everybody quite excited at this point. The other extreme in level of commitment is the ongoing collaborations. A good example of this is the Cohort Consortium, established by NCI about eight years ago now.
It includes the major cohorts that are focused on cancer and has assembled them to do a lot of things together. The idea is that it's not just for a one-time study, but to encourage interactive relationships between these on a number of cases. There are three major genomic projects being done by the consortium. This is a candidate gene approach to breast and prostate cancer. This is a GWAS study of breast and prostate cancer. This is a GWAS study of pancreatic cancer. David also showed you some of the results of discovery and replication in prostate cancer, so I won't go over this. Another aspect of extent of collaboration is whether it's limited or deeper. Limited means you can just do 30 SNPs to check the 30 best candidates from the GWAS study, publish them with the consortium, and then take your ball and go home. Or, if you want, you can talk about a more meaningful or deeper level of collaboration. You obviously want to follow up the hits with fine mapping, as has been mentioned many times. If you're an epidemiologist, you certainly want to get into gene-environment interactions and candidate pathways, talked about this morning as well. Clearly, for any of these, the need for replication is even more important than it is for the main effects. You want to keep the collaboration together. It also provides way more opportunity to share leadership in all sorts of ways of discovery and validation. And obviously there are other phenotypes and outcomes: everybody who's doing GWAS studies now wants to take advantage of the fact that they can do GWAS studies of obesity in their control group, or serum levels of hormones, or you name it. Again there's the need to replicate, and if you have the same consortium pulled together, you have all sorts of opportunities to expand the collaboration. And here I'm talking mainly about more epidemiologic or population sorts of collaborations.
Most of you live in institutions that have good laboratory scientists, many of whom work in the genomic area, many of whom would be thrilled to follow up on these findings. And that's another aspect that can be woven into these collaborative ventures, because, as you know, the initial scan and replication is not the end of the story, it's actually the start of the story. You need sequencing, genotyping, functional studies, translational studies. A good example of this is the Q-arm of chromosome 8, 8q24. Several locations in this relatively small chunk of DNA have shown up as being very important in prostate cancer in three different GWAS investigations. It has also shown up recently as an important area in colon cancer and breast cancer as well. This is, as was discussed this morning, a gene desert. There are no genes in here; the MYC gene is down here somewhere. So there's all sorts of enthusiasm among the biologists to figure out what's going on. In fact, each one of the three consortia that have done the discovery work is now engaged in some aspect or another of the biologic follow-up of these observations. So it's another opportunity to expand the collaborative venture. Finally, I've talked only about genetics. Once you put these collaborations together, there are other things that can be done. You can go back to standard old observational epidemiology. This is the InterLymph Consortium, which was developed mainly to provide an opportunity for replication of genetic findings, but it didn't take long for them to realize that they could look at things like alcohol with the standard sort of pooled analysis approach, and indeed they found a protective effect that was primarily in the diffuse form. There's actually no one study in the world that could have looked at this by subtype, and it's an exciting finding that needs biologic follow-up as well. So now we're leaving extent of collaboration and moving into leadership roles.
There's opportunity for virtually everybody participating in these collaborative ventures to have a leadership role. It's not simply, well, send me your DNA, or I'll send you my reagents and you send me the answers, and then I'll send you the manuscript when it's done. And there are various forms. There's a shared version, with these large groups that publish together and where there is a significant contribution by everybody. There's the complementary version, where you have statisticians, genomicists, epidemiologists and informaticians leading their own areas and contributing to the total. There's the carve-out-a-portion approach, where you take a subject matter slice out and give it to one of these interdisciplinary groups to focus on. And this leadership involves both study conduct and analysis and publication. This is the breast and prostate cancer consortium that David Hunter talked about. Its steering committee is an example of shared leadership, where every cohort has an equal voice and the population geneticists, the genotypers, the statisticians, the informaticians and others all lead practical aspects of the project. When you get to the area of analysis and publication, again, I think some of that's been discussed already. Just as an example, in the last two and a half years the InterLymph Consortium has published 17 articles with 13 different first authors and 14 different last authors. So there's plenty of opportunity to share the wealth. Beyond leadership, there's resource enhancement, that is, adding to the infrastructure of the contributing studies. You have to do DNA extraction to participate. If you do DNA extraction, it's there for you to use for your own purposes in perpetuity. You may need to get additional specimens or data, or assay for additional biomarkers, to be part of the consortium, and that usually can be grafted on as part of the cost of the consortium.
That enhances the value of your own study for many things outside of the context of what you do in the consortium. And of course there's additional genotyping. There wasn't too much talk today about the practical way that these things are done, but it's mainly done like this: you know how many genotypes are provided on your favorite platform, and you can decide to set aside a piece of that for things that aren't strictly coming from the GWAS study but that your participating cohorts would like to see genotyped on their specimens. I didn't put in a slide for training here; again, that was talked about today. There's a lot of concern and fear about the role for junior investigators. As it turns out, there is actually a whole lot of opportunity. Again, taking the example of the BPC3, in the last two years they've had nine publications, and the lead author in seven of those nine has been at the postdoc or assistant professor level. This is the last part, the institutional incentives. Investigators' institutions need to be apprised of the value of team science in the areas of promotion and tenure, talked about this morning as well, and of providing infrastructure. Actually, it's interesting to me that the physicists, who are not noted for their population orientation, figured this out a long time ago. They have consortia that have 5,000 people on their authorship lines and seem to be doing just fine in terms of promotion and tenure. As for funding institutions, it would be nice if they would catch up by providing support for consortia, the meetings, the conference calls, that sort of thing, by targeting funding opportunities, and in particular by giving people credit for participating in these consortial efforts, allowing that to contribute to the evaluation of the continuing funding for the base study.
And they're coming around, those institutions are coming around a little slower than the rest of us, but they are coming around.