Great. So this is the concept clearance for the renewal of the Population Architecture Using Genomics and Epidemiology, or PAGE, program. Back in 2007, NHGRI identified the opportunity to further study variants identified from genome-wide association studies, or GWAS, in existing well-phenotyped population-based studies. The idea was that such a program could complement existing GWAS. For example, where most GWAS participants were of European descent, PAGE could add a component of broad ethnic diversity. Where most GWAS at the time were of case-control and clinic-based design, PAGE could contribute population-based estimates and allele frequencies. And where most GWAS were focused on single phenotypes, PAGE could capitalize on the extensive and high-quality phenotype data already available in existing cohorts. Because the focus was on characterizing selected variants in large numbers of participants, the scope of genotyping was envisioned as approximately 100 variants and 10,000 participants for each of four awards. The scientific goals of phase one of PAGE were as follows. First, to characterize the population-based profile of GWAS-defined variants; we refer to this as the epidemiologic architecture, or the distribution of allele frequencies and effect sizes across different populations. A second goal was to go beyond simple SNP-disease associations and look at gene-environment interactions. A third goal was to take advantage of the full range of phenotypes to potentially derive new biological insights. And a fourth goal was to widely disseminate the aggregate and allele frequency data to the scientific community. So four study investigators and one coordinating center were funded.
The study investigators included CALiCo, a consortium of cardiovascular cohorts; EAGLE, which comprises the NHANES populations and the BioVU data repository at Vanderbilt University; the Women's Health Initiative, which is a large randomized controlled trial and observational study of post-menopausal women; and the Multiethnic Cohort, a large study of diet and cancer. Over the first two years, PAGE genotyped about 120,000 individuals, almost half of whom are of European descent, with the remainder distributed across Hispanic/Latino, African American, Asian/Pacific Islander, and American Indian individuals. During the first two years, it was becoming evident in the PAGE data that GWAS SNPs don't always generalize to non-European populations, and that understanding the exceptions is important. In addition, given population-specific LD, or linkage disequilibrium, more than one or a few SNPs are needed to interrogate these regions well. At the same time, a group was working to develop a customized large-scale genotyping chip that would address precisely this problem. The Metabochip, as it was called, included fine-mapping SNPs from the 1000 Genomes Project for loci related to metabolic traits and cardiovascular risk factors. Using this chip would effectively enable PAGE to take population-specific LD into account and potentially identify independent or secondary signals underlying the original GWAS locus. PAGE pursued a pilot study of the Metabochip in approximately 6,000 African Americans, mostly using ARRA funding. Data from the Metabochip pilot, which I'll show you in a moment, suggested that the population-specific fine-mapping approach would indeed be informative in more comprehensively assessing whether and why these GWAS-identified regions might not generalize to non-European populations. In conjunction with our External Scientific Panel, or ESP, PAGE decided to refocus genotyping during years three and four exclusively on Metabochip genotyping.
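To make the population-specific LD point concrete, here is a toy simulation, not PAGE data: the populations, allele frequencies, and recombination rates below are invented purely for illustration. It shows how a single tag SNP can capture a nearby variant well in one population but poorly in another, which is why one or a few SNPs per region are not enough.

```python
import numpy as np

def r_squared(g1, g2):
    """Squared genotype correlation between two SNPs, a standard LD measure."""
    c = np.corrcoef(g1, g2)[0, 1]
    return c * c

rng = np.random.default_rng(1)
n = 10_000

snp_a = rng.binomial(2, 0.3, n)          # genotypes at the tag SNP (0/1/2 copies)

# Population 1: the two SNPs almost always travel together (tight LD).
snp_b_pop1 = snp_a.copy()
flip = rng.random(n) < 0.02              # rare historical recombination
snp_b_pop1[flip] = rng.binomial(2, 0.3, flip.sum())

# Population 2: far more historical recombination, so the pairing is weak.
snp_b_pop2 = snp_a.copy()
flip = rng.random(n) < 0.6
snp_b_pop2[flip] = rng.binomial(2, 0.3, flip.sum())

ld_pop1 = r_squared(snp_a, snp_b_pop1)   # high: the tag SNP works here
ld_pop2 = r_squared(snp_a, snp_b_pop2)   # much lower: one tag SNP is not enough
```

In the second population the tag SNP explains only a small fraction of the variation at the second site, so an association carried by that site can be missed entirely unless the region is genotyped densely.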
As you can tell from the graph, all of the participants with Metabochip genotyping are of non-European descent, playing to PAGE's key strength. Nearly half are African American, with about 30 percent of Hispanic/Latino descent and the remainder of Asian and Pacific Islander descent. The cost efficiency of the Metabochip enabled PAGE to increase its genotyping capacity by orders of magnitude, from hundreds to hundreds of thousands of SNPs. As the time to plan for a potential renewal grew closer, we also sought advice from our ESP. They recommended that PAGE continue to focus on fine-mapping via the Metabochip to interrogate regions more comprehensively in African Americans. Additionally, they recommended a year-long extension to focus on productivity and the value-added contributions of the PAGE consortium. Finally, they advised NHGRI to anticipate opportunities to incorporate sequencing into a PAGE-based renewal when the timing and the cost became favorable. At Council last February, Council recommended that we extend PAGE for one year, providing some continuity while the feasibility of a sequencing-based renewal was assessed. The extension-year plans will be discussed in a closed session, and of course we're discussing the sequencing-based renewal at the moment. I wanted to note that when Council heard about PAGE's progress last year, three PAGE papers had been published and seven had been submitted. As an update, there are currently 18 published papers, 10 more submitted, and 51 in various stages of completion, so there really has been a sharp increase in productivity, which we hope will carry over into the extension period. The Metabochip pilot data contributed substantially to PAGE's understanding of disease-associated loci, and let me show you an example. Here are Metabochip data at the PCSK9 locus in relation to HDL cholesterol level.
There are six known non-synonymous SNPs present in this region, four of which are associated with HDL in PAGE African Americans at P less than 10 to the minus 5. This isn't shown here, but if you condition on the top SNP, the other three remain significant as well, underscoring the value of having dense genotyping information that tells you much more than a single SNP could. Interestingly, if you look in non-African-American populations, you see that five of the six variants are private to African Americans, with only one being seen in European Americans. Furthermore, all of these variants, with the exception of one, are rare, underscoring that they would not have been seen in GWAS of European-ancestry populations. So in addition to dense genotyping data, PAGE is also capitalizing on having dense phenotyping data. This slide shows the phenome-wide association study, or PheWAS, approach that PAGE and other groups have taken, where genotypes are analyzed in relation to a multitude of phenotypes to discover potentially novel relationships. So these are Metabochip pilot data in relation to approximately 230 phenotypes. Each dot, and there are roughly 3,000 of them on this diagram, represents a SNP associated with two or more different phenotype categories. These data obviously need to be replicated and explored further, but this gives you a snapshot of how using the full breadth of phenotype data could potentially identify pleiotropy or uncover new biological relationships. Also, I mentioned that a key goal of PAGE is not just to synthesize this large amount of association data, but to disseminate it as well. Recognizing this, the PAGE investigators have done two things.
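The conditional analysis described above can be sketched in a few lines. This is an illustrative simulation, not the PCSK9 data: the genotypes, allele frequencies, and effect sizes are invented, and the point is only to show that a secondary signal can stay significant after adjusting for the lead SNP.

```python
from math import erfc, sqrt
import numpy as np

def ols_pvalue(y, cols, test_col):
    """Two-sided p-value for one coefficient in an OLS fit.

    Uses a normal approximation for the test statistic, which is fine
    at these sample sizes. Column 0 is the intercept.
    """
    X = np.column_stack([np.ones(len(y))] + [np.asarray(c, float) for c in cols])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = resid @ resid / (len(y) - X.shape[1])
    cov = sigma2 * np.linalg.inv(X.T @ X)
    z = beta[test_col] / np.sqrt(cov[test_col, test_col])
    return erfc(abs(z) / sqrt(2))

rng = np.random.default_rng(0)
n = 5_000
top_snp = rng.binomial(2, 0.2, n)       # genotype at the lead SNP
second_snp = rng.binomial(2, 0.1, n)    # an independent secondary signal
hdl = 0.3 * top_snp + 0.25 * second_snp + rng.normal(size=n)

# Marginal test of the secondary SNP, ignoring the lead SNP.
p_marginal = ols_pvalue(hdl, [second_snp], test_col=1)
# Conditional test: does the secondary SNP survive adjustment for the lead SNP?
p_conditional = ols_pvalue(hdl, [second_snp, top_snp], test_col=1)
```

Because the secondary SNP here carries its own effect, `p_conditional` stays far below genome-wide suggestive thresholds even after conditioning, which is the pattern described for the PCSK9 region.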
The first is to work closely and extensively with dbGaP to adapt their existing interface, which is very well suited for GWAS, to add the additional layers of multiple populations, multiple phenotypes, and effect sizes, and actually adapt the way that the data are displayed. So they've done that for us. The second is that the investigators have shortened the publication embargo on this exploratory category of association data from 12 to 6 months. So over the project period, PAGE has made strides in several areas that are relevant to a renewal. One of PAGE's key strengths is its large numbers of diverse, well-phenotyped populations. This points to an opportunity to continue to leverage these data sets and push association mapping to its limits. The availability of extensive fine-mapping data suggests an opportunity to simultaneously detect and refine signals, as well as characterize associations in different populations. The availability of comprehensive association data suggests the opportunity to broaden analyses to gene-by-gene and gene-by-environment analyses, as well as those related to pleiotropy. And then finally, as we move closer to sequencing, there will be a number of bioethical and psychosocial issues that will be important to consider, including those related to potential return of results in minority populations. A renewal also would provide an anchor in the biology-of-disease area of the strategic plan, where PAGE is currently NHGRI's largest and most diverse population-based study. So as our ESP and Council anticipated, the state of the science certainly has changed a lot since 2007. For example, many of you will of course know that disparities in discovery studies of non-European populations continue to persist, as do the health disparities in disease burdens in these populations.
The more we learn from GWAS and other studies, the more we realize that common genetic variation really does explain a limited proportion of the heritability of complex disease. This seems to be true whether only a few or a couple dozen common variants have been identified. And increasingly, data are accumulating that rare variation is not only likely to be functionally important, but population-specific as well. So all of these observations suggest that having a resource of phenotyped and sequenced non-European populations would be broadly useful for the scientific community. As you read in the concept clearance, the proposed renewal would conduct whole genome sequencing in approximately 2,000 individuals of non-European descent. The scientific goals would be, first, to identify disease-associated regions, taking into account population-specific linkage disequilibrium. Second, to build a comprehensive population resource of non-European individuals, and by comprehensive, we're talking about publicly available sequence, phenotype, and association-level data. And third, to explore the associations of sequence variation with the broad range of PAGE phenotypes that are available. So assuming the funded grantees have access to study samples that are at least as large, diverse, and well phenotyped as the existing ones, this gives you a rough idea of the sample sizes of potentially sequenceable participants that might be available. As you can see here, the numbers are well into the tens of thousands, even approaching 100,000, and they're fairly well distributed across multiple ethnic groups. Given these larger sample sizes, a key activity of phase two of PAGE will be to work cooperatively across the various sites to coordinate selection of participants, prioritizing those of the highest value for analysis and data dissemination.
Other key activities will be to make individual-level sequence and phenotype data publicly available; to describe the full set of study participants and associated phenotypes potentially available, in the event that additional funds or sequencing capacity might be made available; and finally, to propose methodologic approaches to leverage the broad range of phenotype data. Some additional considerations: our ESP recognized the importance of having a coordinating center that was facile with respect to genetic analyses, and they encouraged us in phase one to use the Coordinating Center more, and we've done that. In phase two, we expect the Coordinating Center to continue to facilitate the centralization of analyses, as well as do some analyses themselves, to continue other kinds of scientific coordination that are responsive to the consortium's needs, and to provide logistical support. Because the sample sizes of PAGE phase two will depend so heavily on whole genome sequencing costs, we propose that NHGRI decide at the time of funding whether sites should be responsible for the sequencing costs or whether the sequencing will be done centrally. And of course, there is a lot of experience within NHGRI, already and to come, to learn about sequencing, and we expect to take advantage of those efforts, including the Clinical Sequencing Exploratory Research program and the Return of Results Consortium. So there will be two RFAs proposed, using the U01 cooperative agreement mechanism: one for study investigators and one for the Coordinating Center. The set-aside is $5 million per year for four years. And with that, I'd like to acknowledge the host of people that have made this work and this discussion today possible, and I'm happy to take any questions. Yes, Pearl. Hi, thanks so much. Is there any activity, or planned activity, for getting more environmental and demographic data in addition to the phenotypes?
So phase one of PAGE did not include funds for new data collection, and we don't anticipate that that will happen in PAGE phase two. But I can tell you that these are existing longitudinal cohorts, and I guess it depends a little bit on what you mean by environment, but we're already looking at, for example, modifiers of SNP-disease associations by gender and age. They're talking about doing physical activity, diet, and exercise. So there is quite a range of environmental factors; there's diet also, which we are taking advantage of. I guess I was thinking more, you know, poverty levels, exposure levels, the whole gene-environment question in particular, as you're now going into various cohorts. I think that's a good point. We haven't done the harmonization exercise specifically, but I bet a lot of these cohorts do have that information. It may just be that the expertise has been more focused on the more typical cardiovascular and cancer risk factors, but that's certainly a direction we can encourage them to go in. Yes. I wonder if you could comment a little bit more on sample size. Yes, it's actually 2,000. Right. So for the proposed 2,000, we thought initially that it would make sense to look at two different ancestral groups, potentially 1,000 African Americans and 1,000 Hispanic/Latinos, although we want to take advantage of the best opportunities. PAGE obviously does have tens of thousands of participants, and I think that with the emphasis on whole genome sequencing, which we really felt the need for, you can imagine how the costs would scale. So we thought that perhaps starting at a smaller number made sense; 2,000 is today's number, but we could certainly expand on that, especially if there are particularly interesting results coming out of the program. Beyond that, I don't know what more to say, other than that this was folded into the strategic plan exercise.
And this is somewhat new, for population-based studies to be going into sequencing, and we thought that this smaller model, with the option to build on it later, might be a good place to start. But certainly the sample sizes would be there, and the phenotype information would be there, if additional sequencing capacity or funds are made available. I think I saw Carlos' hand next. So it seems to me that there would be a lot to gain by coordinating with NHLBI on this. We were part of the ESP sequencing project, which sequenced 7,000 exomes; I think 2,000 of them were WHI, so many of these participants may already have exome sequencing available, and then there's an exome chip that was produced that was going to get genotyped in a bunch of these samples. So, echoing David's comments, one of the things that I think we learned in the ESP project is that even if you sequence 7,000 exomes, that may not be enough for many of the phenotypes that you want to go after. And so it's unclear, at least in the minds of many of us, whether the next phase should be to genotype many of those variants that you saw at least three times in the entire set, or to do further sequencing to find regulatory variants that would be worth genotyping. If you were to ask me how best to invest the funding, as much as I'd love to see 2,000 African American and Hispanic/Latino genomes sequenced with associated phenotypes, I do think that you have to think really hard, given what you've learned here, about what the next logical design is, whether there are other cohorts that weren't part of the original PAGE that would make sense to also think about, and how to structure it so that we avoid some of the complications that folks in ESP have had.
I think that's a good point. Actually, I should have mentioned this in response to David's question, but we are trying to look for partnerships with other institutes. We've encouraged the other institutes to attend our Steering Committee meetings, and we'll be talking with them about the PAGE renewal. There has been interest, but that's different from commitment; we'll obviously continue to pursue that. It is something that we considered, whether to do exome sequencing or the exome chip. We did a quick inventory among just the PAGE sites to see how many samples had already been exome sequenced or exome chipped. I think the estimate was that about 4,000 had been exome sequenced, primarily through NHLBI, and perhaps a few thousand more had been exome chipped. So that is something we considered. The advantages of... I'm sorry. I mean, at least both Charles and Kari are part of ESP as well, so there's some set of investigators here that make sense. Yeah, there's definitely a critical mass. In fact, they've been very active in thinking about what the added benefits of PAGE would be in a renewal, so that's something we will certainly consider. Mike? Yeah, I'd just echo what's already been said. I think the added value of 2,000 whole genome sequences, in the context of all the whole genome sequencing that is being done and will be done, is very, very modest. I could really see taking the current exome chip, taking advantage of all the sequencing that's done over the next period of time in these different racial and ethnic groups, and constructing another exome chip, or even the current exome chip with added content. You can add, I don't know what the number is, I think 250,000 SNPs, and still have a product that is incredibly cost-effective, so that you could stretch dollars an awful lot further that way. That's the advantage of the sequencing that's going on elsewhere: doing genotyping. I don't think 2,000 is going to make a difference, honestly.
If you wanted to tell me you thought you should do 20,000, I'd probably be more interested in that. I don't know where the money would come from, but I don't see what 2,000 actually does. Okay, David? Yeah. Even as we talk about the issues of the numbers, it makes it even more important that we get the racial categories right. I saw you had a slide with the Asian Pacific Islander category; that includes more than half of the world's population, with many diverse cultures, societies, backgrounds, and gene frequencies. Even the Hispanic category: there is enormous diversity in what falls into the Hispanic category, so if you're going to do 2,000 Hispanics, then pick 2,000 in one group of Hispanics so that you get some homogeneity. Even in the African American category, a Haitian, a Nigerian, a Black person born in Mississippi, and a Black person born in California will differ in culture, biology, history, a range of things, but they all fall under the African American category. So even with small numbers, we need to be very strategic about the racial categories that we use and how we use them. Okay, thanks. Good point. Rick? Just echoing what Carlos, Dave, and Mike said, but a little bit more: I agree that it would be so incremental to sequence whole exomes or whole genomes on that number of samples, but I think the other reason is to ask yourself, why do it now? You can take that money and quadruple it, or do a whole lot more, in the ways that people were just suggesting, with genotyping and making the chips using existing information, and then sequence later.
I mean, I can't believe I'm saying wait, because I'm always in a hurry, but this is a case where you're just not going to get very much. It sounds a little bit like, not jumping on a bandwagon, but everybody knows you ought to be sequencing. Right now, though, it's the phenotypes that matter, and these are such diverse phenotypes that you're not going to get anything; you're spreading 2,000 people among huge numbers of phenotypes, not to mention what David just said. So I think it's an exciting project, a neat project, and I'm glad it's being done; I think it's being done well. But the thing to do is to make that number larger, to take into account the two or three points that people have made. Right, so I'd be curious to have some input here, because we're proposing a four-year project period. At some point in there, we're going to have to evaluate: is exome genotyping continuing to give us all of the information we need, or should we think about transitioning into sequencing? Does Council have any advice as to what we should consider in terms of making it flexible? Okay. Yeah. There's so much to do; Carlos, you were probably adding to that. There could be four full years of doing that kind of work. You know, use the money that you would use on genome sequencing to make your cohorts ten times bigger, or some larger number, I don't know, and deal with the issue. And I want to make the point that this is just so important to do, and to do right, that thinking about the design is really critical, in particular because there is tremendous GWAS fatigue in all the study sections. This is specifically why you need an RFA: if this goes through study section, they're just not going to get it. They don't want to put any more money into GWAS. They just don't think it's that exciting.
But in particular, for understudied populations, it's just so critical to do. And that's why, if I'm a reviewer at a study section and somebody says, I want to exome chip 20,000 African Americans, I'm like, that's not exciting. But I actually think, scientifically, it's super important to do. So I think that's one of the reasons why getting this RFA right is really important: not saying we're going to do 2,000 genomes, but rather asking what set of study designs could come forth that would make the most important advances on what PAGE has already done. And reviewers need to understand this. Yeah. Okay. And just to be clear, the factor on sample size we're talking about here is on the order of 100. It's a huge factor; it's not 10. Okay, thanks. Rex? So given this discussion, though, I wonder: one of the things I think is unique or different about PAGE is the extent of phenotype data associated with each of the individual samples. Is there a strong sense that there would be much utility in having multiple phenotypes from a single sample? One of the things we've seen is that samples with rich phenotype data behind them get used over and over again for different phenotypes. Does the value of having sequence on those richly phenotyped samples counteract at all some of the concerns about the sample size, because of the reutilization of the phenotypes? Well, I think we certainly do plan to reuse. It's happening in phase one already that the same samples are getting used for multiple phenotypes, and that was the intention. So I think that's a good point. Do you want to add something to it?
No, I think perhaps we're hearing Council's advice that what we really need to do is shift away from sequencing for the moment, not forever, and go with an exome chip. But not exome sequencing is what I heard. Okay, so an exome chip in these populations. I think we can rely on these investigators, if some of them come in for this, and this will be a competitive process, so we might get other cohorts as well. They have been very flexible in terms of pursuing the most up-to-date and cost-effective approaches for genotyping; they shifted from the original design of 100 SNPs to do the Metabochip. And so, with that, would you be satisfied? The one caution that I would interject is to remember that the cost of sequencing is plummeting, and within the timeframe of this RFA, I suspect we'll see at least an order of magnitude reduction in sequencing costs, probably more. So, I mean, I agree with everything that's been said; you can get a lot more bang for your buck in the short term. But I wouldn't necessarily want to see the sequencing approach abandoned wholesale, because it is going to get far cheaper. Right. I think that's an excellent point. Look what happened in the last four years. So, could I just ask then: if we're talking about doing what Mike wants to do, at 100-fold the sample size, would it be the Metabochip? The exome chip. The exome chip. No, what I'd suggest considering, and obviously this is just off the top of my head, is seriously considering the exome chip, but the exome chip with added material. Oh, that's right. I totally agree with that. The sequencing has been done. Yeah. Because what went into the exome chip was about 13,000 individuals. And Kyle, do you remember? There were 3,000 or so African Americans that were part of ESP, and then there were some Hispanic/Latinos that were part of some of the Broad sequencing.
But I think what you'd want to do is say something like: genotype all exomic variants that are known to be above 1 percent frequency in the population, or above 0.1 percent frequency in these populations. And when these chips get designed for very, very large numbers of individuals, they're really cheap, right? Right. So the exome chip: the original price was $39; it's now, I think, $50. And if you want to add 250,000 SNPs, I don't remember exactly, but I think it's still double digits as opposed to triple. And given the kinds of numbers you could be talking about, I think you'd make a really good deal with Illumina on this. But I really would wait long enough, and you've got time to wait long enough, to make sure that enough sequencing on other groups happens and can be included in that catalog of variants for the chip, so that you have sequenced Africans, sequenced East Asians or Hispanics or whatever it is you're wanting to focus on. And that is happening. So I'm just now thinking about what Jim said. If the cost of sequencing does drop, whether exome or whole genome, even tenfold, it's still going to be way, way more than genotyping during these four years. So the idea of sequencing 100,000 people, something like that, you said 100-fold? No, maybe more, right? So maybe the way to at least be ready: it's not going to do any good to sequence 1,000 or 2,000 people, that's what I think the message is. So do the bigger study with the genotyping, making it as cheap as possible, and the quality is really high on these chips, so you can get really good data, and then have maybe the later parts of the RFA do some sequencing to start getting ready, not to actually try to do it on 100,000. And I don't think you have to restrict the chip to just exomic variants, right?
There could be things found by the 1000 Genomes Project that are in the loss-of-function category, or whatever, but I think the point is that the biggest challenge in terms of linking genotype to phenotype in this kind of setting is what to do with all the singletons, the one-offs you've seen only once; that's always been the really tough thing, particularly as seen in ESP. So if you have 100,000 people, and you genotype the variants that you've seen more than once, things that you know are polymorphisms, things seen at least three times, which is what's on these arrays, you have a much more powerful epidemiological tool to estimate odds ratios and to ask how often these things are shared across populations. You really drive it down to the point where this question of differences across populations can be tackled, and what you're basically leaving out are the really, really rare things that may be private to families or to individual groups, whereas you're capturing the rest of the variation that is important. You're going to get no purchase on sequencing 2,000 people anyway. So to me this also builds very naturally on what has been PAGE's huge success. I talk about PAGE on a fairly regular basis, even though I'm not involved in it anymore; I was a little bit at the start. It's exactly what we should be doing to take things to the next step. Once we've identified these variants in genetic discovery sets, we need to see what's going on in much larger, much better phenotyped groups, and by doing this kind of approach you're building on that past success in the same way. Great, thanks. That's a good point. Can I just make a comment, just to think a little bit about the future?
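The "at least three times" rule discussed above can be sketched as a simple filter over sequence-derived allele counts. This is only an illustration of the selection logic; the variant names, populations, and counts below are invented, and a real chip design would draw on much larger catalogs.

```python
# Sketch: choosing array content from sequence-derived allele counts.
# Keep variants observed often enough to be genotypable polymorphisms;
# drop singletons, which an array cannot usefully interrogate.

def select_chip_variants(allele_counts, min_copies=3, min_freq=0.001):
    """allele_counts maps variant -> {population: (alt_copies, total_chromosomes)}.

    A variant is kept if, in any single population, its alternate allele
    was seen at least `min_copies` times or exceeds `min_freq` frequency.
    """
    selected = []
    for variant, by_pop in allele_counts.items():
        for pop, (alt, total) in by_pop.items():
            if alt >= min_copies or (total > 0 and alt / total >= min_freq):
                selected.append(variant)
                break
    return selected

counts = {
    "rs_A": {"AFR": (12, 4000), "EUR": (0, 8000)},   # African-specific polymorphism
    "rs_B": {"AFR": (1, 4000), "EUR": (1, 8000)},    # singleton in each group
    "rs_C": {"AFR": (40, 4000), "EUR": (90, 8000)},  # shared common variant
}
print(select_chip_variants(counts))  # rs_A and rs_C pass; the singleton rs_B does not
```

Note that the population-specific check matters: a variant private to one group, like `rs_A` here, passes on its counts in that group alone, which is exactly the class of variation a European-centric catalog would miss.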
If we really did know where every element was in the genome, even if we weren't sure what a base-pair change there would do, you could either sequence or genotype across all of those regions along with what we have now. I bet that's where we end up ultimately, maybe five or ten years from now, if we actually knew where those elements were. But I think having the intermediate approach, where you're doing the exome chip with added content, gives us the flexibility to genotype those non-coding variants of interest. So I think that's a great intermediate. I think we may call the question here. That's good. So I think we've heard your advice about the emphasis in the RFA being more on genotyping rather than sequencing, at least in the early part. Bearing that in mind, can I get a vote for concept clearance for the PAGE renewal? All in favor? Any opposed? Any abstentions? Thank you. Thank you very much. Sorry to do this to you. We're going to give you a break, because the cafeteria upstairs closes at three, and I do not want to stand in the way of Council and coffee. So try to be back by three o'clock.