Right, and as both Eric and Rudy have mentioned, this is a continuation of a discussion that began at the September council round, when we brought two concepts that you approved, one that you asked us to come back and reconsider, and a few more that we previewed. So I'm going to talk about the ones that didn't go through last time. As a reminder, everything I'm going to talk about comes out of the July 2014 strategic planning workshop, which covered the topics that you can see up there. In the big blue bubble is disease gene and variant discovery, across architectures and across designs; clinical applications of sequencing and genome function came up as well, along with related informatics and technology development. And I should say I'm going to go through this background very quickly because you've seen it before, but if I'm going too fast, just stop me and I'm happy to slow down. Coming out of that workshop was a wish list of items that participants recommended for high-priority attention, again including this area of disease gene and variant discovery; clinical applications of sequencing, creating a virtuous cycle between clinic and discovery; a big need to enable capture, interpretability, and analysis of the world's sequencing data; and continued attention to genome function, especially as it relates to the interpretation of variants. A few things that we didn't really expect came out as well. One was to produce additional very high quality human genome sequences to serve as references. Another was to re-emphasize comparative and evolutionary genomics. I should say that a lot of these already have their own programs, their own centers of gravity, their own ongoing planning efforts, so I'm not going to talk about those further: clinical sequencing, the informatics, and the functional genomics.
And even this gold genomes methods development, which arguably falls under our technology development program, I'm not going to talk about either. But I will talk about all of these. We brought this idea, in the form of these two concepts, to September Council. There are RFAs on the street, as Eric mentioned, one for Centers for Common Disease Genomics and one for Centers for Mendelian Genomics, and we're looking forward to seeing the applications in April. So that brings us up to the present. I will talk today about the rest of the concepts: the genome sequencing program analysis satellites and the genome sequencing program coordinating center, which, together with the first two initiatives, really round out the core of the genome sequencing program. I'm also going to talk about this gold genomes idea, that is, the very high quality references idea, and about comparative and evolutionary genomics. And here is what I just said, in text format: we have the first four concepts, with the RFAs being the core of the sequencing program, and the two others. I'm going to talk about item three and item four separately, but I do think about them together sometimes, because they both grow out of even just a quick consideration of the Centers for Common Disease Genomics and the Centers for Mendelian Genomics. And it's really simple: there's going to be a lot of data, which means a lot of analysis opportunities, and that's concept three. And there's going to be a complex structure, so there will be many coordination needs. In September, as you know, I brought both of these ideas together in a single concept, and you asked us to rethink that based on a couple of things. First, as I understand them, there were concerns that both of these functions would be hard to get in one group. And another was simply that, because of how much data was going to be produced and the kind of data it was, we were going to miss an opportunity for more diverse analyses.
So that was the creativity opportunity. And then there was also the opportunity to disseminate the data out of the big centers faster if we had separate analysis groups. So that brings us to the Genome Sequencing Program analysis satellites concept. The satellites are proposed to carry out novel, creative analyses of the data produced by the Genome Sequencing Program as a whole. These analyses will cut across individual projects, and there will be many projects going on within the CCDG and the CMG, across different grants and even programs. So although we want the proposals to be founded on the use of the data coming out of the program, we're not going to limit applicants to analyzing only those data. There are other programs, there are functional genomics programs, there are other big data sets sponsored by the Common Fund and elsewhere at NHGRI. We want all of that to be fair game. We also want the satellites to help with cross-program analyses, whether defined in advance or with the program as it goes along. And we don't want them to do routine data processing; we think that's taken care of. So just to elaborate, the satellites will have two kinds of goals. The first are the outward-looking goals: improved or novel analyses for the non-automated aspects of characterizing sequence variants in the data, anything after variant calling. That's very much a lower limit, but we have particular interests and questions about association analyses, about using existing functional data to help make associations, et cetera; about improving study design to increase power; and about other higher-level analyses. We're not going to limit these. These first two bullet points, this outward-looking function of the satellites, will be investigator driven. The second two bullet points, though, deal with tasks that are program driven, questions that are likely to go across the entire program.
For example, for the common disease program: when is a common disease study really comprehensive or complete? We're also interested in the characterization and specification of sample sets that could serve as common controls. And we want the analysis centers to contribute some of their muscle to those cross-program tasks as well. Whenever we present a concept, we like to talk about the relationship to other NHGRI activities, and that's important here. We do fund other sequence analysis activities, some investigator initiated and some initiated by us, for example some of the analysis component of 1000 Genomes and the FUNVAR program. We're going to have to make sure that this is differentiated from those; we're going to have to look for overlap and try to avoid it. We will encourage proposals that take best advantage of the genome sequencing program data, and again discourage those that would duplicate the other efforts. And of course, the satellites will be an integral part of the program as a whole. As for mechanism, we feel we need cooperative agreements to facilitate coordination, and there are cooperative agreement mechanisms that allow investigator-initiated components. This next part is critical, and I would like to hear some discussion about where the boundaries are, similar to the discussion just before: we would like to make investigators and key personnel from the large sequencing center grants not eligible to be funded, in order to encourage dissemination. But there are all kinds of variations of that that can come up, institutional affiliations, co-PIs, things like that, that we have to think our way through. And we will write the announcement to increase the chances that each of these grantees will be working on a slightly different area of focus. We propose to fund this at $3 million per year for four years. We think this should be plenty for three to four awards, and we would like to start these as soon as possible to be in time with the rest of the program.
So I'm going to stop there. There are a few council members I talked to in advance, in depth, about this. I hope you all got a chance to read the concept document, but I'm going to call on them first for discussion. So I actually have to start with Joe.

I had a question about the minimum you described. Let's say the sequencing centers provide variant calls; then the analysis groups wouldn't redo that. But we already heard from Eric, I guess, about the intersection among the calls, for example for the AD samples. So how will that work if you're using data from one center that has one set of calls?

Yeah. It's hard to say in advance, without some specific examples, where that will be needed, because in the ideal case all the analysis satellites will have access to all of the data. The reality is that some of the projects, like ADSP for example, are going to have their own arrangements for a data repository. In the case of ADSP, it will be possible to get the data from GenBank and the NIA center. There may be other arrangements like that, and there may be other times where the analysis satellites have to go to the centers directly to get the data. But I'm hoping that, by and large, most of it can happen within these large projects, and there will be provision for the data. It is a really important problem, though; it grows with the number of analysis centers, and it does have to be dealt with.

Yeah. Dan.

Well, it seems like a really trivial comment, but I dislike the word satellite. It has this sense that the sequencing centers are the center of the universe, that everything revolves around them, and that the satellites are almost an afterthought. And I really think that if we're going to put all this money into genome sequencing, this is a really, really critical initiative. So I just wish there was a better word.
And of course, when I thought about making this comment, I told myself I would be smart if I had a better word, and I don't. But I'll put that flea in your ear before the RFA gets finalized.

Eric.

I have a bit of a high-level comment. I'm concerned, and it's not directed at you, it's at us really, that we don't commoditize science too much. These major national initiatives, if they rise to that level of importance, I think the key to their success is attention from A to Z: study design, procurement of the samples, definition of the phenotype, the details of the calling, the QC, the analysis, all the way through. And if we break this up into little bitty parts and fund those little bitty parts separately, I'm afraid we're going to lose the cohesion that would make some of these extremely difficult problems tractable. That's not a criticism; it's an observation. So I just think we all have to be careful that we don't over-commoditize this. The other thing that happens when you commoditize something is that everybody thinks what they're doing is the most important and what the others are doing is less important, whereas what is really important is what brings everybody together, working on that unified goal.

All right. Certainly, in discussing the tasks of the CCDGs and CMGs versus the analysis centers versus the coordinating center, I view these as overlapping, right? I do think that the analysis satellites, as they're proposed, have a little more leeway to be outward looking, more venturesome, and I imagine that a lot of the centers are actually going to be paying attention to individual projects and the particular issues within them. The way I envision this working is much the way it already works in any collaboration that's functional: everybody who's interested in a problem is working on it together. And exactly for the reasons that you state, I think that's the way to get the most out of it.

Carol.
So I actually had a different take on it, which is that I completely agree with you, Eric, but we also need to invest money in groups that are going to take these data sets and do novel things with them, to really push the boundaries of how we use computational approaches to do data-driven science and hypothesis generation. So I think we need innovation centers like this, computational innovation centers, and they have to be funded. And I think it's important to do that in addition to funding the A-to-Z, end-to-end projects. So I like the concept very much, about bringing it in and expanding it. But the question is how you balance this against the sequencing centers, which are also going to want to analyze the data and go forward on this. And so I'm just wondering out loud: you're going to have $3 million, if you will, for the general pool, for people outside of the major centers. I almost wonder if it's not worth having an additional $3 million that comes out of the centers, for which they then also compete. So think of it as a $6 million pool, so that we get the very best ideas coming in for how to analyze the data. I'm looking at something like this because I don't want the sequencing centers not participating, but I would like there to be some type of competition, so the very best ideas are able to get in from both sides, alongside the standard soup-to-nuts approach Eric was talking about. I don't know if that makes sense. But the idea would be that the sequencing centers are also competing for some number of dollars that come out of their budget for how to analyze the data, and then there's another amount that expands it beyond the sequencing centers. In essence, you're getting a lot of good ideas coming in, you're funding across both sides, and you have a way of comparing them, versus trying to put it together later on.
Does that make sense?

I think the intent makes sense to me. I immediately start thinking of mechanisms to do it, and I would have to think more about what's possible.

It's just that I hate not having the sequencing centers able to think about some of these interesting analyses, but I would like to be able to expand it.

But there's nothing that would prevent somebody from collaborating with somebody in a sequencing center. I think the restriction was that they couldn't be the PI, but it doesn't eliminate collaboration. Is that correct?

That's correct. And the sequencing centers, as part of their entry into the competition, are going to need to demonstrate analysis capabilities anyway, right? So the way I view this RFA is as broadening the analysis group to include people outside of the sequencing centers who can bring expertise to what are going to be very large collaborative projects anyway. We imagine there are going to be a bunch of people who want to join the analysis groups simply because they want to join the analysis groups; that's the way 1000 Genomes and other such projects have been run. So I think it's to create the infrastructure to make that happen, so that the sequencing centers aren't potentially viewed as sequestered and on their own. Rather, in these kinds of worlds, as Eric was pointing out, where we really need to bring everyone together, you're going to need some funds allocated to an analysis center to make the whole consortium basically happen. But not all the analysis for the project is going to be done by the people at the sequencing centers, or even at the satellite analysis sites. My guess is that these will be pretty broad, collaborative groups that come together to tackle this, because that's what we need to do.

Elana.
So as you describe the four bullet points and objectives here, the first two seem like areas where innovation could be supported by any number of mechanisms outside of this RFA. It's the last two that are really relevant, where the added value is, because the groups would be sharing data, sharing common problems, meeting to discuss what works and what doesn't, and so on. And there's a lot of precedent for that, going way back before even the GWAS days, where that really added value for exactly the reasons raised here: not as a replacement for what's going on among the data generators, in this case the sequencing centers, but as an adjunct to it. That seems really, really valuable; it adds tension and it just raises the bar for everyone. So I think that's really exciting. I think the next topic you're going to raise is the harder one, because how do we get them to share those data and get them on a common footing, where there is real added value, to maximize what the analysts can have?

But that's why I thought a potential competition, if you will, in the granting mechanism could help: you in some sense establish that as part of the review process. Instead of everybody coming together and then figuring out how we're going to do this, you have some type of competition, with the sequencing centers and with the outside centers, to come up with the analysis strategy. In essence, review is helping to drive the priority.

So I think that could happen with the scheme as it exists. In fact, I'd expect some of that to happen, but you were taking it a step further and trying to raise more matching money, from the centers themselves, to do that, weren't you? Which is interesting. So Howard, maybe I'm not understanding you. Are you talking about an initial competition or an ongoing competition? Because my concern is that as soon as you put money up for grabs, you're not going to get people collaborating.
You're going to get silos going up.

So I'm thinking of an initial competition. Think of the Common Fund, if you will, as an analogy (I don't know if that's a good analogy or a bad one): the resources are pulled out of the centers, the centers compete for them, and the people who aren't part of the centers are also competing for those dollars, and we get the very best analysis ideas from inside and outside the community. They're reviewed, they're competitive, and from that point we can start going forward. I know the centers have analytical capabilities; that's why we want these people to go forward on this. I'm just trying to figure out a way to do it a little bit differently, and maybe it's a bad idea.

Yeah, and Howard, just to make sure I have it right: you're talking about some portion of the funds being reviewed internally to the system rather than bringing in additional groups. I said three to four additional groups because I'm worried that if we go to five or six or seven, it will be very hard to coordinate.

But Adam, in this context, it might be worth discussing the timing. What will we know about the production centers at the time people are applying for this?

Yeah. That's one of the things that makes it tough. It would be in some ways easier if we knew what all the projects were ahead of time, and we may not have that luxury.

I actually had two questions. One is: is TCGA a good model, or a counter-model, for this business of having the sequencing centers do some of the analysis work while a central group pulls things together and also goes beyond what the sequencing centers are doing? Yeah.
Actually, scientifically, I often think about the pan-cancer analysis as an example of cross-cutting analyses. I know how those arose, but it's the kind of across-disease analysis that I find very exciting. And it's an explicit intent of the RFAs, especially the Common Disease RFA that's already been issued, that that happen. As I talk about the coordinating center, I'll get more into that.

So I sort of envision that each sequencing center is going to have an important software component and an important analysis component, but it won't go all the way, and this coordinating center will be a tremendous value-add because it'll cut across all of them and perhaps go further.

We haven't gotten to the coordinating center yet, but I'll get to that. The analysis centers certainly could provide some muscle for that.

That's what I meant. I meant the analysis centers, not the coordinating center.

And I actually don't look at it as a clean divide, as I think you're presenting it. There will be overlap. Inevitably, there will be good people in every one of these funded components who are extremely interested and motivated to work on a problem.

I didn't mean to suggest that it was a clean break. In fact, quite the opposite: they should be bleeding into each other.

That's exactly it. And Adam, can you clarify, to get at Howard's point: the applicant can't be the same PI as on those grants, but it's not like individuals with that expertise from those institutions can't apply, right?

In this case, there are a couple of considerations. I'll just lay out the principles that I understood from the previous conversation and what I've brought here. We definitely don't want the PIs of the CCDGs or the CMGs to also be PIs here. I think there's a principle of independence, of intellectual independence, that's important, even though they're all going to be collaborating.
And I think that leads to the more creative science. The second principle, which I briefly alluded to, is the issue of pushing things out from the sequencing centers. To the extent that a CCDG is very strongly identified with an institution, that could become a factor, is the way I'm thinking about it. But the question is whether to draw really bright lines, or to draw some bright lines, state principles, and let funding decisions do the rest. And that's what I'd like some council input on. How strong is the principle of pushing things out, compared to getting the best science and the other criterion of intellectual independence?

Well, I do think you should be clear on whether you're going to draw that line deeply in the sand or just etch it, because the people in the sequencing centers will have a competitive advantage. They have the data, they have familiarity, they're already doing a lot of this. So if we want to push it out, then we should push it out. But if you draw that line softly, it's going to be a hard thing to undertake at review.

My sense is that a successful applicant for the large-scale sequencing program will already highlight their internal computational capacity, what they're going to bring to the analysis table, and the point of this RFA is to bring in people who wouldn't otherwise be at the table. And going forward, because as we might imagine this will change over time in terms of how the analysis actually happens, I would also encourage being open, so that even if people aren't funded, they can participate. There's a strong culture of that already at NHGRI; again, I mentioned projects like 1000 Genomes, but even in ENCODE and ClinGen we've got people who aren't funded investigators but want to participate and contribute.
So I think there are a lot of ways of making that happen, but I personally would agree with Lon that if you want to push it out, then push it out.

Push it out, okay. Yeah, Jay.

So I don't know if this is the right place for this comment, but one of the things I liked in what was written here, and I think it's both here and in the coordinating center concept, was this notion of these centers being tasked with deciding when a study is complete. Right? And just stepping back, looking at previously funded studies and things like that, I think one thing that's been challenging in this era of exome- or genome-based studies is getting negative results published. I think they're important to report, to understand the basis for failures and to lend insight into what we do in the future. But there's obviously not much incentive for the people out there doing the studies to publish negative results. And I just wonder if this can be expanded to include not just deciding when things are complete, but also ensuring that negative results get reported so we can learn from them.

Yeah, happy to do that. And that would also become a task, an item for attention, for the coordinating center, as I will present it. A similar comment was strongly made when the first two concepts were approved; that has to be a feature of the program. Yes?

Yeah, I agree that the rules have to be set up up front, because people then apply or don't apply, and they shouldn't end up in a situation in which they don't know what kind of collaboration is expected of them. I think this needs to be formulated a little more crisply. There was some language on one of the slides about data that are not coming from the centers, suggesting they could work on that too. I think that's a little confusing, and perhaps not the point to confuse here, because if it's part of a program to deal with data from that center, then that's what it is.
And not additional data coming from elsewhere.

So one of the things we'd like these centers, the analysis satellites, to do is to be more outward looking, to have the luxury of being more creative and more outward looking than the sequencing centers alone. So while we wouldn't accept an application that didn't propose to work on the data coming from the sequencing program, we would very much like proposals that are creative enough to feel free to incorporate other kinds of data: for example, functional data and other types of data, specifically to make the identification of associations better, to be able to make functional inferences as a means to improve power. We've heard over the last year, from within the program and other places, about combining data in this way, and we see it going on in projects already, where people are using functional information, for example from the ENCODE project, to help them make inferences about the variants they find. And I think there's a lot of room for creativity and advance in that kind of area. It wasn't meant to say this is what you're going to focus on alone; it's the kind of data you are free to bring into your analysis.

I think my comment was just that if they're supposed to collaborate, and if there is a set of mandates for that collaboration, then that needs to be weighed against the creativity as well.

I'd like to invite comments from people on the phone. Val or David, do you want to jump in here? Good.

Can you just briefly comment on the distinction between what the analysis centers are doing and what's being set up in FUNVAR?

Yeah. Lisa, are you here? FUNVAR. We've had some complaints that those remote mics aren't working. So FUNVAR, as you probably all know, is designed around computational approaches that try to figure out actual causality, that is, which variants are causing the signals under peaks, with experimental validation.
And we realize causality is a really strong term, so the computational approaches don't necessarily have to prove causality, but we're trying to move in that direction. So it's a specific type of issue that's being addressed there. We haven't made the awards yet; when we do, we'll have a little consortium that will meet every month or two, and in person once a year. So in terms of the questions for this concept: I don't know how much, Adam, you meant for it to go into causality. I think it's possible that somebody could propose to do that. What we've been thinking is this: we have been getting the occasional application outside of the FUNVAR RFA that is interested in causality. Good, it's an important question, and one of the points of doing an RFA is to stimulate interest in a field we think is important, so we get these unsolicited R01s coming in in this area. Our feeling was that we would simply sweep in anybody who's doing that; we would just put them into the consortium, because it's supposed to be a collaborative consortium where people exchange ideas. It's not any sort of special deal; anybody who wants to talk about it is welcome to do that. So if, out of this particular RFA, there was work on causality, they'd be welcome to be part of the FUNVAR discussion. As you know, it's a huge problem, so the more groups that are working on it, the better. Does that answer the question?

Other questions for Adam?

David Page here.

Go ahead.

So just a point of information. In this clearance and the next, there's some focus given to this question of defining when a project is complete. And what wasn't clear to me is what action follows upon such a decision. And just to make some light of it, I don't know that I've ever worked on a scientific project that was ever declared complete.
But if you were to put such a focus on declaring a project complete, it implies that there's some action motivated by such a decision. So what would that action be?

Right. So when the CCDG concept was presented, this was in there as well: we have an intuitive notion of when we're done finding all the variants that underlie a common disease. It seems simple to state, but think about different disorders, about technological limits, about how far you want to go with effect size, or about how many individuals you can practically sequence, to the point where what you find is no longer making a difference even if you could put in another 10 million dollars. This is a question that has an approximate solution. But as was stated in the workshop, I think by Mike Boehnke, we may have to go farther than we think in order to know what the right stopping point is. So it's also a research question, and it's something that can only be addressed in consideration of a number of designs and a number of different disease, or presumed disease, architectures. So while we expect applicants to have an opinion about this when they apply, we also understand that our reckoning of it will change, based on scientific and practical considerations, as the program produces data and analyzes it. That's really all that was meant by it. It's not a definitive notion; it's an intuitive one. We've already heard people propose very concrete ideas about what saturation in such a study means, and that's a possibility as well.

Eric Boerwinkle, do you?

I would give a very practical answer to the question: it's when NHGRI is no longer going to support it. It could be that the disease-oriented communities and ICs will pick it up and run with it, because they have a different purpose in mind.
It could be that pharma would pick up those findings and run with them, because they have a different purpose in mind.

Right. So I would avoid saturation and talk about completeness in a very practical setting for the large-scale centers or these various programs. Because many of us have seen and shown these kinds of curves: the more you sequence, the more variants you're going to find, many more, and they'll be associated with disease, until you basically sequence a large enough sample that indeed every base is variable.

Which I don't think is on the table, at least right now. Ready for a vote? Can I have a motion to approve the concept? Second. All in favor? Any opposed? Any abstentions? David, Val, and Amy, why don't you just email your votes? Okay. All right.

Okay, phone folks, I think we're going to take a break now. Let's keep those caffeine receptors saturated. So go upstairs and be back about 3:10.