All right. We're due for a break, but I would like to do this last concept. Council, are you on board with that? Okay. Adam, come on up, please. So Adam is going to present a renewal for the Human Genome Reference Program. There are actually three separate FOAs as part of this package.
Hello, everyone, and thank you for the opportunity to present this concept clearance for renewal of the Human Genome Reference Program. Before I start, I want to thank my colleagues who have worked on this and contributed to every aspect of it, including Xander Arguello, Sarah Curran, Carolyn Hutter, Nicole Lockhart, Heidi Sofia, Mike Smith, and Kris Wetterstrand. So, some basic background for what I'm going to say today. This is just a cartoon trying to get at the main purpose of the Human Genome Reference Program. The basic goal has been, and will continue to be, to develop an improved human reference sequence. At the top, this is sort of representing the current version of the human reference. The current version has no haplotype information. It's basically multiple haplotypes stitched together, different ones from the same person as well as from different people. It has some gaps, frustratingly often where structurally interesting regions of the genome are. And most importantly, it has very little population diversity represented. So most of the reference, again, is from a single individual, the rest from perhaps a dozen or so others. This is still good. It's very useful. It's highly supported. It is updated over time as gaps are filled. It has excellent annotations, is very widely used for basic and clinical research, and you know it as GRCh38, but it is not good enough. So what would be good enough? Something more like this, that has no or few gaps and has individual haplotypes resolved from each contributing individual, faithfully represented. And key to a useful human reference is the opportunity to include genetically diverse individuals in the collection of the genomes.
If we can have this, the reference will be more representative of human haplotypes and will better capture variation present in human populations, especially structural variants. It's likely to make the reference more useful for basic and clinical applications and less likely to lead to reference bias and potentially health disparities down the line. But it's not enough. You need a way to represent many genomes at once, to conceive of them, to think about them, and most importantly, to compute on them. And this is one way to think about how to represent multiple genomes, at least conceptually. Regions where genomes are the same only need to be represented once; where there are SNPs or indels, the genomes containing them are represented separately for just that region. Inversions, duplications, and other structural variation can be represented as well. A reference resource that represents many genomes at the same time is called a pan-genome. So, a little bit about the current program. In 2019, we started the Human Genome Reference Program to move towards a better human reference. The goals were to build a multi-genome reference or pan-genome resource that adequately represents human variation, with an explicit goal of adding 350 individuals to the reference; to develop better ways to represent a multi-genome reference; to build and maintain a useful pan-genome resource, including outreach to the community; and to further develop methods to sequence and assemble very high-quality genomes. Since we started, we have added an embedded ELSI component across multiple aspects of this effort, and a supplement to fund the encouragement of international partnerships. Here's a visual outline of the components that were funded to carry out this work in the initial iteration of the HGRP.
The Human Pan-Genome Sequencing Center identifies and brings in new samples, sequences and assembles them to very high quality, develops standards, and adopts consistent methods for producing high-quality genomes and genome assemblies. The awards in the genome reference representations component do R&D on the computational means to represent multiple genomes, and so far these are graph-based or related methods. The third main component is a Human Pan-Genome Reference Center, which is responsible for bringing program components together to make the reference usable and available. It is also responsible for coordination and outreach. There is an ELSI component embedded throughout. It's involved in practically all aspects, going well beyond consent and data use issues to include sample prioritization, considerations about population inclusion and naming, international outreach and outreach to indigenous populations, and anticipating issues in using a pan-genome reference in a clinical setting. There is also a high-quality genomes tech development component. The grantees together form the Human Pan-Genome Reference Consortium, or HPRC. Their funding in the last year is approximately $8.7 million, approximately $8 million of that for the first three elements and the ELSI component. The grantees of the program have also established important connections needed to make their effort work for data production. There is an active collaboration with the Telomere-to-Telomere Consortium for improving the state of the art and producing very high-quality genome assemblies. The grantees also work with GA4GH to help establish international collaborations and also collaborate directly with the Genome Reference Consortium, EBI, NCBI, AnVIL, and others to help deliver the resource through multiple activities, including annotation, delivery to the community, and data availability.
The HPRC is not just the institutions listed on the previous slide, but also includes subawards and collaborations with investigators from Stanford, University of Washington, Coriell, Dana-Farber, Mount Sinai, AnVIL, Sanger, and the Max Planck Institute, and, not shown here, associations with a number of associate members. Progress has been very good. These slides are taken from a presentation to Council in September from Deanna Church, which some of you may have seen, and that's available publicly in the materials from September Council. The current program, in the middle of its fourth year, has built a solid foundation for this work. The investigators have produced long-read, high-quality genome assemblies from over 150 individuals, and are likely to more than double that in the next year and a half. Everyone wants to know about sample diversity. The samples for the effort so far have been selected from a reasonably diverse background, all from the 1000 Genomes Project. Non-1000 Genomes Project samples for year five are being recruited. Excellent progress has been made on developing useful pan-genome representations. This is just an example of a detailed graph view of a well-known, structurally challenging region of the genome. Results will be published soon on the initial release of a draft pan-genome reference, including a subset of the 150 that have been sequenced so far, showing how these methods are performing and how the pan-genome reference can be used. There is a preprint on bioRxiv, and the link is in the text of the slide. Moving on from the background to the concept proposal, the current NHGRI program will end in 2024. Now is the time to consider whether we should renew the program, and if so, what changes should be made. To start this off, in October, we hosted a workshop to survey the genomics community about these considerations and to provide other advice to us.
The workshop report is available, linked from the slide notes here, and also from the text of the concept document. It's also on the HGRP web pages. The workshop covered four main topics: samples and sequencing, representation and implementation, dissemination, and engaging worldwide partners. At a high level, the participants thought it was important for NHGRI to continue to support the human genome reference resource. The extensive and detailed discussion at the workshop is here oversimplified and reduced to these four high-level points, but these are, I think, the most important points. Participants indicated that we should stop thinking so much in terms of the number of additional genomes that need to be added to the reference. If the goal were to capture all human variation, then that task is potentially endless. Instead, they advised shifting the goal to emphasize utility by demonstrating benefit to a broad array of genomics researchers and clinicians. Following that reasoning, they advised that we emphasize adoption and implementation during the next phase. This can be done by identifying key adopter projects to demonstrate benefit for particular use cases. These could be, for example, intensive short-term collaborations with other genomics consortia that are motivated to adopt the pan-genome reference, and so on. Along with that, they recommended an increased emphasis on developing informatics tools for use of the pan-genome by the community. Participants advised continuing to establish partnerships with international organizations to maximize the chances of achieving equitable benefit. And finally, they advised that ELSI considerations be taken into account at all stages, including project design, improvement, adoption, outreach, and access. This cartoon is a visual abstract of what we're proposing for the next iteration of the program based on that advice.
This top line represents the current flow of the program, from sampling, through sequencing and assembly, through pan-genome development, to community engagement and outreach on the right, with everything supported by embedded ELSI. The outline of the new program will be very similar, with some noted changes in green. New samples will be collected, additional high-quality genome assemblies will be added, and a substantial embedded ELSI effort will still be included, roughly at its current level. For the new program, we want to significantly expand the outreach elements there at the very end, in particular the development of computational tools to allow broad use of the pan-genome reference and the facilitation of adoption by the community. Consonant with this, the goals will also be refined to emphasize utility of the resource and not just adding genomes to completely represent human variation. At the front end of this process, the number of genomes specified will be more incremental in order to help us learn how to connect the needed number to the new emphasis on utility for the community. We know that more R&D is needed both for technology development for high-quality genomes and for development of pan-genome representations. But investigators interested in this aspect will be encouraged to submit to our regular investigator-initiated programs in technology development and informatics as the pan-genome becomes more accepted as a routine resource for doing genomic research. For the specifics of the program renewal, we propose three FOAs. The first will be a Human Pan-Genome Reference Center, and this will really be the coordinating center from before. They'll construct and release new pan-genome reference versions and implement state-of-the-art reference representations for community use.
There will be a new emphasis on community outreach and adoption for use of the reference, which will use an adopter projects model and also other engagement and training. It will develop basic tools and informatics infrastructure for applications needed by all for use of the pan-genome reference, something like lift-over tools, and will aggregate those and other tools. They will coordinate outreach and collaborations, including international ones, and they will liaise with other genomic resources that do complementary work, so the Genome Reference Consortium, NCBI and EBI, AnVIL, et cetera. They will be the logistic and scientific coordinating center for the NHGRI program, and this is proposed as a U-mechanism cooperative agreement with one award at about $3 million per year for five years. The second component will be an FOA for a Center for High Quality Reference Genomes, and this will be the production component, essentially high-quality genome production and assembly. So they're going to identify, recruit, and collect new samples, or possibly use existing data sets if they meet quality standards; establish criteria and metrics for prioritizing new assemblies for the pan-genome; balance scientific and social priorities for selecting samples; consider quantity and quality; and emphasize utility for basic and clinical studies. We propose that they generate reference-quality sequence data for at least 200 diploid genomes. This number may change based on ongoing findings and changing sequencing costs. This component will also support the embedded ELSI research into the creation and use of a human pan-genome, including topics related to consent, equity and diversity, sample recruitment, selection, population naming, regulatory issues, data privacy and sovereignty, regulatory aspects of clinical use, and engagement with international and indigenous communities.
And we're proposing this as a cooperative agreement as well, with one award at $3 million a year for five years. The third is a new FOA we are proposing to address the high-priority need to build informatics tools for use of the pan-genome reference. Emphasis will be on tools for common use cases that are relevant to different broad sectors of the genomics community, for example, clinical genomics, population genetics, or functional genomics. Possible examples include selecting the best subset of linear genomes or paths along the graph for a given set of samples that are being analyzed, visualizing complex variation in disease-associated regions, annotating functional elements, and others. These tools will complement those developed by the Human Pan-Genome Reference Center, with the latter, again, focusing on general infrastructure for pan-genome use. We're proposing this as cooperative agreements, four to six awards, $2 million total in year one, rising to $4 million total available in years two to five, and again, multiple receipt dates. It's almost impossible to list just a few things when talking about the relationship to ongoing activities, because this effort will have to be well coordinated with lots of genomics. The consortium will continue to pursue relationships with international organizations with common interests, like GA4GH and H3Africa and others. The general intent is to position HGRP awardees to be leaders in developing a global community. The consortium will liaise with other large genomic resources, as I've mentioned before: AnVIL, EBI, NCBI, and others. The consortium will encourage associate membership for independently funded investigators pursuing common goals. And they will be expected to leverage existing investigator-initiated programs that could mesh, for example, technology development, genome assembly, pan-genome representations, even ELSI research.
And we've thought about issuing a notice to stimulate some of these in that context. We propose issuing the Human Pan-Genome Reference Center and the Center for High Quality Reference Genomes components, those first two components, as limited competitions. These integrated centers have laid a solid foundation for a community pan-genome resource. They've created an open community around the construction of the pan-genome reference. They have established numerous key relationships and collaborations that would be difficult to reestablish. This includes the embedded ELSI component and the policies that they've set, interactions with associate consortium members, and international partnerships. Rebuilding this community would cause delays. I emphasize that these applications would still be peer reviewed, would need to be well received by reviewers, and would be subject to negotiation with NHGRI prior to funding. There are, of course, trade-offs here. We don't want to rebuild the foundations of a pan-genome community. We do want to rebuild the... yeah, sorry. We don't want to rebuild the foundations of a pan-genome community resource, but we still do want to encourage new ideas from the community, and not just for the sake of the new ideas, but also to keep the community close to the resource for honest feedback. We think we can do this in a number of ways. First, the tools RFA will be an open competition. Second, investigator-initiated applications on topics related to the pan-genome, as I've mentioned, higher-quality genome assemblies and basic work on pan-genome representations, would still be responsive to our general program announcements for investigator-initiated work. These investigators can be brought into the consortium as associate members, and I note that investigators who are not funded by NHGRI at all can be, and in fact already are, associate members. The adoption projects I mentioned will also bring in new ideas. So I'm going to wind it up here.
This is a summary of the proposed budget, with some things to note. Again, the tools FOA would have multiple receipt dates and go up to $4 million per year in the second year and stay at that level. We would need to decide whether an additional round is justified in, say, FY28. We can think about it closer to that time. Separately, I should say, we would like to look for opportunities to provide supplements for potential adopter projects. So the Pan-Genome Reference Center will have plenty of funds for some of the adoption projects, but we think it's important for the adopters to have funds for this as well. And it is our intent that adoption projects will not come solely from NHGRI-funded consortia. So I will stop there and ask for discussion.
The two discussants are Lynn Jorde and Stephen Rich. Lynn, would you please start?
I'll start off by saying I'm just generally very, very supportive of this concept. I like the fact that...
Lynn, your sound is... You froze on us, Lynn. Here we go to Steve. Yeah, let's go to Steve, then. Steve, you want to step in, please?
Sure. Not knowing exactly what Lynn was going to say next, I'll start from scratch. I'm very supportive of this. I think that the pan-genome, if anything, is sort of the forefront of genomics in that sense. I don't see any other institute doing this. The established relationships are important. The progress being made is important. I think one of the things that will need to be thought through a bit more carefully, especially when you start getting into a larger approach to understanding what this is going to evolve into, is that there's going to be support for developing tools for the pan-genome and how it can be manipulated. I think it's going to be really important to also have support for those who use the pan-genome and how it can be used.
And that may actually imply that you need to have some sort of way for the Pan-Genome Consortium to really reach out to those who develop analytic methods and tools for the analyses, because you want to build a community of users. And I'm biased, but I think a lot of the users of genomics these days are those who are doing standard disease discovery, variant-to-function, and co-localization based upon existing genomes. And you need to get those people involved. Otherwise, it'll be an interesting technology that doesn't really advance its use in the field. So I think you have to figure out the key ways of doing it, whether they're case studies or developing workshops, as far as the educational dissemination component. One question I had, looking at the original diagram that you showed: you also had, in terms of the pan-genome, going into what is really epigenomics, with ATAC-seq and other things. As far as I can tell, that really hasn't happened at this point, although it does make sense in many ways. But it seems like it would open up a whole other area, another entire set of work, that maybe would be sort of taking away from the focus on the use of the pan-genome by developing a pan-epigenome. So I guess that's sort of the thing that I was wondering. But now that I see that Lynn is unfrozen, I can turn it back to him.
Thank you, Steve. I apologize, Zoom kicked me out at the most inopportune moment. Sorry about that. Steve and I have discussed this a little bit beforehand as well, so I know I agree with his comments and perspectives. And I was starting to say that just in general, I support the concept of the Pan-Genome Project: its inclusivity, not just of a greater diversity of people, but of the entire human genome. So I don't think I have anything else I need to say at this point. Thanks.
I do want to briefly respond to what Steve said, so thank you both, and thanks, Steve.
I know that there are a lot of different things that can be done by way of outreach, and of course we'll ask applicants to propose those, but it would be good if we had an idea of what the successful ones were likely to be before that. So that's helpful. And as far as the epi-pan-genome, of course I've heard the idea before, but I think that just the pan-genome, getting this out and getting the community used to it, is a big thing on its own.
Tim, I saw your hand up.
I think it was third though. I think Hal and Judy were in front of me.
OK, now I see the whole screen. So Hal, Tim, and then Judy.
So I'd like to amplify the message shared by Steve and Lynn about engaging diverse constituencies of end users early. My only other comment is that I hope that the broad term "tools" encompasses very careful consideration of the user interface, making sure it's able to engage medical users or less technology-driven users to enter the space, to start to explore the capabilities, and to not get turned off early on. So I'm wondering whether artists, data architects, even psychologists might have a place in conceiving what the user interface might look like.
Yeah, that's a good point. And this was discussed at some length at the workshop. Workshop attendees were not sure they had any easy solutions, but they certainly were thinking about accommodating different levels of sophistication of use or of detailed use, and also different parts of the genomics community, including having different views for different communities.
Okay, Tim, and then Judy.
Sure. So, yeah, thank you so much. I really appreciate the program. And I've been sort of tuned in as it's been developing over the past several years. I've been quite impressed by the progress.
One sort of minor question, and maybe this echoes a little bit of what Steve was talking about as well: I notice that limiting the tools to informatics might exclude some creative ideas that are maybe less informatic, maybe more experimental or more sort of lab-based applications and uses that could broadly increase the impact of the pan-genome. And I was wondering what you would think about maybe expanding the scope of that, or maybe limiting it a little less narrowly than just informatics.
Tim, can you say a little bit more about what you mean by lab-based assays?
Well, I don't want to be too specific because I think the community could be quite creative, but I could imagine maybe there are ways to sort of manipulate pan-genomes using CRISPR-type technologies that involve reagents, or to specifically target certain populations that are studied in the pan-genome. Or I could imagine ways to synthesize DNA from pan-genome individuals. I mean, these are super broad ideas, but I can imagine the community could have a lot of creative thoughts on ways that go beyond what might be narrowly construed as informatics that could actually broaden the use of the pan-genome.
Yeah, so we could. I worry about staying within budget and diluting things too much. And I also think, again, it seems like a lot of these would be potentially responsive just to the regular NHGRI portfolio. So I do think there would be room for that kind of thing. I acknowledge that at first it might not seem compelling to people who haven't really started to use the pan-genome. And that's going to happen unevenly. And then they might be at a disadvantage or something like that. It might be hard to get those funded. But over time, I think that will be less and less of a problem.
Thank you.
Judy, go ahead.
Yeah, I'd like to just generally endorse the distribution of limited versus open competition for the tools development.
And that's going to be very exciting. So my comment or suggestion is about the latter one, in terms of the tool development. Really, the FOA has to include the FAIR principles of accessibility, standards, and interoperability, and even sharing on open-access platforms like GitHub. I think there's a real opportunity for the community and the applicants to be very creative here. But you really want to enforce early sharing and modularity. I agree with Hal's comment that you want to be creative on the back end as well as the front end to really enhance utility. Thank you.
Any other questions or comments from Council?
I have one. I completely agree with all the comments, this is Olga, sorry, with all the comments that were already made. And just to add to Judy's comment about tool development, I think another aspect that would be really important is evaluation. So, having some systematic plan for how to evaluate, so that we eventually come to a point where we might actually have a standard way of calling variants, so somebody could download a set of variants with certain diversity criteria that match their cohort. That would be the amazing goal, that they don't have to do this themselves. They can just get it from this pan-genome project. I think that will be transformative. But evaluation is important. So it's not just a novel tool. It's actually one that's better, or just an evaluation of a better implementation of an existing tool.
OK, Gerald, can I see the whole screen? Thank you. Any final comments? So, Council members, are you comfortable voting on all three FOAs? OK, can I get a motion to accept the concept, and a second? All in favor? Anyone opposed or abstaining? Great. Thank you. Thank you, Adam. Thank you all. All right, you deserve a break. Let's go to 4:20 and resume then. 4:20 East Coast time. Thank you.