So now I'd like to start us off with an illustrious researcher with a remarkable career in comparative and functional genomics. So Harris Lewin, I am going to read this to make sure I get it right, is the... Can you hear me now? Okay. So Harris Lewin is the Robert and Rosabel Osborne Endowed Chair and Distinguished Professor of Evolution and Ecology at UC Davis in the latest incarnation of his career. He is also the chair of the Earth BioGenome Project and a member of the National Academy. And as I said, he is probably known to everybody here because he is one of the guiding lights in this field, so we're very lucky to have Harris.
Well, I'd like to thank Jen for that very nice introduction. And I'd like to thank the organizers for inviting me to the workshop and specifically for the honor of allowing me to give the keynote address to this group. My assignment from Jen and the organizers was to help set the stage for the workshop and to provide a very broad view of comparative genomics across all of life's domains, with an emphasis on forward thinking for the development of NHGRI's next 10-year strategic plan. Before moving forward, I'd like to take a brief historical dive into the past to see how we got to where we are today. So the method of comparison, we just heard this very nicely from Adele, has been a very critical methodology to philosophers, to biologists, and to medicine for more than 2,000 years. And of course, Aristotle, in his very early classification system for life on Earth, recognized that before you can understand the differences between species, what makes individual organisms different, you have to be able to group them by their commonalities. And those were really the roots of what came 2,000 years later: in the early 1700s, Linnaeus used the same method of comparison to create the Linnaean system of taxonomy, which we are still using today, 300 years later.
Of course, zoom ahead another 150 years, and Darwin used the power of comparison both within species and between species in his exposition of the theory of evolution by natural selection. And more recently, maybe 150 years on from Darwin, we have the first example of the use of genetic techniques by Carl Woese, who compared 16S ribosomal DNA sequences, those DNA sequences encoding the 16S ribosomal RNA subunit, and was able to forever change our view of the evolution and history of life on Earth by recognizing that what were then called the archaebacteria were not bacteria at all. In fact, they were a separate domain of life, the third domain of life, clearly distinguished from the bacteria and the eukaryotes. And so these very powerful tools of molecular phylogenetics across the different groups of organisms shed great light on our understanding of the history and evolution of life on Earth. But this is the mid-70s. The term genomics wasn't even invented until 10 years later. And those of you who've been around long enough remember that this term was first suggested by Tom Roderick, a biologist at the Jackson Laboratory, at a pub not far from here in Bethesda. And this story was written up in "Beer, Bethesda, and Biology," on how genomics came into being. Genomics as a field, as a discipline, was born not far from here. And the early practitioners of the field of genomics, particularly comparative genomics, were the mappers, the comparative mappers. And we're very fortunate that some of those folks are still around with us and still very active. I didn't know my friend Steve O'Brien would be here today. But that's his picture there, and that's the real thing right there, one of our founders.
And we're so fortunate to have people like Steve and Jim Womack, again, those of you at USDA will remember Jim, and people like Jenny Graves in Australia, studying the weird mammals of Australia and trying to understand the evolution of sex through the study of the evolution of sex chromosomes. So Steve with his cats, and Jim with his cows, and Jenny with her weird animals, this was really the beginning in 1984. And that's about when I came along, little Harris, as Steve used to call me in those days. I was trained as an immunogeneticist and I was working on the genetic control of disease resistance. And I heard a talk by Jim Womack in 1985, and Jim put up this slide that showed a comparison of the organization of the cow, the human, and the mouse genomes. And the markers in those days were biochemical markers and serological markers. There weren't really molecular markers in those days. But I became completely enthralled with this idea, because I was working on the major histocompatibility complex and I just thought a cow was the MHC with an udder, and others thought that a mouse was chromosome 17 with hair. I mean, we had a very limited view of comparative genetics because we had very few loci that we could study in those days. And that interest at that time eventually, over the next 10 or 15 years, led to my own interest in comparative genomics and evolution, as we recognized that these same techniques we were using in comparative genetics could be used across the entire genome. And it was only, we believed, a matter of time before maps went from tens of markers, 20 markers, to hundreds if not thousands of markers, not even anticipating the kinds of things that we think about today. We just didn't even think it was possible.
But after 30 years, my love affair with chromosomes is still going on, and I'm still as passionate about chromosome evolution as I was when we began 30 years ago. So what is comparative genomics? What is this new field? It is a field of biological research in which genomic features of different organisms are compared. The genomic features may include DNA sequences, genes, gene order, regulatory sequences, and other genomic structural landmarks such as three-dimensional structure. I think you can find this definition on Google. It's there. The field of comparative genomics has absolutely exploded. If you go back, the most recent publication to summarize this was Mulligan in 2004. I happened to do just a little search before I came over to see what has happened since 2014 (I said 2004; I mean 2014), and what we see now is over 4,000 titles and abstracts with the word genomics in them and more than 650 titles and abstracts with the words comparative genomics. So in the last six years, we have had more than a doubling of the scientific output in this very new discipline. And this is a good sign. This field is really expanding, logarithmically or linearly, and bringing in many talented new investigators. And so comparative genomics can be grouped into sort of three broad areas: interspecies, such as the comparison of maps, as we talked about a moment ago, of sequences, and the identification of conserved regulatory elements; intraspecies comparative genomics, or within a species, such as the identification of population-level genetic variation using SNPs and GWAS studies; within individuals, such as the identification of cancer genomes within individual patients; and even whole ecosystems, the identification of organisms in complex ecosystems and how they respond to changes in their environment.
And many of you in the room today have been very important developers of the tools that we use for comparative genomics, which span everything from DNA sequence comparison tools to tools for annotating genes and for functional annotation of genes, evolutionary and population genomics tools, tools for doing epigenetic and epigenomic analysis, tools for understanding microbe and host-microbe interactions, as well as the data visualization and analysis tools, which have become so critical to our field as the data have been rapidly expanding. Genomics and comparative genomics have contributed tremendously to important applications as well. This is not just a fundamental scientific discipline but a discipline that has yielded important practical results in agriculture, in medicine, in environmental sciences, and in conservation. And I won't go through all of those things. This is well known to most of you: genetic diagnosis, identification of genes responsible for diseases and complex traits, and this spans across really all of the economically important eukaryotic taxa, species of plants and animals in particular. But what we should realize is that all of these advances that we've had in the past 20 years have been accomplished with knowledge of the genomic biology of very, very few organisms. If we look across the entire universe of known eukaryotic organisms, about one and a half million species, and you go to GenBank, you will find genome sequences for only about 4,000 unique species now, and about 3,300 in 2018. And what is absolutely shocking is what you see if you parse these into taxonomic classification. Yes, we have a representative from each of the three domains of life. Yes, we have a representative species sequence from each of the five, if you will, major kingdoms of life.
But once you get past the kingdoms, even at these higher taxonomic levels, such as phylum, it is 37 of the 71 phyla, and we don't even know how many phyla there are. But 37 of the 71, only 52 percent, have been sampled with a representative organism at the phylum level. And it gets worse after that. At the class level, 41.5 percent. If you look in the chordates, Chordata is a phylum, and you take our own lineage, that would be equivalent to the mammals: only 41 percent sampled. Then 31 percent of the orders, 10 percent of the families, 0.89 percent of the genera, and 0.22 percent of the species. This is rather humbling, I think, for most people, and when we think about the future, we must think about what we're going to be able to learn if we sample and understand the blueprints of life from the remaining 99.8 percent of the known eukaryotes. And remember, what we know is perhaps only 10 percent of what's there. So you can multiply that by 0.1. And that's probably the representation that we have for known life on the planet. We are just at the beginning of this journey.
And then you have 3,311 sequences. Well, how good are those sequences? You know, these are the genome sequences. That's the next question you're going to ask. And these data were summarized together by myself, John Coddington, and Mike Trizna at the Smithsonian Institution. And I'll just explain a little bit of what this diagram shows, if I can figure out this little laser pointer here, yeah. Okay, so this bottom number here corresponds to the level of assembly: these are contig assemblies, the twos are all scaffold assemblies, the threes are all chromosome assemblies, and the fours are complete genomes. The next number is the log of the contig N50, okay. So a value of 6 here means a contig N50 of a million base pairs. The next number is the log of the scaffold N50. So a log there of 7 would be 10 million base pairs.
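The three-part quality code described above can be sketched in a few lines. This is only an illustration under stated assumptions: the function names `n50`, `floor_log10`, and `quality_code` are hypothetical, the N50 is computed with its standard definition, and taking the floor of the base-10 logarithm (rather than rounding) is a guess at how the slide bins its values.

```python
def n50(lengths):
    """Standard N50: the length L such that sequences of length >= L
    together cover at least half of the total assembly length."""
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("lengths must be non-empty")


def floor_log10(value):
    """Floor of log10 for a positive integer, computed from the digit
    count to avoid floating-point rounding at exact powers of ten."""
    return len(str(int(value))) - 1


def quality_code(level, contig_lengths, scaffold_lengths):
    """Hypothetical (level, log contig N50, log scaffold N50) triple.
    Level follows the talk: 1 contig, 2 scaffold, 3 chromosome, 4 complete.
    Flooring the log is an assumption, not something the talk states."""
    return (
        level,
        floor_log10(n50(contig_lengths)),
        floor_log10(n50(scaffold_lengths)),
    )


# Made-up lengths for a chromosome-level assembly (illustrative only).
contigs = [1_500_000, 1_000_000, 400_000, 100_000]
scaffolds = [20_000_000, 12_000_000, 3_000_000]
code = quality_code(3, contigs, scaffolds)
```

With these made-up lengths the contig N50 is 1.5 million bases and the scaffold N50 is 20 million, so `code` is `(3, 6, 7)`: chromosome-level, contig N50 of at least a million, scaffold N50 of at least 10 million.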
So if you look here at where the majority of sequences are across all of life, the first thing you'll notice is that about half of what's been sequenced are fungi, okay. And then animals of all types, not just vertebrates, and then the plants and protists, and I love the "other" category, which is really the biggest category of eukaryotic life, and there are 30 of them there. And then we look at the distribution here of this genome quality metric, if you would like to call it that. And you can see the vast majority of sequences bunch up here in this group in the middle, which have contig N50s somewhere between 10,000 bases and 100,000 bases and scaffold N50s between 100,000 and a million. And if you go out here to the right, which are the really good high-quality genomes, you'll see most of those are fungi again, and very few of them are plants or animals. And I want to point this out because Erich Jarvis is going to talk about this, I'm sure. The metric for the Vertebrate Genomes Project, the newest and best genomes that we're able to produce, is right here. This is the metric: chromosome assignments, contig N50s of a million, scaffold N50s of 10 million or greater. And the contribution, just in the last year, of 14 genomes of this quality is quite a contribution really to our overall knowledge and collection of very high-quality reference genomes, which is what you would consider these to be.
So as we move through the next two days, we need to ask ourselves the following questions. First of all, what should we do when we are thinking about producing reference genomes? Do we produce reference genomes to ask a specific biological question? Or do we produce reference genomes to ask those questions and, in addition, questions that we have not even thought of yet? And I'll argue, and I'm sure that Erich is going to argue, that we should do the latter. Do the best we can now, do it once, and you don't have to do it again. Okay.
And so what enables comparative genomics? We need more genomes, you know? We don't have very many to compare. We need more genomes, and what enables comparative genomics is high-quality genomes to compare and the tools to compare them. And so I'm going to spend a few minutes on this, because this is really the thinking that led to the Earth BioGenome Project. And it's clear that with more genomes, better genomes, you're going to be able to address a broad set of scientific, health, environmental, and conservation issues. The idea for this project arose about four years ago, and the first time it was discussed was at a meeting at the Smithsonian Institution. Several of you were there; I know Paula Mabee, who was at NSF at the time, was there representing NSF. And Lakshmi, I think, was there as well for USDA. And we had a very intense discussion. We presented this idea for the first time, and everybody felt in 2015 that it was a good idea to explore it further. Two years later, the project was framed in this way, as a grand challenge. The Earth BioGenome Project is a confederated network of networks that has a common goal of sequencing and annotating the genomes of all 1.5 million known species in a period of 10 years. Not very ambitious, but this is what we have proposed. And we outlined this in a paper published in April of 2018 in PNAS. Pam Soltis is here, Erich; several of the authors of this article are in the room. And this perspective article laid out the goals, the challenges, and the potential benefits of this project. And since that time, the response from the scientific community has been overwhelmingly positive.
Very much unlike what happened with the human genome at the beginning, we only get "you're nuts" or something like that, or "it's overly ambitious," but we rarely get criticized by people saying that the value of a project like this to science and to the world would not be worth the cost. And so a little bit about where we are, because if we need genomes, we have to be organized to produce them. The project officially launched last year at the Wellcome Trust in London. I think unless you've been hiding under a rock, you've heard about this. There was extremely wide media coverage, and it was endorsed in Nature and even in The Times of London. The project really got a boost in moving forward when the Wellcome Trust Sanger Institute, under the leadership of Mike Stratton, committed 50 million pounds, or approximately $65 million, to producing genomes at scale over a five-year period and to organizing this project. They've already hired their director, Mark Blaxter, a very prominent evolutionary genomicist from the University of Edinburgh, and they're seeking another $100 million with the goal of sequencing all 66,000 species that have been found in the United Kingdom. That's the stated goal of the Darwin Tree of Life project: an organized national project to sequence the entire biodiversity of the United Kingdom. And they're about to announce further commitments to this project. There are also ongoing national projects in China; in Chile, which has its 1000 Genomes Project; in India; in Korea; in Australia, where there are Oz Mammals and plant genomics; and in Canada, where there's CanSeq150. This is starting to get going, and it's becoming a global movement. Something that I think has been mildly disappointing to many of us in the United States, and it's in large part due to the way science is funded in the United States, and I'm going to talk about this at the end, is that there is no comprehensive national project as there now is in several other countries.
Just two weeks ago it was announced in California that we have a $10 million program called the California Conservation Genomics Program, which will contribute at least 100 genomes to the Earth BioGenome Project. This is headed by Brad Shaffer at UCLA, at the La Kretz Center there. And the USDA has a very, very nice project for sequencing 100 ag pests. But in terms of large-scale sequencing of eukaryotes in a coordinated fashion, there's not much, and we aim to address that. The project has 25 participating institutions in 13 countries that have signed the Memorandum of Understanding, which commits these institutions to the principles of the Earth BioGenome Project as outlined in the memorandum: open data access and sharing, and commitment to the principles of the Convention on Biological Diversity and the Nagoya Protocol for access and benefit sharing. There are over 20 communities that have now voluntarily affiliated with the Earth BioGenome Project, many of them represented here today. And this has been really important to us because some of these communities, as you'll hear from Erich's talk, have been really leading the way in producing high-quality reference genomes. That's the Vertebrate Genomes Project, which I've been a part of since 2009, when the project was founded.
And how would we go about doing this? You can't do it all at once. Again, I mentioned this was over a 10-year period. The first strategy we call the phylogenomic wave. The proposal is to get the framework, the backbone, the outline of the Tree of Life by first sequencing one representative of each of the 9,300 known eukaryotic families to reference quality in the first three years. In phase two, it would be approximately 200,000 genera. In phase three, it would be all 1.5 million known species.
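As a back-of-the-envelope check on what phase one implies, the target of 9,300 families in three years can be converted into a weekly sequencing rate. This is a minimal sketch: the function name `genomes_per_week` is hypothetical, a 52-week year is assumed, and the choice of ten laboratories is the illustrative figure the talk itself goes on to use.

```python
WEEKS_PER_YEAR = 52  # assumption: no downtime, 52 working weeks a year


def genomes_per_week(n_genomes, years):
    """Average weekly throughput needed to finish n_genomes in the given number of years."""
    return n_genomes / (years * WEEKS_PER_YEAR)


# Phase one as stated in the talk: one genome per known eukaryotic family.
phase_one_rate = genomes_per_week(9_300, 3)

# Split evenly across ten laboratories (illustrative figure).
per_lab_rate = phase_one_rate / 10

print(f"Phase one: about {phase_one_rate:.0f} genomes per week worldwide")
print(f"With 10 labs: about {per_lab_rate:.0f} genomes per week per lab")
```

The later phases can be explored the same way by changing the two arguments, although the talk does not state how many years each of those phases is allotted.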
And if you want to have an idea of what that really scales to, and how achievable and doable phase one is today with the technology that we have, that would be 60 genomes a week worldwide over a three-year period. If there were 10 laboratories around the world, that would be six genomes a week each. And Erich tells me that this is almost exactly what he's able to produce right now. And so this is clearly scalable. Phase one is achievable. In the United States alone, we have somewhere between 2,000 and 3,000 of the families represented. We could do a third of all the families just with species here in the United States. Very achievable. Phase two and phase three require more genomes: 865 a week in phase two, and eventually, in phase three in the out years, probably using sequencing technology in years seven through ten that we don't even have or maybe can't imagine today, 9,000 genomes a week, which, with instruments like ONT and the new PacBio technology, may not be that far-fetched if we're talking about high-quality draft sequences and not reference genome quality for every species. We're talking about reference genome quality just at the family level. And that's the wave. Did I not have the right slide up? I don't know. Okay.
And the second strategy, a complementary strategy to be done in parallel, is what we call the Google Life Strategy, or sequencing of entire ecosystems, which is essentially sequencing all organisms in a particular geographical area. For example, within biodiversity hotspots, everything in the soil, land, and water. This is already being done to some extent with some of the eDNA technologies, but anybody who's using eDNA technology knows that if you go out and sequence, at least half of the reads don't map to anything. And so until you have the organisms identified, you really are shooting in the dark with these kinds of studies in trying to understand complex ecosystems.
And so technologies can easily be adopted, once you have the sequences, for assaying entire ecosystems at very low cost over periods of time, to understand the impacts of environmental change on biodiversity and eventually to produce a multi-dimensional, dynamic view of all life on Earth. And this strategy is being employed by the Darwin Tree of Life program, and it is part of what we're planning to do in the California Conservation Genomics Program.
So what's all this going to cost? I swallowed hard there, but it's really not too bad. It's $4.7 billion. And these numbers were rigorously tested. This was two years ago: $4.7 billion over 10 years. And, well, how does that compare to the Human Genome Project? If you look at what it cost in about 1990 dollars, when the Human Genome Project was initiated, that's about $3 billion. In 2012 dollars, that is $5.4 billion. Today, this is probably close to $6 billion. And we had pretty firm estimates less than two years ago, which I think have been very well tested, that all million and a half known eukaryotes can be sequenced at a cost of under $5 billion. Now, the initial calculation I did on a flight back in 2014 was about 2.7 when we added all the costs in. It turned out to be right. And at that cost, we're going to get better genomes, better genomes than we initially anticipated, because the cost of long reads is going down. And the return on the investment in the Human Genome Project to the United States economy alone, as outlined in the Battelle report, was a trillion dollars. And if that's what was returned from one species, what's going to be the return on investment for another million and a half species? It's likely to be many times more. But it's not, you know, just about the money. It's about our understanding of the rules of life, as was just talked about. And so this is where we are. $4.7 billion, it is a lot. I don't know. The Large Hadron Collider has a budget of $1 billion a year.
It cost $13.2 billion to identify the Higgs boson. And I would argue that the value of the knowledge that we're going to gain from a million and a half species, to the average person on this planet, is going to be far greater than what we have from some of these very large science projects, which yield very fundamental knowledge but very little applied knowledge. And so this type of project is, no doubt, going to happen. It may happen sooner. It may happen later. But it's not going to happen if we don't have a systematic way of doing it. And what happens when we come together across agencies and across projects is that we have the ability to completely revolutionize the life sciences. Right now we have the loss of 50% of the vertebrate population in the world in the last 40 years, and 27,000 species that are on the threatened and endangered list, no matter what they do to the legislation for the Endangered Species Act. It's very important that we move on this, because by the end of the century it's predicted that we're going to lose half the biodiversity on Earth if we don't do something about it. And not having a repository, a digital library of life, and not making sure that we have a representation of every species that we have on this planet at this point in time, the most biodiverse period in the history of this planet, would be a huge mistake to pass on to our descendants. We need to do this. But out there lie new resources for the improvement of plants and animals in agriculture. New sources of genetic variation that have yet to be exploited. New drugs to be discovered. 80% of the pharmaceuticals that we consume today in the world have their origin in a natural product, 80%. And finally, the value of basic science. Most of you are familiar with the Linnaean taxonomic view of a hierarchical tree of life, but this is our current view of the eukaryotic tree of life. And for those of you who want to be humbled, you are here, right there.
That's humans in this vast new way of looking at eukaryotic life, in these five supergroups, in which we're more closely related to fungi than we are to many of the other eukaryotic life forms that have evolved over the last two billion years. Two billion years of life. And the value here is going to be enormous. Let me see.
So in the last few minutes, I just want to give you a few examples from my own research on how chromosome-scale assemblies can assist in understanding very deep and complex problems in evolution. And the area that I've been interested in is chromosome evolution. And this field was really dead, which is why I went into administration for five years, because there just weren't enough genomes for us to analyze. Chromosome evolution is the process by which chromosomes change in the order and orientation of genes, regulatory elements, and three-dimensional structure over time. This is a paper that we published in 2019 that schematizes the main discoveries. The first thing you need are these chromosome-scale assemblies, so that you can define the lineage-specific breakpoints in chromosome evolution over time. We discovered that the evolutionary breakpoint regions that flank the boundaries of the rearrangements are enriched in lineage-specific transposable elements. Lineage-specific transposable elements contain more transcription factor binding sites, and these transcription factor binding sites, if there are more of them, lead to higher affinity for transcription factors, all shown here. And those lead to stronger enhancers, and these stronger enhancers can lead to changes in gene expression. And some of these changes in gene expression are no doubt related to adaptation and speciation, as has been known for some time. And so what we really need, the framework for doing all of this kind of work, which was published in Genome Research, is a really good understanding of the comparative organization of chromosomes across evolutionary taxa.
So, back in 2015, we first set out to do a much more extensive computational reconstruction of the ancestral genome of all placental mammals. The most recent one before that was in 2005, which we published with Mr. O'Brien there and his then postdoc Bill Murphy, and we had a handful of genomes. We only had three sequences, and we figured out how to integrate maps and sequences to do a reconstruction. And we reconstructed the boreoeutherian ancestor, but with only about 50% coverage relative to the human genome. With the genomes that we now have available, which I'll talk about in a moment, our goal was to define all of the evolutionary breakpoint regions and conserved blocks along the eutherian lineage leading to primates. And this is an area that we now call paleogenomics. And again, in setting the stage, this is an area I think that Beth Shapiro will cover tomorrow. Now, this study was done using 19 eutherian species representing 10 orders. We didn't even have all four of the major superorders represented. We had three of the four. And for the first time, we learned how to use scaffold-based assemblies. There were seven of those, and two outgroup species. And the DESCHRAMBLER algorithm, which was developed by Jaebum Kim and described in this paper, was applied to this dataset. And we reconstructed seven ancestors, including the eutherian ancestor, the ancestor of all eutherian mammals, and then six other ancestral genomes leading to the human genome. And the kind of information that you can get from a reconstruction like this is that you can define all of the fissions and the fusions and inversions and complex rearrangements along the 105-million-year evolutionary time span between the eutherian ancestor and the human and other eutherian mammals.
And as you can see from this table, we could define eight fissions, 12 fusions, 59 inversions, and 38 complex rearrangements, for a total of 117 rearrangements over a period of 105 million years. That's about one rearrangement every million years. You think about the number of meioses. So large-scale rearrangements are quite common, but stable rearrangements are very rare. And we're very interested in the rates of rearrangements, and as you can see, the rates differ. This confirmed our 2005 study: early on, before the K-Pg boundary, the rates of chromosome evolution were very low. And then after the Cretaceous-Paleogene boundary, after the asteroid impact, it seemed that chromosome evolution sped up. We showed this in the murid lineages and confirmed it in the primate lineages in this study.
And so why is this important? Why would this be important to NHGRI? The first example of a genetic mutation associated with a human cancer was a chromosome rearrangement. In 1959, the Philadelphia chromosome was discovered. Janet Rowley, in the mid-70s, was the first to demonstrate that the etiology of these small chromosome 22s was actually a reciprocal translocation between chromosome 9 and chromosome 22. Later, it was shown that this translocation produced a fusion of the ABL gene and the BCR gene, producing a tyrosine kinase signaling molecule that was constitutively expressed in all cells that had this particular rearrangement. And that tyrosine kinase could be shown to directly transform cells. This rearrangement is found in 95% of all patients with CML, chronic myelogenous leukemia. And what we showed in 2005 was that this breakpoint on chromosome 22 was, in fact, an evolutionary breakpoint region, a region that we could define as a breakpoint as far back as the Euarchontoglires ancestor. And so this shows the value of good annotation and high-quality genome sequences in being able to unravel a very important problem. And that's not the end of the story.
The beautiful part of this story is that, as a result of this knowledge of what the fusion protein was, a drug was developed, Gleevec. So this is a complete story from rearrangement to drug to treatment, and a very effective treatment, indeed, for CML and other cancers of the hematolymphoid system. And, yeah, she's gonna... Yes. And so I'm going to just wrap up in the next few minutes to say, we've now expanded these studies. The 200 Mammals project has been incredibly important to us. We're collaborating with Elinor here and Kerstin, Bruce, the folks at the Broad Institute. And in our chromosome evolution collaborative, we are getting a representative species from each of the 19 eutherian orders, plus representatives of the marsupials, the metatherians, and the platypus, the prototherians. And we're doing a complete reconstruction, the first reconstruction of a mammal genome. And when I saw this just last week, unpublished data, the only unpublished data I'm showing you, I practically cried, because I've been waiting for 20 years to see something like this. This is an Evolution Highway image showing an overlay of 30 different chromosome-scale assemblies on the reconstructed mammalian ancestral chromosome 8a. And to me, this is one of the most beautiful things. I don't have time, I'd get thrown off, to explain all this to you. But in this beautiful mosaic of genome evolution, you can see so many interesting and important features. Features that have been conserved when you compare to the outgroups, the chicken, and, you know, the most recent common ancestor of the eutherians, of all the mammals, 312 million years ago: syntenic relationships that have been conserved over vast evolutionary time scales. And in certain lineages, rearrangements are speeding up, and in other lineages, these arrangements are stable over hundreds of millions of years of evolutionary time.
So I just want to wrap it up and say, you've given us our charge, Jen: identify the areas of synergy across all taxa, identify gaps in knowledge and resources, and recommend areas that we should focus on. I've made a list of things that I think would be cool to talk about over the next few days, especially the areas of synergy across all taxa. I think these things are important no matter what taxa you work on: high-quality reference genomes; functional annotation, things that Huaijun is doing with FAANG, functional annotation across taxa. How nutrition affects, yes. It is time to get off. Host defense, wellness and disease, all of these things are across taxa. And gaps in our knowledge: how 3D genome structure affects genome evolution and function. I'm working with Erez Lieberman, who's in the crowd, on that problem. There are many different problems that are important no matter what group of organisms you're working on. And, oops, I went the wrong way. And finally, we'll leave it for the last few days to work out strategies for NIH. But I just want to say this last thing: what we really need from NHGRI and from NSF and from the representatives here is strong interagency support and coordination of funding for a national large-scale genomics initiative. We are being left behind by our colleagues around the world and many of our scientific competitors. And it's because of this mission. NIH will sequence anything that kills you. USDA will sequence anything you can eat. DOE will sequence anything that you can create energy from. But there are lots of organisms in between. And, you know, NSF really does have this broad mandate. And so we looked at NSF, really. But NSF is a hypothesis-driven type of research organization. And what we're talking about with the Earth BioGenome Project is an infrastructure project. An infrastructure for the future of biology.
And until we start thinking about it as an infrastructure project, it's going to be very difficult to get this kind of project funded. So that's my plea. We are going to the moon, and we're not going to stop halfway. People say, well, why do you need to do all of them? Maybe evolution happens at the species level, not at the genus level or the family level. You need to go all the way. And we can talk about that, not halfway. And we're going to do it. It's not going to be easy. We don't have all the answers. But we're well on our way. And so now is the time to really think about this. And to go forth and sequence, and to annotate as well. And to do all the things that we're here over the next three days to discuss. Thank you very much.
Really hard person to kick off. So.