Dr. Green, can we start with you? Would you give us a brief overview of the vision of this general initiative to obtain the whole genome sequences of more than 10,000 vertebrate species? What's this about?

What it's about is really trying to bring genomics to the work of evolutionary biologists. The fact is that evolutionary studies done by evolutionary biologists, naturalists, or paleontologists have for years mostly focused on morphological features, and in some cases maybe physiological features, of organisms to try to trace their evolutionary relationships. What contemporary genomics has taught us is that the information that has been driving evolutionary processes is all encoded in the genomes of all these living organisms. And as we have learned how to explore genomes by mapping them and, more recently, sequencing them, it has given us great insights about evolution. The idea of this project is basically to start to open up evolution's notebooks by actually sequencing the genomes of many, many species, in an attempt to provide a new way of doing evolutionary studies using genomics.

Dr. Pelza, the National Human Genome Research Institute has been doing model organism research for some time. Could you tell us a little bit about what we've been doing and how this would fit into that?

The National Human Genome Research Institute has been very interested in the sequences of particular organisms, mostly for biomedical research reasons. So the first that we look at are organisms that are used extensively in biomedical research, for example mouse, and there are others. In addition, there are questions that provide more general biological insight, as Eric mentioned, by being able to compare genomes between species. By that comparison, you can find regions of the genome that are conserved, and you can make inferences about their function.

Dr.
Green, this project seems fairly ambitious. This is a large number of species that you're interested in. What makes it possible to do this now? Is there some change in technology or cost structure?

Let's set a context. So far we, as a community, have sequenced something on the order of 30 to 40 vertebrate genomes, if we just focus on vertebrates, which is pretty much what this initiative is all about. And those have been selected because they've had the closest relationship to the kinds of things that the National Institutes of Health in particular would be interested in. But that's just barely opening up the window of opportunity for these types of genomic studies. We can imagine a day where the kinds of questions that you'd really want to ask, to trace the evolutionary origins of all types of biological innovations across the vertebrate phylogenetic tree, would require many more species' genomes to be sequenced. What has occurred in the last five years has been some spectacular advances in DNA sequencing technologies that are rapidly driving down the cost of sequencing a new genome. It turns out that for these studies, and ironically even for the first 30 to 40 genomes of other vertebrates that we've sequenced, what can be very rate limiting is picking the species, getting the DNA, making sure you know something about the exact species that you picked and some details about it, and so forth. And if that was limiting for the first 30 to 40, then when you start thinking about getting the DNA in hand, ready to go, for thousands, it was recognized that we should start now.

And Dr. Pelza, help me set the context a little bit more, because we're not actually talking about sequencing right now. Is this an idea that you guys are kicking around?

When only a few people were doing sequencing, when sequencing was very expensive, it was relatively easy to coordinate.
Now that there are going to be many people who are able to turn out a high-quality genome sequence, coordination has to be accomplished, if only to avoid duplication of effort. Even if sequencing is cheap, it's going to be useful to avoid duplication of effort. In addition, there are all the benefits of understanding data quality, of understanding sample quality so people don't have to fiddle around with the samples too much before they start to sequence, and of coming to agreements about depositing the data so that everybody can work on the data as a community. All those things that were served by having sequencing done previously in just a few centers can also be served by having a well-organized front end, if you like, to a very large sequencing effort.

Let me clarify that NHGRI is not launching this project, and this is not something that we're automatically doing. You're sitting here because you're co-authors of a paper about the idea of doing this and you're part of the scientific community, and this is a discussion about how this might go forward. And you raised a really interesting question: how would this be organized?

How would this be organized? This paper is coming out of a workshop that a set of colleagues in the field organized, and they're not even all genomics people. They also include a zoologist and someone who studies conservation biology or conservation ecology. Mostly not genomics people. They convened a meeting of folks that included genomics geeks like us, some computational biologists, and evolutionary biologists, a whole very diverse, multidisciplinary group, all of whom recognized how cool it would be if, 10 or 20 years from now, you could go to the internet and access the genome sequences of 10,000 to 15,000 vertebrates. This would create a whole new way of doing evolutionary studies. They realized that to get that going you've got to get organized, but you also need an interdisciplinary team to be thinking about it.
This was the first organizational meeting for this initiative, and coming out of it was a plan to move forward: to start to identify the species, collect the DNA, and get it ready for sequencing.

Is there a vision of what that organization might look like?

At one extreme you could imagine this eventually being done as a centralized effort that whips out 10,000 or more genomes. I don't actually think it's going to look like that. What I think is that the actual sequencing is eventually going to be more dispersed. And I think there's a really good reason to do that. First of all, because of the technology changes, people will be able to do that. And I think it brings many more people into the community of those who really think hard about these data, and it will ultimately result in more different kinds of insights coming from the same data. Although initially it might be centralized at the organizational level, maybe even for collecting the DNA, making the repository, trying to get some consistent quality, and so forth, and then maybe dispersed out for the actual sequencing. Or it could even be a mixed model.

How much will something like this cost to do? And what would be the funding mechanisms to support this kind of effort?

If by this effort you mean the actual sequencing, I think it's premature to talk about. I think the organizational steps are actually probably not going to be expensive, because they'll take a person or two and some computational facility to organize everything into a collection that people can look at, and a lot of legwork in coordinating with those individuals who have already done the hard work of collecting these samples and who will hopefully be doing ongoing work and collecting more samples in the routine course of their research.

How long would something like this take?
The whole thing is unclear, in part because we don't precisely know what the curve will look like for DNA sequencing costs and issues like that. We don't totally know where the money is going to come from to completely do it. But in terms of getting out of the gate and making decisions, preliminary decisions have already been made, described in the paper, about what species to start thinking about. Some people already have the DNAs in hand and can now start to contribute them to a central collection. And there are possibilities of people now moving forward and getting additional DNAs in hand to contribute to this. So I think it's already starting. And I think it's realistic to imagine that a good collection of these DNAs could happen in a small number of years.

How many species are we talking about here?

Well, in the paper they're proposing 16,000 total. That's a pretty good representation of the roughly 60,000 known vertebrate species.

What would Darwin have said if he could see the notebooks that you guys are about to create?

I think he would have said, "Told you so." Told you so. He had it right. I mean, Darwin had it right. Genomes, that word had never been invented, and he didn't know about DNA. He knew something was going on. He just didn't know where all that information was being encoded. And he knew it was somewhere. He would look at the G's, A's, T's, and C's. He would see the changes. He would be able to follow it, and he'd say, "I knew it was somewhere. I just couldn't figure out where." And I think that he would be telling us what to sequence next.

All right.