As difficult as it was to sequence the genome, to find all of the information in it, this next step of figuring out what all that information does is a much more complex problem. We're just beginning to understand how this works. Our genome is very, very large, and we're trying to figure out which parts of it make a difference in our lives, in our susceptibility to disease, and maybe even in our behaviors in some ways.
The fundamental question we're trying to answer is: what does all the genome do? What are all the different functional elements? The whole purpose of the ENCODE project is to try to figure out which base pairs in the human genome are important. And that's a big question, because there are a lot of base pairs, and the important ones are not all in one little package; they're spread throughout. The human genome, when it was finished, was the letters. What ENCODE is saying is, aha, I think this is a word, this is a word, this is a word, this is a word.
If we could understand that, then we could start to figure out who's really likely to come down with type 2 diabetes, who's likely to get lung cancer, who should be making changes in their lifestyles to try to prevent this, who's going to be most responsive to certain drugs.
There's on the order of 5% of the genome, one twentieth, that you can ascribe function to, but that still leaves a large fraction of the genome, maybe 95%, where, embarrassingly, we don't know what it does. And what ENCODE is able to do is attribute function to a larger fraction of the genome than we were able to see before.
A big part of the project is to try to figure out where all the proteins that interact with DNA are bound in a particular cell at a particular time during a particular process. So you basically freeze the proteins onto the place in the DNA in the nucleus while the cell's still alive, and then analyze it afterwards.
You can get antibodies against those proteins, and we can purify out that subset of the genome that has a special property. We get those molecules, we put them into the sequencers, map them back to the genome, and figure out where these particular chemical processes are happening. This is just amazing. So it really gives you a blueprint: where are you landing, what genes are nearby, what tissues might that be active in. So it really is a starting point.
Very little of our genome is junk. 80% of the genome is engaged in at least one biochemical activity. For a large fraction of the genome, now not 5% but 80%, we can say we know that it does something. This metaphor about junk DNA has become, I think, very entrenched; it's been entrenched publicly and entrenched scientifically. And ENCODE totally challenges that. We just don't have big, blank, boring bits of the genome. All the genome is alive at some level.
What ENCODE does is provide a general catalog of how the genome functions. So what researchers can do is take their disease tissue, which in the case of a diabetes researcher would be a pancreas or a muscle cell from a diabetes patient, and ask what genes are expressed, and then look to see how genes are misregulated in these tissues. And then they can start going back to our catalog and say, okay, what are the ENCODE elements in front of these genes?
We can take a disease like Crohn's disease, which is a pretty bad disease of the gut, and say to the people studying it, well, have you thought about this particular mechanism, or this one? And sometimes they have, and sometimes it's a bit of a surprise, and they say, oh, I'm a bit interested in that, let me go off and have a look. But we do that comprehensively for hundreds of diseases alongside hundreds of different mechanisms, and that's been quite exciting.
So there are 2,000 DNA-binding proteins in the genome. We looked at about 100 of those, 116 of those.
So there's a long way to go yet. There's a lot more of these guys to study. By having that larger store of information, I think that will accelerate the pace of healthcare. The better that we understand it, the more of a chance that we can intervene in the system in a beneficial way.