So I was asked to give a summary of the meeting. And I thought about giving a summary of the meeting. And I said, yeah, I could do that. And a summary of the meeting is really useful if the talks are really bad and nobody understands them. And then you have an opportunity to help out. But the talks have actually been really good. And I anticipated that they might be good. And as to content, many of them are in areas that I don't actually do anything in. And so I decided up front that I would try an experiment on myself. And this experiment on myself was that I am obliged, not as often as Francis or the head of NHGRI is, but often enough, to give talks about what the genome has done for us. And I have one which is sort of updated from time to time, which is called this, the fruits of the genome for society. And in it, because it's really intended for relatively general audiences, I try to touch on all of the issues. And I wondered how that would work out if I just annotated what I normally do with what we heard today. And I think you'll see, I did make some changes this morning when I realized how this was gonna go. And we'll try this out as an experiment. It'll be more fun for me and maybe more fun for you as well.
So of course, this whole business began in prehistory. I don't believe there's anyone here who was on the Alberts committee, which originally proposed that we have a genome project, except me. And I was there. I was a sort of proto-opponent because I was afraid that all the money would be sucked out of the system. And actually it's come full circle, because I am now afraid that all the money is gonna be sucked out of the system, as you'll see as we go along, right? Anyway, we were able, being the basic scientists on that committee, on which we were well represented, to convince all hands that, A, it was possible to sequence the genome, even though at the time it was pretty daunting.
Not as daunting as the brain initiative, because there are only three times 10 to the ninth base pairs and there are 10 to the 15th synapses, okay? So plus or minus an order of magnitude here or there. Okay, but we could see our way to doing it on the one hand. And on the other hand, it was absolutely clear that if we tried to do it directly on the human, it would be a major league disaster for many technical reasons, which I'm not gonna rehearse, okay? But those of you who have a few gray hairs will remember, right, okay?
So what we decided in the end was that we should do two things. One, we should learn how to sequence on sequences that were much less difficult. And second, we should try to begin right at the beginning to understand what the nucleotides that we were sequencing were actually doing for the organism. And the two of those things ended up in the idea of doing this series of organisms first, with the idea that we could understand what we were doing. And in fact we were rewarded, as we knew we would be, because there was enough data out there already showing that basic cellular functions of all eukaryotes are carried out by proteins and RNAs that are conserved up and down at least the eukaryotes, and actually further. So the idea arose very early that what we were going to do was finish the dream of molecular biology, to make all of these connections real. That using genetics, which was the study of mutations, and which at that time in the human was not a realistic possibility but in the model organisms was in fact the main order of business, mutations could be related to the function; the function could be related by biochemistry to a protein; proteins could be purified and various things; and by molecular biology, sequencing and analysis, we could connect the proteins to the genes by the genetic code. And that was the idea.
And in fact, most of these associations were made, and likely will continue to be made for the human, by basic scientists working with eukaryotic model systems. Because what happens when you find a SNP that does something interesting in a basic cellular function? It's very interesting. One of the talks, I believe it was from Dana-Farber, Dr. Garraway, had the standard phenotype of cancer biology, which is that the Krebs cycle goes anti-clockwise, whereas for the rest of us it goes clockwise. I don't know when this began, but it illustrates the sort of sociology of science. But the IDH piece, right? How do we know anything about IDH, and how do we know what alpha-ketoglutarate is, and all the rest of that sort of stuff? I can leave that unsaid.
So the intellectual impact of the genomic view was a grand unification of biology, because in fact all these genes do do the same thing, as Jacques Monod said. By the way, Jacob, his partner, died this week, and the New York Times didn't have an obituary, and I can't understand it. So I wrote to them and they said, oh, we didn't know he died, and so I waited for the obituary to come out. I guess they don't read Le Figaro, but they apparently don't read the Washington Post either. So anyway, Jacques has the last laugh, because of course what he said is true. And so the challenge for the future is to understand not just the mechanisms at the individual process level but the interactions among all the processes. And I have to say, in my summary mode, that I was a little bit surprised at how little concern there was, except in the context of GWAS and dark matter, about how we're gonna figure out how these things really do interact. It's one thing to have correlation and anti-correlation, but I'm sorry, it's very far from understanding to say that A goes with B and we'll avoid giving A if B is present. Maybe the NIH should have a higher standard than that.
Okay, and then genomics of course makes it possible to explore this higher level of interacting systems. I mean, it's not just organizing medical records that requires a lot of interaction among many different areas. All the biology itself is that way. I mean, those billions of years of evolution were there for a reason. So long before the genome was sequenced, or any genome was sequenced, it was known that there's a very high degree of similarity among genes in eukaryotes, and this is just to show that in yeast you can just put in the human gene and you can restore function to things where you lost function.
And so I come to what are the fruits of the genome? Now let me tell you how I decided that something was a fruit of the genome. I decided that something was a fruit of the genome if and only if it had substantial penetrance into society, that it was actually in use, that it was not some promise or some theory or some hope. So when we come to pharmacogenomics, which we will, I would have included cytochrome P450, because cytochrome P450 is actually being used in real clinics widely. It has real penetrance. I might not include some of the other Atul Butte things quite yet. Then there's comparative genomics, which is a really major thing. I no longer have to deal with the possibility that yeast is really a prokaryote, which was written in the prominent journal Science by a very prominent Rockefeller scientist, never mind. Okay. Now, of course, new comprehensive technologies, of which there are many, not just the Illumina but also all kinds of other technologies that have become very widely deployed in society, not just medicine but more generally, as you'll see. Uses of DNA sequence variation. Nobody thought about sequence variation at the beginning very much. Now sequence variation has a very deep penetration.
Functional genomics, which I'm gonna talk about a little bit, mainly in the case I know about, which is the subdivision of tumor subtypes, and that has now reached substantial penetration. And of course DNA diagnostics. So these are the deliverables. They are not what was promised. Francis Collins gave a talk very early on, I think when he had just taken over, about how molecular biology was gonna solve all the problems of the world, and it was a great joke and he got a big laugh out of it. I always remember that because we have a very depressing tendency, we as a community. We have a very depressing tendency that when we're being rigorous and we are writing, for example, a grant or something, we're very conservative and straight-laced, but if we're talking to just people or to journalists or to donors, we'll promise them anything. And this stuff has a tendency to come back to bite us when we do that, okay.
So the first deliverable is the quantitative understanding of evolution from sequence. Lest we forget, there was a time when serious people, even what Krugman would call very serious people, okay, actually thought that evolution was a theory in search of evidence, okay. What the genome sequences have done is they have made this absolutely disappear, okay. So there are people who don't wanna believe in evolution, but no amount of evidence will convince them. I'm talking about the serious people. So what you're looking at here was done by Darwin just after he stepped off the Beagle, and this is I think the great intuitive insight. And here's the real thing. Not bad, okay, for 1837. You may not know this, but correlation had not been invented, clustering had not been invented, and in fact, as for the idea that the amount of distance would be the length of the line, as far as I know, Darwin was the first one to use that metric. Okay, here we go, this is it.
And the reason I love to show this is because it looks like Darwin and it's very rigorous, no root to the tree and so forth and so on. It's much easier to see everything in a more standard way, and the important thing to understand about this, oops, the circled part, is that we are a very small part of the biota, okay. And that includes all the organisms I'm gonna talk about, right. The animals and the fungi and the humans. We're all a little tiny part of the business. There's a subset of that insight which is very important to society, for which I think Allan Wilson is probably most responsible, but which really has changed everyone's view of society, and it is that there's no longer any question about where we came from and who's most diverse and how the diversity spread. And that is, again, widely accepted, really for the good of society. There's no question that having a realistic view of this is a good thing, and this is just more of that.
And now I come to something a little bit more technical but actually more important, not more important than the origin of humans in the sense of scientific importance, but more important in thinking about the future. And this is a multiple alignment of genes from very many different organisms, ranging from bacteria to human. And what you see is that the black parts are identity and the white parts are near identity. And how do we know what this gene does? Well, this gene, the parent of all of these sequences, is a bacterial gene called MutS, and it is a protein that repairs DNA. It recognizes mismatches and repairs DNA. And all of these other things are MutS homologs. Okay, so that's very good, and Jonathan Eisen made this alignment quite some years ago, and he was also able to trace this by looking at the sequence in many bacteria.
He was able to make a credible and certainly correct inference that what has happened here is that, for the expectation of the evolutionist that there would be duplication and divergence, there was actually evidence. And if you look at this side, you can see that in some of these bacteria the blue gene has been lost, and in other bacteria the red gene has been lost. And so here you have a living example of the whole idea of duplication and divergence. And especially by the time you get to eukaryotes, you see that these things have become much more complicated. The blue subset and the red subset have each generated subsets, and the interesting thing of course is, and here's the part where I depart from the rest of this meeting most strongly, because I'm going to talk about function, about what these things do for the organism. And it turns out, although they're all MutS homologs, they do different things for the eukaryote. So MSH4 has to do with crossing over. MSH, a MutS 2, MSH5 has to do, I can't see it very well, but they're well marked. One of these is mitochondrial, and one of these does mismatch repair of two or three bases at a time, and so on. They have become specialists. The evolution has been a specialization. David Kingsley showed you evolution in another context, of organ development, but his main evolutionary thing was dispensing with the organ when it wasn't required. That was on the previous slide. This is the history of evolution to produce sub-functionalization, if you like, of an important function, DNA repair.
So how do we extract functional information from the human genome? Well, DNA polymorphism, SNPs and haplotypes, could tell you more about function if and when it's followed up. And in the meeting today, what was not talked about was how you would follow up to know why it is that SNP so-and-so has something to do with Crohn's disease via the T cell.
All that in-between mechanism stuff, all the stuff that will validate the use of a drug other than the crude thing of do people live or do people die, all the mechanism that's required to make everybody comfortable with the use of the drug, we didn't talk about it. And that is a huge lacuna, and that has to be filled in in the future. It's just not possible for us to go on with just crude correlation as our only guide.
Simple Mendelian, somebody said today that it's 5,000 genes that have been found. The best numbers I could find were a bit less than that, but that's okay. Complex, everybody agrees that they are there, but it's a little hazy exactly how it's all gonna get worked out. The Broad Institute has a theory, other institutes have other theories. It's still not clear how that's all gonna play out. And then there's pharmacogenomics, which is just starting. I have to say, I believe in all of these things, but in terms of actually having been delivered, the polymorphisms are here, the simple Mendelian are here, the complex, there's a little bit of action, and pharmacogenomics, you heard, I think, an excellent summary. There are five or so things that are in general use and, what, 30 that you could imagine coming into use within the next few years. So that's a deliverable, that's real.
Comparative genomics, I've already talked a lot about this and I'm not going to belabor it anymore. Patterns of gene expression, I'm gonna talk about that in a minute, because you didn't hear a lot about that, I don't know why, but there are deliverable things coming from that that are very important. I'll tell you a little bit about that. And then finally, the whole systems business, which I'm not gonna tell you about, but in fact, it's the genome that started a whole field of biology which seeks to understand how genes and proteins talk to each other. And that has got to be the way of the future.
You cannot understand a locomotive, a diesel locomotive, on the basis of knowing in excruciating detail, by x-ray crystallography, at angstrom resolution, the structure of one wheel. It's just not gonna be enough. And in fact, knowing the structures of all the parts won't get you there. As David Kingsley said very nicely, from DNA sequence you can't figure out what the organism looks like. Okay.
So many years ago, there was the whole business of introducing the use of DNA sequence variation. My own history with this is that I suggested you could follow genes this way, and you can, and many, many genes have been found by linkage to adventitious polymorphisms that apparently don't do anything in the genome. These are the same kinds of polymorphisms that GWAS looks at. But with the discovery of the genes, because by genetics you can discover things that you know nothing about, if you actually follow them up, you learn a great deal. And here are a few deliverables. Huntington's disease opened the door to a huge class, as it turns out, of trinucleotide repeat amplification diseases, for which the brain seems to have a special affinity. ALS, same issue, same kind of thing. Nobody actually thought seriously that oxidative damage in the brain was gonna be a really serious issue, but that's what that's all about. BRCA1, that was less surprising. Retinoblastoma was really the first time that the Knudson idea had real legs.
And then I come to this. Now, I love these shows because they're back, every plot is the same. So you cannot watch it and have lots of background, okay? And the plot is always the same. Nothing good happens, interviews and so forth. Then somebody finds an epithelial cell somewhere, in a lab of unparalleled beauty and speed of action. I mean, their mass spectrometers work at light speed. Anyway, it all turns out to be DNA evidence.
I have to tell you that after the Alberts committee, somewhere in the mid-90s, I got a call from Bruce, and we had a serious discussion about whether we could overcome some of the popular opposition to the genome project by pointing out that if we really work on the genome project, we can actually distinguish every individual from every other individual and absolutely nail criminals without doubt. And at that time, I was very dubious as to whether the general public wanted to be nailed without doubt, and didn't go for this. And Bruce, in the end, I think, didn't go down this road. But the fact is that maybe the single most important thing that we have contributed to society as a whole is this ability to identify individuals by their remains, especially in war. Think about catching Osama bin Laden. How do we know it was Osama bin Laden? All right? Think about that, okay. All right. And of course, everybody in this audience knows how this works. Now, I can't, I'm trying to go up, okay.
So here is another thing, which is the openness of communication, and especially the ability, with PCR and genomic technologies, to make progress in biology very rapidly with very few missteps. So this is a figure from a paper published in 93 by a Finnish group. And what you're looking at is a bunch of tumors and normals for individual DNA markers. And these are just random DNA markers. And the thing to see is that the tumors have many bands and the normals have just two little families of bands. This is a single locus and this has more alleles. And these are tandem repeats of two or three nucleotides. And so these are dinucleotide repeat polymorphisms, okay. And these guys suggested that maybe the tumors had acquired a loss of function in DNA repair.
And these guys who work with yeast read that paper, and they said, boy, if that's true, there ought to be such a thing in yeast. And they set up a simple yeast screen to find such genes, and they did, and they sequenced the gene, and they discovered that they are homologs of MutS. Remember MutS, right? Okay, and they are the nuclear homologs of MutS. And that was six months later, and three months after that, okay, this paper came out, and in fact it turns out that the HNPCC gene is in fact MSH2, the mutations are loss of function, and 90% of all familial HNPCC have mutations in one or two of these homologs, okay.
Finally, I wanna talk just very briefly about gene expression, and the idea here is well known to you. You can distinguish human tissues, one from the other, by just looking at the intensity of the expression of the 6,000 most variable genes, and then you can also do this with tumors, and you learn a bunch of stuff. The most important thing that you learned at the time was that the then still viable but doddering theory of de-differentiation by tumors was pretty much gone, because all of these things are different tumors, and they all are more similar to each other than they are to any other kind of tumor. We're not going back anywhere to some primordium. And also we could see statistical substructure of tumor types, and that was reproducible. I should say we also learned a lot of bioinformatics along the way, about when support vectors is maybe not the best way to go and subdivision into subtypes is more robust. And also we learned that the prognosis of women with different subtypes of breast cancer is different, okay, regardless of what criterion you use for progression, or which method you use to do the typing, and also what country the women are from. None of these things make any difference.
Now, of course, you can do a new kind of experiment, and we are not doing anywhere near enough of this kind of experiment, and the experiment is very simple in concept. If we think there are four kinds of breast cancer, then women who inherit a gene predisposing to one of these should only have one of the patterns. And so you can do that just by looking for who among the women that we tested had BRCA1, and it turns out that only the red subtype had BRCA1. This is what is now called the triple negative type.
Okay, now there are lots and lots of cancer genes, and I wanna end with the following thing, which no one has mentioned until now, which is that when you want to do a new treatment for cancer, you have to be careful about what population you test it on. So in the particular case of Genentech, when they were, or we were, actually organizing trials for Herceptin, one of the things that we said right up front was that we would only try this drug on women who had amplification of the cognate receptor. That we would not do all comers. And what I'm showing you is the actual clinical trial that the FDA eventually used to approve the drug for very advanced cases of metastatic breast cancer, because that's the way these things work. And this is what would have happened if we had taken in all comers. Herceptin would not have happened. Okay, I submit that even today very many drugs fail because the patients have not been sufficiently distinguished by their genotype and by their expression phenotype, both of which are completely feasible today, and I don't understand why it isn't being done. Okay, I do understand some of it, unfortunately, but not much. And the important thing to understand is that these women, the magenta women, are the women who have untreated HER2-positive breast cancer. We did the trial in 2003; half of the women who got it in 2003, which is now 10 years ago, are still alive.
If they were treated not after they had a zillion metastases but were treated right after diagnosis and some minimal amount of chemotherapy. Okay, so we can in fact do a much better job by fractionating the patients. The extent to which these drugs work is really impressive. This is from the FDA website for Gleevec, and this is for HER2. You can see how flat, even back then, the blue curve is getting. Okay, so clinical applications. You've heard all about this. I'm gonna let you read it like a previous talk, and then I come to the issues for the future.
So, personal genome as a predictor of health. You've heard a wonderful talk, I think, about what the issues there are. We really don't understand a lot of what goes on, and we need to know much more about this, and there have to be more experiments. Now what kind of experiments do we need? Well, the kind of experiments we need, I submit, are the ones where we see the phenomenon in an organism which is more tractable than the human, and hopefully than the mouse. So the stickleback, because it has no prior prejudice, is attractive, because you can figure this kind of stuff out, because David and his guys have figured out what is really a new model system which is very high level in the evolutionary tree. Zebrafish would be fine. Drosophila maybe for some things would be fine. The second issue is how to reconcile interpretation of DNA sequence by doctors and patients, which has been brought up before, but in particular we don't teach these guys any math. I'll come back to that, right? And then of course the other issue is the actionability. It's one thing to tell somebody that you have a HER2 amplification, which is bad. The bad news is it's the worst kind of breast cancer you can have, except that we have a drug which makes it close to the best kind of cancer you can have, for half of you. That's a deliverable, right? Okay, Huntington's disease, you're gonna live to who knows what age and then you're gonna go nuts. That is not a deliverable, okay?
Nobody wants to know that, and we shouldn't push it on anyone.
Okay, biology and medicine are being transformed into information science. You heard this many times. Nancy Cox said it, I think, most clearly, but everybody said it to some extent. And everybody has a tendency to look at this computational stuff as sort of handmaidens to the doctors. We'll write a website and it will make a heat map and we'll tell the doctor what to do. With all due respect, that's not where I think we should go, okay? I think where we should go is we should understand, just as Flexner said in 1917 that it would be good if all doctors knew some biochemistry, that the time has come when it would be good if doctors knew a little bit more than elementary calculus, which they learned in high school, which is pretty much the current standard, okay? And this is true, by the way, also of molecular biologists. We are having a lot of resistance even at Princeton, not so much in our own program but in getting other folks to actually learn just to program a computer. Which is after all not rocket science.
And the great majority of human genes are not well understood. And I think that although people alluded to this, I would like to reinforce this much more strongly. Somebody said, I think it was Garraway said, oh yeah, this is SWI/SNF. Now SWI and SNF are both yeast genes, okay? In fact, for most of the genes, I can trace the origin. The human geneticists do their best to camouflage the origins, but sometimes they don't do it very well, okay? But at the end of the day you see, you know, Wnt is wingless, Drosophila. Now if you find a gene and you're honest, what you'll do is you'll look in the databases, you'll see that you're looking at wingless, and you will run and not walk to the nearest Drosophila geneticist and say, can I put my allele into your system and see what happens to the wings, okay?
Because that's the quickest and fastest way to find out whether, you know, we're talking about a gene interaction with some other thing. And on the other hand, right now, just as I said at the beginning, my concern is, oh boy, oh well, you're ready to ready. I can't do it, okay. My biggest concern now is that in everybody's enthusiasm to translate what we don't yet know to the bedside, we will stop learning what we should know. Now I am not saying that translational research is a bad idea, on the contrary. I think I've done a lot more than my share of translational research. But I do think that basic research is the only proven way known to us today of knowing what a gene does that is both practical and ethical, okay? And so it's really important for all of you clinically oriented guys. I appreciate what NHGRI has done to transform itself into something that looks more relevant, and I agree with it. But you are the only support for bioinformatics that means anything at the moment. And if NIGMS, for some reason, gets really hard hit, then there'll be nobody to run to to see if the flies don't have wings. And with that, I'll thank you for the opportunity to hold forth one more time at NIH. And have a good evening.
Those of you who can remember as far back as this morning will remember that Eric talked a lot about what was different between 2003 and today. Our next speaker is gonna tell you about what's not different, and that is that we are still not in the post-genomic era. And he will tell you why. Francis?