If folks could take their seats, we will go ahead and get started. Well, welcome, everyone. It's great to see all of you, and thanks for coming on this still snow-covered day. As a reminder, this is the second of six seminars in a series commemorating the launch of the Human Genome Project 25 years ago. You may recall that back in December, we had a panel discussion featuring NHGRI perspectives on the formulation and execution of the Human Genome Project. But the next five seminars in this series, which I have listed here, are going to be given by active participants in the Human Genome Project. Now, most of the remaining speakers were, like me, sort of scientific toddlers at the time of the launch of the Human Genome Project. But today is going to be an exception, because today's speaker was already a scientific icon when the Human Genome Project launched. And so it's really with great pleasure that I start today's session by introducing our speaker, Dr. Maynard Olson. Now the risk, I will tell you, of having me do this is that my verbal accolades about him could probably go on longer than his seminar. And while this is incredibly tempting, I can tell you, I won't let that happen. I promise. But I do have a lot to say, so bear with me. Let me start with some biographical details. Maynard received his BS in chemistry from Caltech and his PhD in inorganic chemistry from Stanford, doing his thesis work in the laboratory of Nobel laureate Henry Taube, after which he joined the faculty of Dartmouth College, where, among other things, he taught undergraduate chemistry. But as a chemist, Maynard became increasingly interested in DNA as an information molecule and sought a greater connection with biological research. In particular, he became interested in genetics.
And so he uprooted himself from Dartmouth and pursued additional research training in yeast genetics in Ben Hall's lab at the University of Washington in Seattle. Now his work on yeast tRNA genes, which included studying their position in the yeast genome, in many ways served as a foundation for his long-term interest in genome structure. Well, in 1979, Maynard took an assistant professorship at Washington University in St. Louis. And then, in meteoric fashion, he simply, well, he became famous, is what I would say, for reasons I'm going to describe shortly. In 1992, he actually moved from Washington University in St. Louis to the University of Washington in Seattle, where he subsequently finished his own research career and is now a professor emeritus, a status that, as he likes to say, allows him to do the things he wants to do, such as being here today with all of us. Along the way, Maynard has received many well-deserved honors. He was a Howard Hughes Medical Institute Investigator from 1989 to 1992. He received the Genetics Society of America Medal in 1992, was elected to the National Academy of Sciences in 1994, was awarded the Gairdner Foundation International Award in 2002, and received the Gruber Prize in Genetics in 2007, and that's just naming a few of these honors. And I'm actually very comfortable, as you can tell, describing Maynard's talents and virtues, because I was fortunate enough to be his postdoctoral fellow from 1988 to 1992. So, having known him for the past 28 years, I feel that I can answer the following question: what makes Maynard such a unique researcher, colleague, and mentor? And here is where the list can get really long, but I'm going to limit myself to three observations. First, Maynard is remarkably innovative. He sees the research landscape with greater acuity than anyone else I know and has the tenacity to pursue scientific endeavors regardless of their popularity or their difficulty.
Now, shown here is precisely that brilliant tenacity; and actually I know, because I've seen his slides, that you're going to see this slide again in Maynard's talk. It's an early Maynard figure illustrating a fingerprint mapping paradigm that he developed to construct physical maps of the yeast genome. Shown along the bottom are electrophoretically separated restriction fragments from individual phage clones, each containing a segment of the yeast genome, and Maynard reduced to practice the ability to use such data to construct clone-based physical maps of yeast DNA. In short, those fragments can be used in a very analytical way to deduce order across linear DNA such as the yeast genome. And what's important is that this paradigm laid the groundwork for efforts in the Human Genome Project to build maps of the human genome prior to the eventual sequencing of the human genome. Now, a second of Maynard's attributes that I feel particularly passionate about is that he's nothing short of a spectacular mentor. Shown here are the two of us in the middle of my postdoctoral fellowship, and over several decades I was fortunate enough to be one of many talented postdocs and graduate students who passed through his laboratory; actually, I know some are in this audience. With his very hands-off style, Maynard mentored these trainees by subtly guiding them to fertile areas and then effectively orchestrating their success. He always served as a prototypic role model by demonstrating the highest standards of integrity and collegiality. In addition, he was incredibly generous. I've witnessed this many, many times, including for me: projects started by his trainees, particularly his postdocs, would simply be given to those postdocs to help launch their own independent research careers. But I want to really stress, from these first two things, don't let his professional appearance fool you.
Maynard is actually quite fun. He's a joy to talk to about almost any topic, especially when you get a beer in his hand, as shown here. This photo actually captured Maynard in a particularly festive mood at an earlier NHGRI event, standing with Val Maduro, who actually started in Maynard's lab in the mid-1980s, making yeast media when she was an undergraduate at Washington University. I don't know if she's in the audience now, but Val has been a technician at NHGRI for almost 21 years, including many years in my own laboratory, and is now associated with the Undiagnosed Diseases Program. Well, third and finally, Maynard is the consummate scientific leader. He is a true legend. His vision for research opportunities is insightful and inspirational. Some scientists influence a project; some influence an area. Maynard has been instrumental in transforming a field, and that is the field of genomics. Starting with those crazy restriction maps of yeast DNA, he recognized the value of obtaining detailed knowledge of whole genomes, in the form of maps and, later, sequence. I can't overstate the key role that he played in developing the strategic blueprint for the Human Genome Project and in providing sage advice throughout the entire project and beyond, actually. And I'll also be candid in telling you that his ideas, well, let's just say they're not always popular. Indeed, he is often a lightning rod for debate, advocating positions that go against the grain. However, eliciting some disagreement is actually a sign of great leadership, because more often than not, you know, he's actually turned out to be right and just simply two steps ahead of the rest of us.
Now, meanwhile, he shies away from the spotlight, and he's not necessarily the one who gets featured most for the success of the Human Genome Project, but anyone in the know will tell you about the central role he's played as an international leader for the project and for the field of genomics more broadly. And so it is obviously, as you can tell, with great pleasure that I introduce a personal hero of mine, Dr. Maynard Olson. He's going to talk about "Genomics Grows Up: What Have We Learned During the Past 25 Years?" Maynard. That's a difficult introduction to follow. But fortunately, I have at least a kind of humorous first slide. I'm spending the winter term in Ojai, California, which is in the mountains north of Los Angeles. When I was starting to prepare this talk, literally three days ago, I went out for a walk in my neighborhood and took the picture on the left. That's the Topatopa Mountains at sunset. I came back and checked on conditions in Bethesda. And so this slide is a sign of my loyalty to NHGRI, the NIH, and your programs, because it's still like that in Ojai today, and I'm here. So I found it challenging to decide how to take up this mandate: what have we learned during the past 25 years? That gets us back to the launching of the Human Genome Project, though 40 years is probably closer to my time trajectory in genomics; the term genomics was only coined in the late 1980s, but like everything, it had a prehistory. What I decided to do is to go through a highly selective, very light version of my trajectory through genomics and to intersperse, as I go along, what I think some of the big-picture lessons are. Some of the lessons relate to how technology development works in a complex area like genome analysis; that was always a major focus of mine. Some of them relate to science policy issues, and which policies have worked and which not so well.
The policy thread is the one thing I'm going to try to project into the future, if I get that far, and there'll be an actual scientific lesson or two, that is, things that we've learned about biology through all of this activity. So that's the plan. Eric mentioned that when I was making my transition from chemistry to genetics in Ben Hall's lab, I worked on transfer RNA genes. And the main lesson here, I'll actually say in advance of describing the project a little bit, is that this was sort of a perfect project for me, in that you didn't need to know much biology, which I didn't. And it came with a sort of pre-built-in genomic view, which is captured by this map. This is a yeast genetic map of early-1970s vintage, from when this work was going on. Circled are eight essentially unlinked loci that were known to encode tyrosine-inserting nonsense suppressors; that is, their genetic phenotype is that they suppress nonsense mutations. You can isolate them as amber-, ochre-, or umber-suppressing alleles, and enough biochemical genetics had been done to show that they inserted tyrosine at the positions of nonsense mutations, so we assumed that they were tyrosine tRNA genes. But the genetic mapping here really stuck with me. This had been done over decades, and this genetic map would still look rather good today. There are 17 centromere-linked linkage groups, and you can see the very bottom one has dotted lines connecting the arms to the centromere. That turns out to be the only serious mistake in this map: that chromosome doesn't exist; those arms map elsewhere. The other 16 centromere-linked linkage groups are the 16 chromosomes of Saccharomyces cerevisiae. All of that rather nicely pulled together using the awesome power of yeast genetics, as it's sometimes described.
So I started with a kind of genomic view. I was going to try to study some genes that were scattered all over the genome, and I had a rather good overview of where they were and how they were arranged. The genetics were outstanding. For people unfamiliar with yeast genetics, the vertical columns of four colonies here are four haploid progeny of a single meiotic event, something you can do in yeast: so-called tetrad analysis. Over to the left we see the diploid phenotypes; at the bottom is wild type. This was being done in a background that had a suppressible red colony color marker. The homozygous suppressor gives the white phenotype, full suppression of the red colony color. The pink is the heterozygous diploid, which gives an intermediate phenotype. But of course, when you look at the haploid segregants, you've got two fully suppressed and two unsuppressed: Mendelian genetics. Because my background was in chemistry, and I was taking this beginner's view of the situation, I had a rather textbook attitude toward this project, which proved helpful in this instance. We had Mendelian genetics, and we had yeast DNA, and there was really no tie-in between these two worlds, the genetic world being very well developed even then. With respect to yeast DNA, the only thing we knew how to do was to extract it, cut it with EcoRI, and run it on an agarose gel. This was our only experiment. But I found this allowed me to really focus, and I still remember my excitement when I first saw gels like this. Because if you look at this, there's structure. The brighter bands at the bottom are repeated segments from the ribosomal DNA clusters, but the rest are just what you would calculate on the back of an envelope for statistical fluctuations in the size distribution of EcoRI fragments, assuming that the sites were randomly distributed across the genome.
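That back-of-the-envelope calculation is easy to sketch. The following is purely my illustration, not anything from the talk (the genome size, seed, and simulation approach are assumptions): a 6-bp recognition site occurs in random sequence with probability (1/4)^6 = 1/4096 per position, so EcoRI fragment lengths should be roughly geometric with a mean near 4 kb, and a complete digest of a yeast-sized genome should yield a few thousand fragments.

```python
import random

def simulate_fragments(genome_bp, site_len=6, seed=0):
    """Place restriction sites at random and return fragment sizes.

    In random sequence a 6-bp site like GAATTC occurs with probability
    (1/4)**6 = 1/4096 per position, so fragment lengths come out roughly
    geometric with a mean of about 4,096 bp.
    """
    rng = random.Random(seed)
    p_cut = 0.25 ** site_len                 # per-position chance of a site
    cuts = [i for i in range(1, genome_bp) if rng.random() < p_cut]
    bounds = [0] + cuts + [genome_bp]
    return [b - a for a, b in zip(bounds, bounds[1:])]

frags = simulate_fragments(12_000_000)       # roughly the yeast genome size
print(len(frags), sum(frags) // len(frags))  # a few thousand fragments, mean near 4 kb
```

Apart from the bright ribosomal DNA repeats, that statistically fluctuating smear of fragment sizes is essentially what the gel shows.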
I've made in the past a comparison that may seem odd to a biologist but was natural to a physical chemist. It reminded me of reading John Kendrew's description of the first time he saw an X-ray diffraction pattern of hemoglobin. This was 25 years before the hemoglobin structure was solved, but he could see that the atoms were all in the same place in every molecule in the crystal; he just had no idea how to find out where they were. So I looked at this and said, wow, something that any geneticist would have taken for granted: the EcoRI sites are in the same place in every copy of the genome. Otherwise you wouldn't get a pattern like this, and there ought to be some way of figuring out where they are. So sometimes it's an advantage to be a beginner. That's one lesson, and I suppose the policy lesson there is that we should keep an on-ramp in genomics and, for that matter, every other field. Science gets more and more specialized as it goes along. Beginners are always at obvious disadvantages in any given lab; it's almost always better to hire somebody with more experience. But fields require influx, and I think even the most casual reading of the history of molecular biology provides some of the strongest support for that in the whole history of science. Well, somewhere in there, I assumed that there were eight EcoRI fragments that had tyrosine tRNA genes on them. And fortunately, Ed Southern developed gel transfer hybridization between the time when the previous slide was taken and this one; we were working from a mimeographed protocol. And so I could isolate tyrosine tRNA, label it with I-125, and do gel transfer hybridization. And wow, there were eight bands, which I labeled A through H. And over on the right, we could show that they were competed away by purified unlabeled tyrosine tRNA. It was still the only experiment we knew how to do.
But the presupposition was that, if we were lucky, these eight EcoRI fragments corresponded to those eight genetic loci. And so the next question, in this sort of beginner's approach, was to figure out which of these corresponded to which of the genetic loci. We didn't know how to do that. One observation early on was that these eight fragment patterns, and you can see we gradually got better at doing our one experiment, that's what comes from really focusing, were not always in the same place in different strains. And so that suggested the possibility of using these variants. I assumed that they were just missing or extra EcoRI sites in one strain relative to the other, and that they could be used as genetic markers; we could look at co-segregation with the phenotypes. This just shows one tetrad. The variant marked with the yellow arrow appears, in this one tetrad, unlinked to the colony color phenotype, and the one marked with the red arrow co-segregates with it. From this one tetrad, one would infer that the red arrow points to the variant present in the wild type strain and the black arrow to the one in the suppressor strain. Well, we did a few more tetrads. The LOD score was not impressive, but it was convincing enough to push ahead with our best case. The previous slide was actually from a 1979 paper, at which point we had, correctly as it turns out, identified all eight of the fragments. But two years earlier, collaborating with Howard Goodman, I had taken my best case, the SUP4 locus, and managed, using technology that I was taught by Ron Davis, to clone both the wild type and mutant alleles and to sequence them. And so we learned almost nothing new from this project, but you can see that there's a kind of closing of the circle, which was very appealing to me with my beginner's view of biology.
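The tetrad LOD arithmetic mentioned above can be sketched in a toy model (my reconstruction, with hypothetical counts; real two-point tetrad analysis also has to model recombination): under independent assortment, parental ditype (PD), nonparental ditype (NPD), and tetratype (TT) tetrads occur in a 1:1:4 ratio, so each perfectly co-segregating (PD) tetrad contributes log10(6), about 0.78, to the LOD for complete linkage.

```python
from math import log10

def tetrad_lod(n_pd, n_npd, n_tt):
    """LOD for complete linkage vs. independent assortment.

    Toy model: under no linkage, PD:NPD:TT = 1:1:4, so P(PD) = 1/6;
    under complete linkage every tetrad is PD with probability 1.
    """
    if n_npd or n_tt:
        return float("-inf")   # any recombinant tetrad rules out complete linkage here
    return n_pd * log10(6)     # log10( 1**n / (1/6)**n )

# "A few more tetrads": three co-segregating tetrads give LOD ~2.3,
# suggestive but, as he says, not impressive.
print(round(tetrad_lod(3, 0, 0), 2))  # 2.33
```

A handful more perfect tetrads would push the score past the conventional threshold of 3, which is why a small number was "convincing enough to push ahead."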
The mutant contained the expected change in the anticodon for an ochre-suppressing tyrosine tRNA. I believe it was the first sequence of a mutant eukaryotic gene. So it closed a kind of epic story, one that began with Mendel and continues today. Although bacterial genetics had clearly been integrated with molecular genetics earlier, it was really in yeast that Mendelian genetics became integrated with molecular genetics. To put the sequencing in perspective, which clearly is a central issue for the Human Genome Project: up until around 2002, the NCBI used to report the total number of base pairs present in GenBank. They don't do it anymore because it's no longer a meaningful measurement; most of the data in GenBank is resequencing of various things and so forth. But up to this point it gives some idea of the growth of sequencing as an activity, and so I'm going to come back in a moment to this NRC report in the early days of the Human Genome Project. Our little 100-base-pair sequence is sort of out here. So there was a long period in which sequencing was in the baseline of any plot relevant to the idea of sequencing the human genome. And I'm going to also have some comments about what was going on during this long lag phase: were there technical reasons, or was it just policy obstacles, that prevented moving this exponential phase a decade or more to the left? So Eric already showed this slide. The next step for us with the tRNA genes, now that we had sort of locally correlated EcoRI fragments with genetic loci, was clearly to try to do this globally. That is, to have a physical map for yeast and to correlate it with the genetic map. And for reasons I think I will not elaborate upon, I settled on this clone fingerprinting strategy.
I think the lesson I'll extract from this is that you needed to have some confidence that things were going to get better; a certain recklessness is essential in all research. Research problems get stuck in a lot of ways. And a key point about genomics, and certainly a key point in my career, is that the overall technology base of the society was changing dramatically year by year during this whole period. We were riding a tidal wave of technological change, and without that technological change, in retrospect, our initial goals would have been utterly futile. The aspect of this slide I'll use to emphasize that point is this: this is a negative, first of all (we're talking film here), but more interesting is what it's a negative of. I drew this schematic with a Leroy lettering set. Only some of the older people here are going to know what a Leroy lettering set is, but it's a device with an India ink pen guided by a template on a T-square, and on onionskin paper you can sort of letter things like this. It was actually the best available technology in about 1979 or so, when I was drawing this slide. So yes, we had these great plans that involved a lot of computer analysis and so forth. There were computers, of course, but there was no distributed computing. They were centralized devices which had less processing power than a Christmas card has today. So we were dependent, in ways that we could not really appreciate, on this tidal wave of technological change. I emphasize this because we need to maintain the same confidence today. The tendency at any stage, particularly after a stage of dramatic technological change, is to think that you're on some sort of plateau and that things are only going to get better in local and more or less predictable ways: the iPhone is going to have a lot more apps, and they'll do a lot more of the same sorts of things, most of which we don't need to have done anyway.
Technology development really doesn't follow that course; it follows rather unpredictable courses. Of course, I'd be extremely wealthy today if I had had even a glimpse of the trajectory on which technology was going to develop. I knew enough not to invest in the Leroy lettering company but didn't know where to put my bets. We don't know that today either, but I believe that the challenges we face in genomics today are going to require dramatically new technology, and if we try to guess exactly what that technology is, we will surely get it wrong. This just shows that we could collect real data, and gradually it started to work. The calculations here were done on a minicomputer, a VAX for those of you from that era, and the programs were written in Fortran, but the basic idea worked. Basically, the notion behind much of genomics is that you try to make the experimental work linear. If you're going to do a genome that's twice as big, or you're going to do it with twice the depth, you do only twice as much experimental work, picking more clones or molecules or whatever. The irreducible, sort of n-squared aspect of the problem you do on a computer. By now I actually had a printer (this map was done with a printer), although I still used my Leroy lettering set to put in some correlation with the genetic map. Genetics has had a collegial kind of character to it, not unblemished by occasional disruptions, but it has had one, and we should try to cling to that; I don't know quite how. Early genomics grew out of the model organism communities. I think that's a very important lesson: the initiative to do genomics did not come from human genetics. There was actually considerable resistance to the idea of doing genomics. It's a little hard to recapture the reasons for that, but there was, if anything, resistance. Essentially none of the innovation came from the human genetics community.
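That division of labor, linear bench work and quadratic computing, can be made concrete with a toy all-pairs fingerprint comparison. This is entirely illustrative: the clone names, fragment sizes, and 3% size tolerance are invented, and the actual mapping projects used statistical overlap scores rather than a raw match count.

```python
from itertools import combinations

def shared_fragments(fp_a, fp_b, tol=0.03):
    """Count fragments whose sizes agree within a relative tolerance."""
    used, shared = set(), 0
    for a in fp_a:
        for j, b in enumerate(fp_b):
            if j not in used and abs(a - b) <= tol * max(a, b):
                used.add(j)          # each fragment in fp_b matches at most once
                shared += 1
                break
    return shared

# Hypothetical fingerprints: restriction fragment sizes (bp) for four clones.
clones = {
    "cA": [900, 2100, 3500, 5200],
    "cB": [2080, 3470, 5260, 7000],   # overlaps cA on three fragments
    "cC": [1200, 4100, 6100],
    "cD": [1210, 4150, 6050, 8000],   # overlaps cC on three fragments
}

# One digest per clone is the linear bench work; this loop is the n^2 step.
for (x, fx), (y, fy) in combinations(clones.items(), 2):
    s = shared_fragments(fx, fy)
    if s >= 3:
        print(x, "overlaps", y, "with", s, "shared fragments")
# reports cA~cB and cC~cD
```

Doubling the number of clones doubles the lab work but quadruples the comparisons, which is exactly why the quadratic part belongs on the computer.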
It came from model organisms, and there are several important points about model organisms. One of them is that they do tend to favor this kind of holism: the yeast community is interested in yeast, the worm community in the worm. To a degree, people in these communities have to be, and are, interested in the whole picture. They're not as hyper-specialized as people become in organisms where it's simply impossible to look at the whole picture, like the mouse or the human; the complexity is simply too great, particularly in the human. There is also an extremely important cultural tradition that goes back to Delbrück in the early days of molecular biology. Sure, there's lots of competition in model organism work, but at least historically there were complex rules of engagement that favored openness and collegiality. An example is that these two papers were published back to back in PNAS. Ours was sort of my first really comprehensive report about the yeast genome mapping project, and the other is from John Sulston, Alan Coulson, and colleagues about the worm map. Sydney Brenner, who was involved in the worm project, suggested this plan for publication. We were in close communication with one another. I think from a policy point of view we should keep this in mind. There are three simple models by which competition plays out in research, and genomics has experience with all of them, as do most fields. There's the dog-eat-dog, survival-of-the-fittest competition. Sometimes that's effective; it's not much fun. There's the middle road of collegiality, in which there's productive exchange but the laboratories maintain their independent approaches and are not in intense communication. And then there's the more modern, favored strategy of big consortia, in which there's forced consensus about every issue.
My own opinion is that genomics has gone too far in the latter direction and would be wise, as it tackles the hardest problems, that is, the ones we don't really know how to do, to try to work within this intermediate zone. And I certainly am appreciative that I was able to spend most of my career in that territory. There were some technical issues that may seem rather minor, but they loomed large and influenced, I think, some good advice that people like John Sulston and I, who had experience doing these things, communicated to the enthusiasts in the early days of the Human Genome Project. Anyone who does this discovers, and this is just as true today as it was then; we discovered it very early in the yeast project, that the assembly of the maps was actually limited by data quality. It was not limited by the idealized, computer-science, n-squared kinds of models. Computer scientists have historically taken lots of interest in genomics, and they have made some important contributions. But I think it would be fair to say that computer science as a field has historically over-idealized the problems. Computer science works on idealized models and has enormous experience sorting lists of random numbers and so forth by the most efficient strategy, with the least likelihood of ending up in some worst-case scenario. Actually, almost none of that theory applied. I interacted with some really top-tier computer scientists over this project, and I learned a lot from them. But one thing I learned is that we had to solve these problems ourselves, because, simply put, the worst cases that computational complexity theory worries about never occur in genomics; long before then, you get tangled up in data that haven't been filtered adequately, or you're not handling the non-idealizations. This happens every time, whereas the kinds of problems they tend to worry about are too rare to be applicable.
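One way to see why real fingerprint data, rather than worst-case theory, limits map assembly is a toy Monte Carlo on sizing tolerance. This is my illustration only; the size ranges, fragment counts, and error model are invented. As the match window is widened to absorb measurement error, unrelated clones start sharing "matching" fragments purely by chance, and those false overlaps, not algorithmic pathologies, are what tangle up the maps.

```python
import random

def spurious_shared(tol, n_frag=8, trials=20_000, seed=1):
    """Mean number of coincidental size matches between two unrelated clones."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        # Two clones from completely different parts of a genome:
        fa = [rng.uniform(500, 10_000) for _ in range(n_frag)]
        fb = [rng.uniform(500, 10_000) for _ in range(n_frag)]
        used = set()
        for a in fa:
            for j, b in enumerate(fb):
                if j not in used and abs(a - b) <= tol * max(a, b):
                    used.add(j)
                    total += 1
                    break
    return total / trials

# Loosening the tolerance to cover sizing error inflates chance matches.
for tol in (0.01, 0.03, 0.10):
    print(tol, round(spurious_shared(tol), 2))
```

The tension is unavoidable: a tight tolerance misses true overlaps because of sizing error, while a loose one manufactures false ones, and tuning that trade-off on real data is exactly the unglamorous filtering problem the idealized models leave out.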
And we keep relearning this lesson, and we will keep relearning it. But I think more could be done in this area of trying to deal formally with the assembly of enormous data sets that on the surface are governed by simple logical models but in reality have problems that are difficult to filter out; and filtering out is not even always the right approach. So at this stage the yeast project was working, and it still went on for a while to close gaps, increase continuity, and so forth, but we wanted to work on larger genomes. One thing that was apparent to me was that a lot of the early proposals for doing human genomics were cosmid-based. Cosmids were the best available technology of the day, and the worm and yeast projects were done with cosmids and lambda, at least in their early phases, but I think we all knew that it just wasn't going to work. For a whole list of reasons, it just wasn't going to work to analyze mammalian genomes with those vectors. It would be extremely difficult even today, with a lot more experience with recombinant DNA technology, to do a de novo analysis of a mammalian genome using fosmid vectors, for example. So there was interest in trying to make vectors with larger inserts, and I'm not going to talk in any detail about the YAC story, but David Burke and Georges Carle were two graduate students in my lab, and actually as a side project (they both had primary projects in the lab) they got a YAC vector system working in which we could clone these relatively large inserts, hundreds of thousands of base pairs, with what I would describe as moderate success. In parallel with this, there was starting to be some real interest in genomics. It was not until the late 1980s that the term genomics was coined and interest began to broaden and intensify.
One of the reasons that Sydney Brenner kind of pushed us to publish in 1986 is that he could see that that was going to happen, and the field needed to have some papers to point to with data in them, as opposed to talk; there was a lot of talk. I did start to spend a little more of my time then on policy issues, and I have mixed feelings about that. I think any scientist does; it's a zero-sum game. But I don't have regrets about having been as engaged as I was in policy. There were certainly people who were more engaged, but it's a choice I made, and I would encourage younger scientists today to get involved in policy issues. As the junior member of this committee, the Alberts committee, in 1988, I learned an enormous amount, and I think I did contribute something. And I think that's my message to younger scientists: they have a whole set of different points of view, and they almost always have more real, hands-on experience. Shirley Tilghman was the only other member of this really august committee, I mean, this was Brenner and Watson and Botstein and Hood and so forth, but Shirley and I were the only ones who had ever sequenced anything and had any real idea what the issues were at that level. That's not the only level that needs attention, but we should never neglect it. We need young scientists, and they are doing it at some expense to their careers; that's a choice they need to make, but we hope that some of them will be willing to. One lesson from this report is a policy one: this report is remarkably free of hype. Read it today. This is about as far as we went in the executive summary: we said that increased understanding of the genome would greatly enhance progress in human biology and medicine. This argument actually sold extremely well.
I really believe that there is a fundamental misapprehension in science policy circles that the only thing that sells is hype. I've dealt with a lot of politicians; in my policy hat I testified three times in front of Congress about the Human Genome Project. You can say whatever you like about politicians, and we all like to say things about politicians, but they're mostly pretty smart, and they have very strong bullshit detectors. They are inundated with statements that can't be taken seriously. That's not just in the current presidential campaign; politics occasionally has more colorful manifestations than others. But politicians are actually quite good at seeing through this, and they actually appreciate a message that makes sense to them. That was my consistent experience in the Human Genome Project, and I believe that today we continue to oversell things in ways that distort the essential nature of what we're attempting to accomplish and don't actually help sell it.
Well, the task now, in parallel with the late days of the yeast project, was to try to do similar things in the human genome. And of course you need a good postdoc. You've already seen a better version of this picture, but I was fortunate to find the perfect postdoc to help us steer into this new world. We knew that just scaling up what we were doing wasn't going to work, for a bunch of different reasons. We needed bigger inserts, but we also needed different ways of dealing with them, and PCR was kind of the new kid on the block. An amazing thing about the NRC human genome report, published in 1988, is that it does not mention PCR or any variant thereof; it's a recombinant-DNA view of the world. And by 1990, no one in human genetics could even imagine doing anything without relying on PCR. But Mullis's first paper in Science had been published before the NRC report went to press, and I had read it and frankly found it unconvincing, sort of an interesting idea but unconvincing; go read it, I think it is unconvincing. But some technologies die and some flourish. PCR flourished, and we recognized that, and so PCR assays seemed to be the obvious genetic markers, rather than sizes of restriction fragments. And so Eric's project involved this kind of synthesis of screening technology: using PCR to find particular loci, particular YACs that contained particular PCR assays, and to order the PCR assays and the YACs all at the same time, which is the essence of STS content mapping. Anyway, we could build maps. This is kind of the human counterpart of the yeast paper, a proof of principle for a couple of million base pairs of human DNA, a small fraction of the size of the human genome, but this was done in a few months once the wheels were oiled, and it was promising technology. While Eric was doing all the real work here, I was still trying to wear this policy hat, and I got involved, well, you need allies if you're going to influence anybody, and I had good allies by then: Hood, Cantor, and Botstein. We saw a tremendous need, first of all, to standardize, not the methods that people were using to map, or the clone libraries, or the kinds of clones, but the maps themselves. The maps had to be comparable; you had to be able to take a map and compare it. The maps of that day, which were being produced at rapidly escalating cost, really couldn't be compared with one another at all. They were tied to a particular clone library, and without importing the clone library and essentially repeating the kinds of experiments, there was just no way to integrate them with other maps. And that was the gist of the STS idea. Again, in my experience, I have had a lot of failures in trying to influence policy, but the successes have always had a simple argument that is easy to state and unembellished. And this was our argument: this approach would solve the problem of merging data from many sources, eliminate the need for large clone archives, and define a physical map that could evolve smoothly toward a sequence, and so forth. I think that this is the essence of the policy activity. We jump too quickly to the politics, and it's actually harder than it looks to decide really what you want to do, at the right conceptual level: not too lofty, where it loses touch with reality, and not too detailed, where you get lost in the weeds. So there was still a lot of work to be done. Here we sequenced a tRNA gene; here was the NRC report. I think one of the most interesting phases of the Human Genome Project is the period of not quite 10 years after the NRC report was issued. Major policy developments were
gaining momentum then, such as the formation of the National Center for Human Genome Research, the NHGRI's predecessor; the commitment of the federal government to a human genome project; and the working out of the joint arrangements between the NIH and the DOE. So there were policy issues, but there were also technical issues, and I believe there has been a lot of misrepresentation of what was going on during this period. It's easy, from the standpoint of, say, 1998 or 2000, to say that there was just a lot of wasted time in there, committee meetings and so forth, and that we should have gotten on with it a lot earlier. A story that hasn't yet been written at all, much less written well, is what was actually going on then. Lee Hood was on the NRC committee and had published, with Lloyd Smith, the four-color fluorescence method during the run-up to the NRC report. We were very well aware of it; Lee Hood doesn't leave people in doubt about when major technical developments have been made. But as of the time of the NRC report, it was not even obvious that four-color fluorescence was going to be the winner among several technologies. More important than that, even for people who guessed correctly that it was going to be the winner: it simply didn't work very well. This is a sometimes well-kept secret. The reason that four-color fluorescence took almost a decade to get off the ground, even once commercial instruments were available, is that it just didn't work very well. It could be made to work under really optimum conditions, but there were several different problems.

And this is, I believe, a characteristic of biological technology, and I suspect that if I knew more about other fields I would find it a rather general characteristic of technology in the refinement phase: there is usually not one problem. If getting from where you are to where you want to go has some clear rate-limiting step, say you need to make the transistors smaller or the clock run faster, then you can focus on that and either succeed or not. I have never seen a biological technology with that characteristic at the refinement stage. At the earlier stage, there are a lot of things you can either do or not do, and doing it means getting it to work once under really optimum conditions. But if you're going to try to make something work as a mainstream technology, there is always a whole bunch of problems, and no one of them is rate-limiting in producing a technology that will really work. So here is a very short list; there are a lot of other things that could be put on it. Pay attention to the dates: '89, '94, '95, '95. The bottom date is misleading, because that is a Phil Green paper, and you always have to discount his publications by about five years in the favorable cases; the work was done earlier. In the middle of all this came cycle sequencing, a sort of derivative of PCR. Try using a four-color fluorescence instrument, or any kind of sequencing instrument other than the single-molecule ones, while dispensing with an amplification step: yes, you can do it, but you are not going to sequence the human genome that way. The reason cycle sequencing matters sounds obvious, but it's not at all obvious if you look in detail at the biochemistry of the sequencing reaction. What it says is that the template is the limiting reagent in a sequencing reaction. That turns out to be the case, though there are a lot of Kms and whatnot that influence why it's the case. Cycle sequencing says you can reuse the template; everything else in the reaction you simply supply in excess, and you just need to reuse the template to make enough product. Then there was linear polyacrylamide. Maybe we could have sequenced the human genome using cross-linked polyacrylamide, but
it wouldn't have been much fun pouring all those gels. Any automation of the process depended on having pumpable matrices with single-base-pair resolution, and until the '90s it wasn't obvious that that was achievable; there was certainly no theory telling you that you would ever get single-nucleotide resolution. Energy-transfer dyes were another major game-changer, and embedded in this was the shift from labeling the 5' end of the sequencing products to labeling the 3' end, via dye terminators. That had several different effects which I don't have time to get into now, but the most obvious one is that between cycle sequencing and energy-transfer dyes, there was a dramatic change in the signal-to-noise considerations and in the amount of template you needed. The important thing there is not that it's particularly expensive to make a lot of DNA; it's that you have to purify it. There is a certain amount of junk in a culture that is always going to be there, and you're never going to get rid of it at any feasible cost, so you don't want to be using very much culture, because all that junk will be there at the end. You could do very good four-color fluorescence sequencing before any of these advances if you had half a microgram of highly purified template DNA; that just doesn't scale. And the mutant DNA polymerases also had very major effects, by eliminating the huge discrimination between natural nucleoside triphosphates and the highly decorated ones that were the basis of the fluorescent labeling, and by helping with the ongoing, chronic issue that assembly is limited by data quality: how you manage the non-idealities in the data is the limiting factor in all assembly methods.

So toward the end of that period, 1995 really, only then, as a consequence of this ongoing progress, did I finally get on board with scaling sequencing up. It's not that my voice was particularly decisive, but there was a real debate, and I was obviously on the side holding that trying to scale all of this up before about 1995 would have led to an unsatisfactory trajectory. I don't think it would ever have converged, and it would have deferred the kind of work that really needed to be done. Anyway, I started this little Science op-ed piece by pointing out that the first time I was ever quoted in Science about the Human Genome Project, the quote was one word. I had talked for half an hour or so to this reporter, and the thing that stuck with her was that I had said the project held huge promise: "huge." So at least this time I had 1,500 words or so to make my case. I'm going to basically skip over the key production phase of the Human Genome Project. I was actually not much involved in it; there are many people here who know more than I do about the inner workings of that phase, and I'm not the right person to represent those issues. I will come back to the public-private competition, however, because there is a lesson there which I think we haven't yet fully learned and which is probably more relevant than ever. So I'm going to fast-forward to 2011. A lot of things went on in that decade-plus. The human genome was to a large degree finished; it's never really finished, and this is not perfect sequence, but it is excellent sequence, and a lot of that had to do with follow-up efforts after those papers were published. By 2011, a lot of comparative genomics had been done, a field that was pioneered here and became a major activity, and next-generation sequencing was seriously off the ground. A number of things happened, but by 2011, looking back twenty years, we had essentially accomplished the goals that were well defined in 1991. So the question is: what next? Well, of course there's no simple answer to that; there are many things that come next.
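Since STS-content mapping is the conceptual heart of that mapping phase, here is a minimal sketch of the inference behind it: a clone is one contiguous piece of the genome, so two STSs that co-occur in many clones are likely close together, and co-occurrence counts can order the markers without ever comparing clone DNA directly. The data and the greedy chaining heuristic below are invented purely for illustration; the real mapping effort used far more elaborate statistical ordering and had to tolerate chimeric clones and false PCR negatives.

```python
# Toy STS-content mapping sketch (hypothetical data, not the actual pipeline).
# Screening result: for each STS, the set of clones whose PCR assay was positive.
# The hidden true order of the STS loci is A, B, C, D, E, F.
hits = {
    "A": {"c1", "c5"},
    "B": {"c1", "c2", "c5"},
    "C": {"c1", "c2", "c3"},
    "D": {"c2", "c3", "c4"},
    "E": {"c3", "c4", "c6"},
    "F": {"c4", "c6"},
}

def order_sts(hits, seed):
    """Greedily chain STSs by shared clone content, seeding (for
    simplicity) from an STS assumed to lie at one end of the map."""
    order = [seed]
    remaining = set(hits) - {seed}
    while remaining:
        # Next marker: the one sharing the most clones with the current end.
        nxt = max(sorted(remaining),
                  key=lambda s: len(hits[s] & hits[order[-1]]))
        order.append(nxt)
        remaining.remove(nxt)
    return order

print(order_sts(hits, "A"))  # → ['A', 'B', 'C', 'D', 'E', 'F']
```

With dense enough clone coverage, adjacent markers share the most clones, so the chain recovers a consistent order; note that real algorithms also had to infer the ends rather than assume a seed.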
And I think genomics has been served well by a diversified portfolio. Even through the production phase, the genome institute supported a moderately diversified portfolio, and fields need a diversified portfolio if they're going to move ahead. It's the picking-winners problem that's well known in the tech startup world: you'd better try a lot of things, fail early, and fail often. So I'm not advocating any one path forward, but the one that interests me most, and it's quite the topic these days in Bethesda and elsewhere, is what has come to be called precision medicine. It's interesting: I was in China this fall. I know a lot of Chinese scientists, I spent quite a bit of time there, and they were all talking about precision medicine. This report has actually been translated into Chinese, and I wrote an introduction to it which is being translated as we speak. There is intense interest; take that as a warning, and don't say you weren't told. The obvious point is that international collaboration is critical in genomics. We've seen that over and over again; it's critical in all of science, and it was certainly critical in genomics. But the reality is that when push really came to shove in the Human Genome Project, there were really only two voices at the table that were largely determinative: the NIH and the Wellcome Trust. Between them they had the resources to go ahead on the trajectory they chose to follow, and other players really had to fall in line. The future of genomics is simply not going to be like that. It's going to be multilateral, and I think the primary risk in the precision medicine arena is that international collaboration is going to collapse; even unity within the larger countries, including the United States, could easily collapse. We'll still learn things, but a highly balkanized effort runs fundamentally counter to one of the biggest lessons of genomics, the public-trust, commons kind of issue, and this is going to be immensely more difficult to achieve in this theater than it was in the Human Genome Project itself, and it wasn't all that easy there.

Now, I was actually involved in getting this report off the ground, with David Walton and Alan Williamson; Alan has been a longtime advisor of the NHGRI. Among us we had a lot of policy experience by then, and we were looking for some kind of umbrella for the human genotype-phenotype, translational-medicine world. Alan Williamson suggested the "new taxonomy" as the theme, which is where the report's subtitle comes from. His reasonable proposal was that the focus should be on better diagnosis and a more molecularly based classification of disease, and that other good things would then follow. So the three of us approached Building 1, and we were pleased with the reception we got there. There was interest in going ahead, and after some discussion, the idea of having the National Research Council lead a study in this area was accepted; it was funded by the director's office. So I want to say a little about my view of the precision medicine challenge. We tried to follow the lesson of not putting a lot of hype into this report, and this is maybe too long a quote, but it gives the idea. Everyone who works in the general area that most of us do has an ongoing frustration with the fact that clinical medicine is actually burgeoning: there are all kinds of economic and social problems, but the ability to treat patients is improving dramatically on many fronts, and we're learning immensely more about human biology than we ever thought would be accessible early in the 21st century. But the
frustration has to do with the limited contact points between these two burgeoning worlds, and that's what this report tries to address: how to do better at compiling, organizing, manipulating, and extracting true understanding from these kinds of data. The main recommendations are fairly well known and I won't elaborate on them; I'm going to focus on the creation of an open-access information commons, which is at the core of it. This is the much-worked-over figure that the report centers its core recommendations on. The idea is that we have a lot of individual patients, we are going to gather lots of data about them, and the data are going to be organized around the patient; the primary record here is a patient identifier. Let's set aside for the moment privacy issues and how all those things will be handled. Everyone nods: that's what we need to do. But I believe there is a naivete about how achievable this is going to be; business as usual will not get us to where this committee pushed. Just for starters, I am unaware of a single individual in the world for whom I could go and have access to vertically integrated data of this sort that included clinical records; I am unaware of who this person is. And one does not create complex systems of this kind in one step, overnight, from some simple set of policy recommendations. It has to evolve. I think we know how to make it evolve, but the recipe that is hardest to follow is the one I've tried to illustrate over and over again here: it takes patience and experimentation.

I'm relatively unconcerned about the information technology issues surrounding this, because I'm basically counting on the same rising tide that I counted on through my whole career. Yes, there are big problems, but there are trillions of dollars of investment going into making all of that work better, and that, I believe, will keep ahead of us. What I'm really concerned about is the interaction with the individuals, and I will simply express my view that our whole way of enrolling individuals in studies of this type was not designed for this purpose and is not going to be adaptable to it. I'm talking about IRBs, and about the tight coupling of database access to funding from a particular agency, even a big one like the NIH. I've already mentioned the international issues, and then there is simply sustainability. The outcomes we want to track for these individual patients are not short-term outcomes. We actually know how to study short-term outcomes; we're quite good at it. We do not know how to look at 20-year outcomes, 50-year outcomes, and increasingly those are going to dominate our whole healthcare system: what should you be doing with a 20-year-old, a 30-year-old, a 40-year-old, to decrease morbidity when they're 70 and 80 and 90? That is where we can't afford to keep doing what we're doing now. Well, it takes time to learn about long-term, longitudinal effects, and so my policy message is this: there was a reason that the Human Genome Project took a period of some decades, depending on which starting and ending points you choose, to be the success that it was. This is a harder problem. It is also going to take a substantial period, I won't put a number on it, just to figure out how to do this, and the policy and the politics need to be directed at that goal. Let me just call your attention, without trying to summarize the contents, to one minor initiative I got involved in with a number of other scientists and policy types: looking at one alternative model to the way we currently handle informed consent. I think we're going to need to look at many such models. So I was going to finish by talking a
little bit about what I think the major scientific lessons of the Human Genome Project have been, but I think that will have to be for another day. Thank you.

Absolutely, people have to go, and that's fine, but there are microphones, so we will take questions; just step up to a microphone on one of the two aisles. Maynard, I'm going to lead off: you had some comments you wanted to make about public-private interactions. Do you want to summarize them at a high level?

I can refer you to another paper. Fortunately, it is not one of my well-known papers, but it is my most controversial one, published in a fairly obscure source, at least one that most genomicists don't read: the Journal of Molecular Biology. I published it there because it was based on a lecture I gave in the year 2000, in Heidelberg, at a science-and-society forum at the European Molecular Biology Laboratory. So it is a perspective that was published in 2002 but reflects the year 2000, which was right in the thick of the production phase of the Human Genome Project. In that paper I said what I thought about the public-private competition in the Human Genome Project, and it's probably best to let you read it. I don't think we really learned the lessons from that episode. Briefly, I think there were major lapses in basic standards of scientific behavior which were knowingly overlooked, because we have a kind of collegial, pulling-together instinct in science. The reason I wrote down what I thought those lapses were is that memories are short, episodes of this kind recur repeatedly, and I was quite confident we hadn't seen the last of these problems. In the precision medicine arena, briefly, I think one major threat is that even an institution with the resources and political backing that the NIH has risks becoming marginalized relative to the private sector. This is a trillion-dollar-plus industry just in the United States. Look at China: they are absolutely obsessed with how they are going to take care of this expanding middle class as it grows old; they don't even have enough children to look after them all. There is intense interest in Europe in these issues. So we're talking about trillions of dollars on the table. If you believe that precision medicine can have even a marginal effect, and I actually think that long term it can do more than that, but suppose it could cut costs by 10 percent twenty years from now, if we learned what we don't now know but need to learn: well, 10 percent of trillions of dollars swamps what the government, in our current political environment, can do. So this needs to be thought about really clearly. When you have companies like Apple and Google and Microsoft, Facebook, Amazon, any of them could mount a privatization effort, either directly within their companies, or by spin-offs, or just by investment, that would make Celera look like a little circus sideshow, because this is the real deal. I don't believe this issue has been adequately addressed; there are a lot of reasons for that, but anyway, this is my cause.

Something struck me early in the first half of the lecture: at several points you made the point that it was the quality of the input data, more than the sophistication or power of the analytic algorithms, that mattered. And it's interesting, because now, when I think about the PMI and other efforts to gather data and crowdsource data and all these things, the underlying hypothesis seems to be that the quality and the heterogeneity and the inconsistency of the data are not a problem, because we're going to have such powerful algorithms that it will all come out in the end.

Yeah, no, it's an
excellent point. It's a generalization of the kind of thing I was saying, from a very small theater to a much bigger, more complicated problem, but I agree. I think that big data per se has already been a disappointment and will become a progressively bigger disappointment as it becomes bigger. You need a model for what you're doing, and you need input data that are the best possible. We're going to have huge difficulties getting adequate phenotypic data, even straight out of the Mayo Clinic, especially if we want to merge it with data from the Clinical Center here. This is a very hard problem, and the notion that people wearing some Google watch or something are going to be reporting useful phenotypic data is nonsense. We need good-quality data. And if we're talking about decades and decades: there's a saying that the reason everybody's genome should get sequenced is that it only has to be sequenced once, and then you follow them through their lifetime. That statement is absurd. I had my genome sequenced about three years ago, 30x coverage and so forth; I hang out in the right circles, so it didn't cost me anything, and I didn't learn anything from it. But if I decided next year to get seriously engaged in analyzing my genome sequence, I would just have it done again, because the new sequence would be better than the one I had three years ago, actually quite a bit better, and it wouldn't be worth working with those old data. To the extent that genomics is going to really impact medicine, that's not the right way of looking at it. How many times have you had your blood lipid profiles done? It's going to be like that, and that's not where the costs are going to be. The costs are going to be elsewhere. This is another example, again a generalization, of my lesson: if there were ever a situation in which there is no single rate-limiting problem, the whole precision-medicine pipeline is it. You can just list and list and list problems that need to be overcome, and any one of them could disappear with a magic wand without having much effect on the whole project. That's what makes it hard. That's why I've singled out the informed-consent and privacy issue: if there is a rate-limiting factor, it's the way we handle that. I believe that some of the large cohorts being built around the world are going to have to simply be discarded outright, because they didn't get that right. Standards change, opinions change, and it will be easier to start over again with a new informed-consent framework than to salvage them. What distresses me is how little study, and in particular how few pilot studies, are being done in this area. We need to try a lot of different things; none of them will work perfectly, but there are better ways and there are worse ways, and the way we're doing it right now is a worse way; maybe not the worst, but a worse one.

I just wanted to hear your opinion on how important you thought your local academic environments were in forming your opinions and in your growth as a scientist. Was it really at the department level, or was it a few key local individuals that helped you?

Ah, a cultural question; it's a good one. Obviously our environments matter for everything. I take department-level environments very seriously; research groups have their own culture, and that's very important; all these things are important; key individuals are important. But I have
consistently, over my whole career, been a strong supporter of strong departments in universities, and I don't care how siloed they are. I don't like vague departmental boundaries, where everyone has appointments in seven places and lab space here and lab space there. Let departments work out their own ways, let members of departments work out their own ways, of interacting with people in other specialties. Anyway, I have always been a strong supporter of departments. I used to joke with people: there was a period when I got a lot of job offers, from academia and from the private sector, and they always wanted me to go work in some industrial park where they would build some vast space for me. And I said, look, if I wanted to work in an industrial park, I would have done that a long time ago, and I'd be wealthier than I am now. But I like academic environments, and I think departments are really critical. They shouldn't be huge, though they have to be big enough to have some diversity. And I'm quite skeptical of interdisciplinary programs. I've been in a lot of them; I'm thought of as an interdisciplinary person. At some point in Seattle I was a professor of medicine and of genome sciences and of genetics, and I was an adjunct professor of computer science, despite some of the comments I've made. But that was all just on paper, really. You are always a creature of one department: that's where I had my labs, that's where I spent my time, and then I had all these other interactions, but so did other people in the department. So yes, the environment is extremely important.

Okay, it's nine o'clock; I think we should wrap it up. Thank you, Maynard, as expected. Thank you.