So please join me in welcoming today's speaker and my longtime colleague, Dr. Eric Green. Thank you, Andy. It's a pleasure to be here. As Andy may have mentioned in week one of this series, the series actually started with Andy and me, eventually bringing Tyra Wolfsberg into the picture, starting back in 1995. This is like the 12th time I've given some version of this lecture. I should, by the way, immediately thank Tyra and Andy for organizing this iteration of it, the 2016 version. They included my name on that, which was very kind of them. It's purely honorific. I really had nothing to do with organizing this, and you should thank them for all the logistical aspects of bringing this year's series together. I'm more of a legacy, left as a named co-organizer. But it is a pleasure to be here. I will say that the title of my presentation is very broad, with genomics and the genomics landscape as I see it today. I will point out I'm going to emphasize heavily the human genomics landscape, for reasons that will become pretty obvious, but I'm going to limit this a little bit, especially towards the end. I should also immediately point out, as you might imagine, I'm fairly boring: I have absolutely no relevant financial relationship with commercial interests. And the other thing I should point out is that the major aspect of what I want to try to accomplish today is really context setting, both for those of you using this series as an opportunity to learn a lot of genomics for the first time, but also as a framework, to some extent, for the speakers that will follow. In fact, usually I give the leadoff talk in the series for that exact reason, and my schedule just didn't allow me to go last week. So you'll see as I go through why this is very much a context-setting talk.
I'm going to first start off giving you some historical context for genomics as a backdrop, and then talk about some of the major achievements that have happened since the Human Genome Project ended 13 years ago, but really emphasize, and paint the landscape of, what the human genomic circumstance is today and, importantly, beyond. And as I said, my goal for this as much as anything, besides giving a foundation of information about genomics, is to really help you see how the other speakers fit into this landscape that I'm about to paint. So in terms of a historical context, if we go back even before genomics was brought about as a discipline, it is really important to think about the series of major historical figures and their important contributions that helped fertilize the ground, if you will, out of which genomics was able to grow. I could probably spend the whole time talking about that. I'm just going to give what I think are some of the key highlights to think about. Obviously Mendel deserves tremendous credit for understanding and elucidating some of the basic laws of inheritance. Of course, he had no idea where that inheritance was actually coded for or where the information resided. Some of the clues started to come about with Miescher's work in the late 1800s, when he actually discovered DNA as a molecule. But it really wasn't until Avery and colleagues' discoveries in the 1940s that it was demonstrated that DNA must be this inherited material, which therefore focused a lot of interest on DNA as an information molecule, setting the stage brilliantly for what Watson and Crick were able to accomplish in 1953. In fact, I would contend that the Watson-Crick discovery of the double helical structure of DNA in 1953 was arguably the most important single biomedical research discovery of the last century. I would certainly contend that the paper shown on the left was the most important publication of the last century.
Because what happened with the insights brought about by knowledge of the structure of DNA really set the stage for then figuring out how DNA was the information molecule of life and how it therefore encoded all the life processes, if you will. That was, of course, in the 1950s. In the 1960s, some of the key things we saw were, for example, the elucidation of the genetic code. For those who don't realize, much of that work was done right here on this campus. In fact, just outside of this auditorium is a small museum exhibit talking about some of Marshall Nirenberg's work in elucidating this genetic code that we all now take for granted as the key translator table for going from DNA sequence to protein sequence. Better and better tools started to come about in the 1970s, and particularly in the 1980s, leading to DNA cloning, where for the first time we were able to actually isolate and clone and manipulate DNA in the laboratory, and even to develop methods, coming about in the late 1970s and much improved throughout the 1980s, to actually read out the G's, A's, T's, and C's within DNA. So this progression from Mendel all the way through molecular biology and DNA cloning in many ways set the stage for what transpired in the late 1980s. And what transpired was the coming together of all of these tools and technologies to allow us to start thinking about how to go and study our genomes in a more comprehensive way. And that gave birth to this field called genomics. And you may not realize that that word didn't even exist until 1987, at least not in the scientific literature.
In fact, the first use of the word genomics in scientific print came about in this lead editorial of a brand new journal called Genomics in 1987, where they talked about a new discipline, a new name, a new journal. In the lead editorial they talked about this newly developing discipline of genome mapping and sequencing, for which they adopted the term genomics and put this into the scientific literature. 1987 was particularly relevant for me because it was the year I graduated as an MD-PhD student, reminding myself why I had never heard the word genomics once in medical school or graduate school: it simply didn't exist. So it really is important to emphasize that we are only about 30 years into this word, genomics, as a discipline. It is a remarkably young discipline, and I think its prominence on the biomedical research stage sometimes confuses us into thinking that, wow, it's been around forever, but it really hasn't. It actually is a very youthful discipline. Now, of course, the reason that discipline was named, and the reason that there was a lot of attention in the late 1980s about genomics, was because of this idea that was crafted in the late 1980s and launched in October of 1990: this notion of a Human Genome Project, this large, audacious international project that aimed, among a number of goals, to read out the 3 billion G's, A's, T's, and C's that constitute the human genome. It's important to point out, by the way, that we did have an odometer moment recently: October 1 of last year, of 2015, marked the 25th anniversary of the launch of the Human Genome Project. It's painful for me to think about because I was a trainee. I feel like it was just yesterday, but it's been 25 years since I was there, literally at the starting line, participating in the genome project on day one and involved in it throughout its entire 13-year span. In fact, it was remarkably successful and finished ahead of time: not the original 15 years, it finished in 13 years.
And it's now just a key part of the rich history of biomedical research. I actually had the opportunity to co-write a perspective piece that some of you might be interested in, to commemorate the 25th anniversary of the launch of the Human Genome Project, and I did it with the two individuals who have held the job I now hold: Jim Watson, the original director of the institute I now lead, and Francis Collins, who previously was the director of the institute before he became NIH director. The three of us wrote this perspective piece, not so much talking about the science of the genome project, but talking about how many legacy elements were left because of what the Human Genome Project did in terms of changing big biology, if you will. So I point you to that article if you're interested in reading some of the historical aspects and, importantly, the legacy elements of the genome project beyond the base pairs. And in fact, speaking about beyond the base pairs, we are celebrating this odometer moment, this 25th anniversary, in another lecture series. I'm going to put in a shameless plug, because in fact we have an ongoing lecture series that takes place right at this podium every month, and in fact Thursday of this week, which I believe is tomorrow, yes, it is tomorrow, Ewan Birney will be speaking here at this podium at two o'clock, because we are bringing in a series of individuals who were heavily involved in creating and executing the Human Genome Project, including people like Ewan, who really came into prominence at a very young age helping with the genome project and really used that as a launching pad for his remarkable career. So if you're free at two o'clock tomorrow, please come here. If you happen to miss it, of course, we videotape this stuff and make it all available on our GenomeTV channel on YouTube.
So it's been 25 years since the launch of the genome project and just a little over that since the beginning of this field of genomics. So to review things, I thought I would just talk about what I think are the six key accomplishments, or highlights if you will, of genomics in its first quarter century or so of existence. And in reviewing these six areas, I want to also contextualize some of the signature efforts that you've probably heard about, or if you haven't, you should be aware of, that really have been incredibly important for moving the field forward. A common theme of this will be that many, in fact most, of these efforts are big and they're audacious, and they're very much cast in the kind of style that the Human Genome Project was cast in, because it was remarkable what it accomplished. And in fact, highlight number one of the six highlights I'm going to give you is that the human genome was sequenced for the very first time by the Human Genome Project. And that absolutely is the number one highlight in many ways of the past 25 years. And it's a highlight both because it provided such incredible information about our blueprint, which launched so many other efforts that I'm going to talk about, but also because it launched the use of genomics in a whole lot of other areas as a pivotal tool for advancing those fields. And in fact, this is a subset of those areas, but every one of these areas has been remarkably advanced and enriched because genomics has found its way to be impactful in them. And every one of these could be a talk in and of itself, or an entire symposium. I'm not going to talk about any of this, although Julie Segre will; she'll be one of the speakers in this series on May 18th and will in fact point out an example of how genomics is in many ways completely changing the face of diagnostics when it comes to infectious agents.
My emphasis will be more on human health and human disease and medicine, because that's the area of the greatest relevance for us here at NIH, and in particular for my institute, the National Human Genome Research Institute. And as Andy mentioned in the introduction, I've been at the institute for about 21 years, and I've been in the field since the beginning, but six years ago I became the director. While I certainly was involved in thinking about these things in my previous roles at the institute, when I became director I became increasingly laser focused on thinking about how to facilitate the application of genomics to health, disease, and medicine, framing it around the concept of genomic medicine as the ultimate goal, if you will. By genomic medicine I mean a medical discipline that involves using genomic information about an individual as part of their clinical care, and of course the other important implications of that clinical use. This is largely synonymous with other terms you'll hear: individualized medicine, personalized medicine, and, later on, we're going to talk about precision medicine. I would say our framing of this is very much limited to genomic information as a subpart of some of these other ways to frame it, but we're really going to stay focused on genomic information as a means of individualizing care. So in thinking about what we want to do as a field, and certainly what my institute wants to do as a research funder, we really think about this as a progression where we need to traverse a series of accomplishments to eventually see genomic medicine become a reality. We are grounded heavily in the starting line of the Human Genome Project.
That's what we really think started all this, which in some ways means the starting line was 13 years ago, because we think once we had sequenced the human genome, that really set up a circumstance for the accomplishments that I'm about to tell you about. Eventually we're going to realize genomic medicine, and we're going to see the practice of medicine changed because of the use of genomic information. But this is not going to happen overnight, and it's not a simple one-project kind of effort. It involves many steps, it's going to involve many countries, it's going to involve thousands of scientists, and it's going to require an amazing amount of creativity. We can't even anticipate all the things we're going to need, although we can anticipate some of them, and I think we've done a lot in the last 13 years. I also want to emphasize this is going to require a community, a highly interdisciplinary community of scientists, healthcare professionals, and all the people that are touching healthcare, science, and research in some form. They're all going to have to be very highly collegial, and we're going to have to be doing this together for a long time. The analogy I've been using is one of a marathon. You're going to have a lot of people running shoulder to shoulder. It is not a sprint, and we have to be in this for the long haul, because there are a lot of complexities, some of which I'll be unpacking throughout my talk. But that's no small order. I mean, think about how you go from the base pairs provided by the genome project to actually changing how we take care of patients at the bedside, or maybe you prefer the metaphor from double helix to human health.
You know, that's going to require some pretty clear and important strategic thinking, and if there's one thing the genomics community is really good at, it's being strategic. We're also very good at organizing how we want to pursue things. It was in the fabric of what we did during the genome project. In fact, the way we accomplished the genome project was to develop, every couple of years, a new strategic plan that would guide the next few years, and to be willing, by the way, to rip up a strategic plan when it seemed outdated after a few years and come up with a new one. So since that was culturally how we mapped out the next set of things that needed to be accomplished, it probably wasn't surprising that literally the day the genome project ended, nearly 13 years ago, our institute published, following an incredible amount of consultation with the community, a strategic plan for the future of genomics, starting immediately after the genome project was completed. And it served us well, I will tell you, for a number of years. But like many things, as science advances it doesn't serve you well forever, because new opportunities come up. In fact, we saw those new opportunities by around the end of 2009, and 2010 in particular, and we recognized it was time for a new, or updated, strategic vision, which we put out, once again co-authored by members of the institute and again involving a lot of strategic consultation with the community. The big difference with our 2011 strategic plan, which is the one we still use, was the incorporation, for the first time, of genomic medicine as a key element and a key goal, as I've articulated to you, in fact putting it in the title of that strategic plan.
So for those of you who have not read this, it is five years old, and we've looked critically at it very recently and actually still feel it has a very good shelf life, in terms of still being very fresh and robust. I guess I'm not allowed to give out required reading for this class, but I will tell you that a lot of the things that we say in the strategic plan will be on the test, so for those of you who really want to know what's on the test at the end of this... About three people just left because they thought there really is a test; they just dropped the class. You know, I can't make it mandatory reading, but I would strongly encourage you, if you're here to learn about genomics: this would be a great article to read. Even though it's from 2011, it's still quite relevant to everything I'm talking about and many things I don't have time to talk about. If you want to quickly download it, you can get to a PDF immediately by going to that URL, but don't read it while I'm talking, because I have a lot of things I want to cover. You can read it afterwards. But let me just give you a general overview of what the strategic plan describes as a framework, because the framework that we put forth in it serves as an organizing principle for almost everything we're doing at our institute, and I think in many ways it is a nice framing of many of the things going on in the entire field of genomics. What we heard during the strategic planning that led up to the 2011 publication was that it was finally time for the genomics community to be more specific and more sophisticated in describing how they were going to actually go from basic genomic information to actually changing the practice of medicine.
It was always a thing we would say during the genome project: one day this would be really important for how we practice medicine. But now, in 2011, it was time to actually start to describe a research agenda that would inch you closer and closer to actually changing the practice of medicine. It was important to organize the thinking, and programmatically important to know how you were going to develop research programs that helped with this progression from left to right. So at the end of the day, we found that we could describe everything we needed to do, or most things we needed to do, in five major bins of activities, or domains as we called them. Let me introduce you to those domains. The first one reflected that we were very familiar with understanding the structure of genomes, largely what we had done during the genome project and the immediate period beyond. But we also recognized that we needed to understand the biology of genomes, how those G's, A's, T's, and C's did all of their work, and that was an important ongoing domain of research activities. With knowledge about how the genome works, you then have opportunities to use genomics to understand the biology of disease. How is it that changes in our genomes influence our health and well-being? Having a clear domain focused around human disease was very important. Obviously, if you start to get insights about human disease, it provides you the opportunity to think about how to advance medical science, and that would involve clinical research that would eventually give you the ability to think about how to use genomics to maybe have a more sophisticated approach to medicine. But just because you have better ways of practicing medicine doesn't necessarily mean you've proven that you're going to change how well healthcare works.
So we also put down, as a domain of responsibility, doing research to eventually demonstrate that you can actually improve the effectiveness of how you're caring for your patients using genomic approaches. So these five domains really do represent what we think about, and as I now continue to describe my highlights of the last quarter century, you will see how we are moving from left to right on that progression through this series of domains, eventually finding ourselves thinking about medical science and hopefully eventually improving the effectiveness of healthcare. So with that as a backdrop, let me continue with my highlights of the last quarter century of genomics. Well, we sequenced the human genome for the first time in the Human Genome Project, but while that was incredibly satisfying, we were thinking about one day sequencing patients' genomes, and to do that we needed to make sure that we could cost-effectively and highly accurately sequence people's genomes, not just once but many, many, many times. To accomplish that, we needed to very much reduce the cost of sequencing. The good news is we've done it. In fact, the cost of sequencing a human genome has been reduced nearly a millionfold since the Human Genome Project's first sequencing of the human genome. Now, that didn't happen by accident, and in fact our institute deserves, I think, some credit, although not exclusive credit, for recognizing that this was pivotally important and would need to happen once the genome project was completed. And in fact, in the strategic plan that we wrote and published the day the genome project ended, we said a lot of things, but one of the things of relevance here is that we talked about technological leaps that seem so far off as to be almost fictional, but which, if they could be achieved, would revolutionize biomedical research and clinical practice.
And we gave several examples, but the key example relevant here was the ability to sequence DNA at costs that are lower by four to five orders of magnitude than the current cost, allowing the human genome to be sequenced for a thousand dollars or less. So here it was, the genome project ended, and we put our names on a Nature paper that said we need to now go out and figure out how to sequence a human genome for a thousand dollars. It was a rather audacious claim considering that day marked the final day, where we had finished sequencing the first human genome, and when we added up all the costs associated with sequencing that first human genome, it came in at something like a billion dollars. People will argue about whether it was 600 million or 700 million; I just rounded up to roughly a billion dollars. And here we were on that day proposing, oh, we just need to knock six zeros off of that figure and eventually come up with a thousand dollar genome. And while it was audacious, it was catalytic. It was catalytic because we decided as an institute to put out a major granting program, and that major granting program aimed to collect great ideas from creative scientists around the world. Their goal was basically to get rid of this, because this was one of the factories that were used for sequencing that first human genome. It was one of multiple factories that were used for sequencing the first genome. And we wanted creative people to come up with some fancy schmancy way to sequence DNA, shown here in iconic form as something magical and revolutionary that would knock six zeros off of that figure. And the good news is that we did get creative scientists to come in, and we gave them grants, and they did remarkable high-risk things, many of which paid off.
The good news was also that the private sector met us as partners. The private sector recognized this was important too, and there was a considerable amount of private sector investment as well, in many cases commercializing things that came out of our funded scientists' efforts. And the rest is history. I mean, it's 13 years, but the rest is history. This has been chronicled in Nature articles talking about this program and these efforts. And the graph on the left is an iconic graph that we put out; we have a whole web page that has been cataloging the cost of sequencing, especially by the centers that we support for large-scale sequencing. In green is the cost of sequencing. Note the logarithmic scale. This is the cost of sequencing a human genome. It's just fallen precipitously. And it's because of fancy, wonderful new technologies such as those shown on the right. And it's not just the fact that we are getting really close to a thousand dollar genome, because we are, and that's where it's almost a millionfold reduction. It's also how quickly we can sequence a genome. So to give you a perspective, that first human genome sequence, as part of the genome project, cost something like a billion dollars, but it also took six to eight years of active sequencing. That's a long time to get a sequence. But today, using new methods, it costs a few thousand dollars, and we're getting close to a thousand, and you can also do this in about a day or two. And in fact, many believe that using new protocols we will get this down to a day or less than a day in the coming year. And the other thing that's really exciting about these technologies is that whatever you think we use today, it probably won't be what we're using two or three years from now.
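[Editorial note: the "six zeros" arithmetic above can be checked with a minimal sketch. It assumes only the round figures quoted in the talk (roughly a billion dollars for the first genome, a thousand dollars as the target); the constant and function names are illustrative and do not come from any NHGRI resource.]

```python
import math

# Round figures as quoted in the talk; every value here is approximate.
FIRST_GENOME_COST_USD = 1_000_000_000  # "something like a billion dollars"
TARGET_GENOME_COST_USD = 1_000         # the "thousand dollar genome" goal


def fold_reduction(old_cost: float, new_cost: float) -> float:
    """Return how many times cheaper new_cost is than old_cost."""
    return old_cost / new_cost


def orders_of_magnitude(old_cost: float, new_cost: float) -> float:
    """Return log10 of the fold reduction, i.e. the number of zeros knocked off."""
    return math.log10(fold_reduction(old_cost, new_cost))


cost_fold = fold_reduction(FIRST_GENOME_COST_USD, TARGET_GENOME_COST_USD)
cost_orders = orders_of_magnitude(FIRST_GENOME_COST_USD, TARGET_GENOME_COST_USD)

print(f"Fold reduction: {cost_fold:,.0f}")
print(f"Orders of magnitude: {cost_orders:.1f}")
```

[On these round numbers the reduction is a millionfold, which is six orders of magnitude. The "four to five orders of magnitude" in the strategic plan was stated relative to the sequencing cost at the time the plan was written, not relative to the total cost of the first genome.]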
It is very much like sitting at an airport where you have a lot of nice planes on the ground, but you look over the horizon and there are more planes coming, and then there's another one, and then there's another one. I happen to know there are some really cool technologies that are early stage and aren't ready to be commercialized, but they're about the second or third plane in, and probably within about a year or two they will be supplanting, I think, some of the technologies that we currently use today. So it's a very, very exciting time and it's not letting up. And in fact, there's a lot of excitement over nanopores. That's the latest rage, with devices such as that shown here on the bottom right that literally plug into the USB port of your laptop and can sequence DNA, and may be able to sequence a human genome within a day if all things work out. Remarkably exciting developments. And we just stand back and watch this happen. There's a lot to be described in these new technologies, and in fact one of the most popular lectures in this series of late has been by Elaine Mardis, a real leader in sequencing technology, who's kind enough to fly here and give a lecture. She'll be here May 25th, and it's not a lecture you want to miss, because I think it always gets the highest YouTube hits, if I'm correct. She will take what I just gave in three slides and talk about it for over an hour, and there's a lot to talk about in DNA sequencing technologies. I do want to point out, because some of this talk is a little philosophical, how important these technological advances have been for the field of genomics. In the history of science, you've often seen major inflections in scientific progress because of technological advances. I'll give you a few. Needless to say, the telescope changed the face of astronomy.
The microscope changed the face of cell biology, and certainly devices such as that shown on the left, various radiographic methods, really changed the face of radiology as we know it. And trust me, that's exactly what is happening with these new instruments. Technologies for sequencing DNA in the last 13 years have completely changed the face of genomics. In fact, I think they are changing the face of biomedical research as genomics permeates across the entire enterprise of biomedicine. So that's a great accomplishment; Elaine will tell you more. Well, that's great: now we can sequence genomes quite inexpensively, and that we particularly want to do because we want to sequence many people's genomes, and now we can afford to do it. And the reason we want to do it is that we're not just interested in that first reference sequence, which just gave us a hypothetical individual. It wasn't even one person; it was a patchwork of people. It was a reference. We in fact want to sequence hundreds, thousands, tens of thousands, eventually hundreds of thousands of people, because we want to figure out how all of us are different. The good news is that we've already sequenced tens of thousands, and I'll have to change the slide fairly soon to say hundreds of thousands, of human genomes in one form or another worldwide. And that is providing us a very remarkable opportunity to understand how we all have different blueprints. So let me just remind you, any two of us differ at about one out of every thousand bases as you go across all the letters in your genome. Those differences are variants, depicted here as single nucleotide variants: a G where other people might have a C, or an A where somebody else might have a T, and so forth. And so we have millions of those variants compared to any reference or compared to the person sitting next to you.
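[Editorial note: a back-of-the-envelope sketch of where "millions of variants" comes from. It assumes only the round figures from the talk (a reference of roughly 3 billion bases, about one difference per thousand bases); the function names and the toy sequences are hypothetical, and real variant calling is far more involved than a position-by-position comparison.]

```python
# Round figures as quoted in the talk; both are approximations.
REFERENCE_GENOME_BASES = 3_000_000_000  # roughly 3 billion letters
BASES_PER_DIFFERENCE = 1_000            # "about one out of every thousand bases"


def expected_differences(genome_bases: int, bases_per_difference: int) -> int:
    """Estimate how many single-base differences to expect across a genome."""
    return genome_bases // bases_per_difference


def count_single_nucleotide_differences(seq_a: str, seq_b: str) -> int:
    """Count positions at which two equal-length, pre-aligned sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be the same length")
    return sum(a != b for a, b in zip(seq_a, seq_b))


estimate = expected_differences(REFERENCE_GENOME_BASES, BASES_PER_DIFFERENCE)
print(f"Expected differences vs. the reference: about {estimate:,}")

# Toy example with made-up sequences: one A-to-C substitution.
toy_count = count_single_nucleotide_differences("GATTACAGATTACA", "GATTACAGCTTACA")
print(f"Differences in the toy pair: {toy_count}")
```

[On these round numbers the estimate is about 3 million differences against a 3-billion-base reference, which is consistent with the "millions of those variants" described above.]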
But the great majority of those variants are inconsequential from a biological and medical perspective. A subset, though, are very, very consequential, and we want to know those. And by the way, it's not that you all have your own private set of millions of variants. Most of the variants you have are very common, and other people, probably other people in this room, have them. And that makes it a situation where we could imagine that, if they're relatively common, then if we just sequenced enough people we could develop catalogs of those variants. And then if we had catalogs of those variants and we had really good methods, we could probably start figuring out which ones are consequential and which ones are not, which ones might be not so good variants that confer a risk for a disease, and which ones might be good variants because they maybe protect you from a disease or may be associated with some other positive attribute. And so, having sequenced that first human genome, there was remarkable motivation to start cataloging common variants in the human population. You might have heard about the desire to get single nucleotide polymorphisms, or SNPs, and the first effort was something called the SNP Consortium. But that quickly, once it started to get some traction, gave rise to something called the International HapMap Project, which attempted not only to catalog these SNPs, these single nucleotide variants, but also to help us figure out how they relate to one another on blocks of DNA called haplotypes on human chromosomes. It turns out that these variants don't all just go in different directions as they're inherited; rather, there are neighborhoods of variants, where many variants in a big block of DNA on a chromosome tend to stick together as they get inherited from one generation to the next.
And so through a series of publications, the last one being in 2010, a lot of information about SNPs, single nucleotide variants, and their haplotype structure was elucidated and shared with the biomedical research community. But right around 2010, or even a couple of years before, the new methods for sequencing DNA came about that allowed us to really accelerate the pace of discovering variants in the human population, and that gave rise to the signature 1000 Genomes Project, which was another audacious large international effort, like the HapMap Project and like the Human Genome Project, to catalog the most common variants across the world. By the way, a lot of times in genomics we overachieve, and so we originally named the project 1000 Genomes and quickly overachieved, so over 2,500 genomes were sequenced in the end, collected from 26 populations in the world. It was initially described in this marker paper, as they're called, in 2010, finally culminating in this remarkable paper coming out last October, the final major paper of the 1000 Genomes Project. What's remarkable about their effort is that, whereas once upon a time, when the genome project ended, we had information about maybe thousands, maybe tens of thousands of variants that we knew existed in the human population at specific points in the genome, the 1000 Genomes Project got us up to a much higher threshold. In fact, they got us to the point where there are about 90 million places in the human genome that we now know are variant across the human population, and we know the variants that sit at those particular sites. So we went from tens of thousands to nearly 90 million variant sites that we know exist, and that gave rich, rich, rich catalogs of information that could then be used by scientists to test which of those variants are important.
Once again, the last three slides I gave will be unpacked in much greater detail by the speaker who is here on April 20th, talking about population genomics and some of these efforts that I just quickly described to you. Now, the other interesting thing about the ability to sequence tens of thousands of human genomes is that it begins to give us insight into what any one of our genomes looks like. One of the things we're always curious about is, when we eventually get to the point of using genomics to take care of patients, what a typical patient's genome is like and what we can learn from it. We're starting to learn a lot more about this, and I thought these were fun numbers to put in the back of your head: what does your genome look like by the numbers? If you ever get your genome sequenced, you'll want to know some of these numbers. For example, you have roughly six billion nucleotides in your genome, because the reference sequence is three billion nucleotides but you have two genomes, one from mom and one from dad. So when we sequence a person's genome, we're getting information on six billion nucleotides. A typical person, when you sequence their six billion nucleotides, has on average about three to five million single nucleotide variants, and if you do the arithmetic that's about what we expected. So compared to the person sitting next to you, there are about three to five million differences between your two genome sequences. As I told you earlier, most of the variants you have, in fact the great, great, great majority of them, are common. We already know about them; they're in the databases. You can open up a browser, go to a data resource, and you will find that variant. But that's not all of them, because about 150,000 of those variants, a minority of your three to five million, are not in databases yet. So every single time we sequence a new human genome, we still come up with new variants;
those are the very rare variants, but they're still worth having, and we keep collecting them. What's very interesting is that when we sequence a given person's genome, on average we'll find about 60 such variants that did not exist in either parent. Of course, this is how new variants get created: in the process of creating the two germ cells that gave rise to you, all that DNA had to be replicated, and while there's a lot of DNA repair going on, there are some oopsies that happen, and each of you is associated with about 60 oopsies. Most of those, I'm sure the great majority, are completely inconsequential, but occasionally this is how you end up with a genetic disorder in a child that wasn't inherited from either parent, because it came about brand new. Again, though, the majority of these differences are completely inconsequential to your health and well-being. So that's just a little aside. A lot more is being learned about your repertoire of genes, how many of your genes are mutated or broken, and so forth, and we're learning a lot about that for the average patient. Well, having sequenced the first human genome, developed ways to sequence genomes cheaper and cheaper, and then gone out and actually sequenced many, many genomes, getting lots of knowledge about variation in the genome, we immediately wanted to know (and this was already happening in parallel): when you have a sequence difference, what does it do? How does it influence the viability, the development, or any other aspect of a creature, in this case a human? To do that, we really need to understand how the human genome actually functions, and I would tell you that in the 13 years following the end of the genome project there have been profound advances in understanding how the human genome actually functions. Let me just remind you that that was not what the genome project was supposed to do. This is what the genome project was supposed to
do, and they did it: they basically read out all these letters. Of course, this is only a subset, only .0001% of what the genome project did, and it's a complicated language in which it is very hard to immediately grasp where the important parts are. I will tell you that when the genome project ended 13 years ago, our tools for actually interpreting the 3 billion letters really weren't that great. They were quite nascent. They were not bad for genes, but for understanding the functional parts of the genome that are not genes they were actually quite weak, and we had a lot of work to do. But we did know a thing or two about genes. We knew that genes had introns and exons, we knew that DNA got made into RNA, and we knew that RNA could be alternatively spliced so that you can get different gene products as a result. And we were armed with that genetic code I told you about earlier, so at least when it came to the protein-coding genes, we could look up and figure out how they made protein. So we were in pretty good shape for genes, and we went to it. When I say we, I mean the community went through and quickly highlighted the parts of the human genome that looked like genes, acted like genes, and therefore probably were genes for the most part. At the end of the day, that only accounted for about 1.5% of the letters of the human genome. Although we're still working on the exact number, it's about 20,000 genes, much lower than we anticipated, but that's the number. What was interesting was that we were surprised by how little of the human genome actually coded for genes, only 1.5%, and we knew there was a lot more functional stuff in there. We knew there was an amazing amount of other choreography that had to be at play to determine where, when, and how much genes were going to get turned on, to help chromosomes function, all sorts of things, and we knew we had to find all that functional stuff outside of genes. And it was interesting because
at the end of the day, you know, we would have thought we'd have really brilliant people on the planet, brilliant scientists, to help us, but we actually looked back in time for guidance on how we were going to figure out all the functional sequence of the human genome. Probably one of the most inspirational figures and influences was not any of the people listed here, but someone who would be listed even before them, and that was Darwin. It was interesting how Darwin came to be commonly discussed immediately when the genome project ended, because there were so many things that Darwin taught us in his writings. One of the famous things that is at least attributed to him is that it is not the strongest of the species that survives, nor the most intelligent, but the one that's most adaptable to change. He hinted at the idea that something was going on through evolution, that something was being changed to have species survive and adapt and eventually deal with the evolutionary progression. Of course, we now know that's the DNA; that's where it's all at. And that's why a more contemporary scientist, a genomicist, wrote right around the time the genome project ended that for the last three and a half billion years evolution has been taking notes, and those notes are in the sequences of the genomes of all these creatures. I mention Darwin, and I mention that quote, because what happened when the genome project ended was a recognition that we needed to do lots of comparisons of our genome sequence with those of other creatures to better understand how our genome sequence works. We also recognized that we, as a species, were just one really teeny little twig on a tree of life of great richness, and that buried in those notebooks, in the DNA sequences of these other creatures, was lots of important information. So that's the reason, and many of you probably recognize that's the reason, why we went off and started sequencing lots
of critters and their genomes. First we started with laboratory models like mice and rats, companion animals like dogs, and our closest relatives like chimps. But we came to understand the power of comparative sequence analysis and of sampling more broadly across the tree, which is why we started sequencing lots of other critters, selectively and strategically picked from across the tree. A subset is here: originally it was like 25 species, then it went to 100 species, and now well over 200 species have been sequenced. We used all that rich data to start asking questions like: what sequences in the human genome are conserved across all vertebrates, or across all primates? Because if they're conserved that heavily, if they don't change over that many years of evolution, they must be important; evolution just has a way of going in there and changing things if they're not important. So that was the rationale for moving beyond the human sequence and sequencing many, many, many species. What that gave us was remarkable insights about where in the human genome the most conserved sequences are, pointing to the sequences most likely to be functionally important. In doing that, you end up with about 5 to 10% of the human genome sequence, 5 to 10% of the 3 billion letters, being conserved across almost all mammals, and those sequences are almost certainly functionally important. But of that 5 to 10%, the protein-coding genes, at 1.5% of the genome, are a minority. So at a minimum 5 to 10% must be functionally important, and only 1.5% is protein-coding genes, which means the purple stuff is non-coding functional sequences, in many cases conserved as impressively throughout evolution as our protein-coding genes have been. Now, what are these non-coding functional sequences doing? 
Well, they're doing a lot. Probably the thing we know the most about is that they're incredibly important in this complex choreography of gene regulation: all these elements, all these factors and promoters and silencers and so on. These are all the sequence elements controlling the crazy things going on with where, when, and how much genes are turned on. So that's some of it, but we also know that it's not all gene regulation. There are important sequences that help package up chromosomes, important sequences that help segregate chromosomes, and important sequences that help replicate chromosomes. We also know for certain that there's a lot of complexity in RNA, and we have really started to reveal a remarkable amount of function associated with non-coding RNA, something we didn't even know about when the genome project began and which is now known to be instrumentally important in many biological functions, including gene regulation. Finally, I would contend we should just recognize that there are things in non-coding parts of the genome that are certainly functionally important that we just don't know about. We just haven't found them and nobody's written about them in textbooks, but they're coming, and we're going to figure this out; in fact, that's a very high priority area. Oh, and the other thing that's transpired in the last 13 years is a greater and greater appreciation for yet another way that DNA functions: not by having the primary sequence directly confer function, but by having marks put down on our DNA that influence how DNA functions. These are called epigenomic marks, leading to the whole world of epigenomics, and this involves chromatin and methylation and various modifications to DNA. It just turns out that the same methods you can use for sequencing DNA can be adapted to read out the epigenomic marks on DNA, so now we have this incredibly strong ability to read out the second genomic code, if you will. In fact, I'm
going to turn this whole topic over to Laura Elnitski, who's going to come on March 16th and tell you much more about epigenomics and also about gene regulation, the topic I was just talking about. So a lot has happened in this arena, but boy, we also know a lot more needs to happen. It actually keeps getting more complicated, because I would say in the last five years we've also realized that DNA is not just some innocent little linear molecule that lies in the nucleus; rather, DNA, in the form of chromosomes, also takes on three-dimensional structures, and these three-dimensional structures also have functional activities going on, with different domains interacting. So the whole world of genomes in three dimensions is unfolding, again because of technological advances that let us figure out what those interactions are, and that's also a very exciting area of active research. So how do we do this? I mean, how do we figure this out? What have we been doing to elucidate genome function? 
Well, it's not simple. I will tell you it will involve several generations of scientists to help us fully elucidate the function of the human genome, and it also involves a number of different elements, if you will. To start off, we recognized that, like other efforts in genomics (the Human Genome Project, 1000 Genomes, and so forth), we needed a team of people working on this to figure it out. That is why, almost immediately after the genome project ended, we launched a program called the ENCODE project. ENCODE stands for Encyclopedia of DNA Elements, and it aims to catalog all the functional elements in the human genome, like those color highlights I showed earlier. It's been going quite effectively, and we're about to start the next phase of ENCODE over the next year or so. It also had a sibling for a while called modENCODE, for model organism ENCODE. What these efforts, ENCODE and modENCODE, basically do is create GPS-like views of the genome. Here you're looking at one such view; this is just a view of the human genome that you can get to on a publicly accessible website, where you can zoom in and out and look at it. This is just an overwhelming amount of data generated by ENCODE; some of it's laboratory-based data, some of it's experimental data. I'm not going to go through it, but needless to say it has everything we could possibly imagine having at the moment about that region of the genome: where the genes are, where the conserved sequences are, where transcription factors are binding, what gets made into RNA, where the chromatin is opening up, and so forth. That data really provides some insights about where the functional stuff is. The challenge, of course, is synthesizing all this and really getting a very strict interpretation, for every nucleotide, of what the function is. But this is the kind of thing ENCODE has been doing, along with other efforts. There's a big epigenomics effort that went on similarly, which has contributed to this as well and will
continue to do so. This has involved not just looking at human DNA but recognizing that model organisms play a very important role; it's a very basic science effort to understand genome function that traverses everything from yeast all the way to humans. I'll tell you that increasingly computational modeling comes in. This is not just about doing experiments, and more and more sophisticated computational modeling methods are needed to fully elucidate how the human genome works. And as I alluded to earlier, there have been, and will continue to need to be, major advances in technology development to really figure out all the nuances of the human genome. I will tell you that John Quackenbush will be here April 27th talking about many aspects of gene expression and systems biology, and I'm sure he will touch on some of the things that I'm representing in this part, and in particular on this slide, so I thought I would point that out. So where are we in interpreting the human genome? Well, I guess I could say a couple of things. I sometimes say that it's multi-generational, and this is the first generation, I would imagine, that will be helping to interpret the human genome sequence. I sometimes say that when the genome project ended we knew about this much about how the human genome works, now we know about this much, and eventually we need to know about this much. So we still have a long way to go, but we've made a lot of progress. Sometimes I joke and say we're sort of at a SparkNotes view of the human genome. For those who know what SparkNotes is, maybe it'll get you ready for the test tomorrow, but this is really a hard problem and we're going to need more than just SparkNotes. That's why we will need to keep remembering that this is going to be a long-haul part of the marathon, figuring out how the human genome works. Well, here we are: we have the sequence, we have our blueprint, we can easily and cheaply get at other
sequences, we're beginning to understand how our sequences differ among people, and we are starting to get better and better insights about the function of the genome. It is time to start thinking about what we originally thought was going to be really valuable about genomics, and that is to start applying our efforts to understanding human disease. I would contend there have been significant advances, especially in the last 13 years, in unraveling the genomic basis of human disease, and this certainly deserves to be the fifth highlight. Now, I will also point out, and I'm going to mention this programmatically, that I think our institute has been very helpful at figuring out how to use genome sequencing to study human disease. That really grew out of our commitment to having our research efforts, especially our extramural ones, keep us at the cutting edge of genome analysis. The largest part of our institute's extramural portfolio is our genome sequencing program, which has had a progression. It started with the genome project: these were the groups in the United States, particularly the NIH-funded ones, that were heavily involved in generating the first sequence of the human genome. They then moved on to help us figure out the sequences of these other genomes I alluded to, then worked on things like the HapMap Project and the 1000 Genomes Project that I've mentioned, and then started to focus on disease. Initially that meant working on cancer, which I'm going to have more to say about in a bit, through a very well-known project called The Cancer Genome Atlas, which we did in partnership with the Cancer Institute and which is now in its last phase. But of late in particular, and now moving forward, the focus is on rare and common diseases, simply asking the question: can you scale up the use of genome sequencing to figure out the genomic basis of rare and common diseases? So let me just remind you, because this is really important to understand, of the differences
between rare and common diseases when it comes to the underlying genomic architecture. On the one hand we have rare diseases. These are diseases like sickle cell disease, cystic fibrosis, and Huntington's disease: rare in the population, but it turns out they're quite simple. 'Simple' is a relative term, but simple in the sense of involving mutations affecting a single gene. These are monogenic disorders, also referred to as Mendelian disorders after the famous geneticist Mendel. Here it is very clear there's great potential. There are over 7,000 known rare diseases, and remarkably, we recognize that with the new sequencing methods we can really accelerate the pace at which we figure out the genomic basis of rare diseases. For a number of years we've had, and we just renewed, a program called the Centers for Mendelian Genomics, a series of centers that is doing exactly that: tackling these rare disorders in an industrialized fashion to try to figure out the underlying mutations in these genes. They've made great progress, along with other worldwide efforts, but you have to recognize there's more progress to be made, which is why we have these centers working so hard. There are about 7,500 rare genetic diseases, and we have found the gene, or the defective mutations underlying the disease, for about 4,300 of them. That is a remarkable number considering that the day the genome project started, we only knew the genomic basis for 61 of those diseases. So in a quarter century we've gone from knowledge about 61 diseases to molecular knowledge about 4,300. That's the good news. The challenging news is that we want to finish this up; we want to get the next few thousand, and that's what this program is doing. So that's been a remarkable advance, I think, in the last quarter century with respect to rare diseases. Now what about the other class of diseases, common diseases? Common diseases are what you're much more familiar with. You know, rare diseases are
rare in the population. They're certainly quite burdensome to families and patients, but in aggregate they're not what accounts for most hospital and clinic visits. It's common diseases that fill hospitals and clinics: hypertension, diabetes, autism, Alzheimer's, cardiovascular disease, and so forth. They're common in the population, and the hard part is that they're complicated. They're complicated because it's not just a single gene; in fact, it's not even solely genes. We actually believe a lot of it is non-coding functional sequences. We believe it's multiple, multiple variants scattered throughout the genome, with what is typically a greater contribution from the environment, especially compared to rare diseases. So we knew this was going to be really complicated, and indeed it has proven to be pretty complicated. This, of course, is the reason we wanted to get all these variants cataloged: so that we could do studies analyzing these variants in thousands of people, do this across the whole genome, genome-wide, and try to associate known common variants with diseases like hypertension, diabetes, and so forth. That's what gave rise to the idea of the genome-wide association study, or GWAS. GWAS was an idea that on paper everybody hoped would work, and it did. The first published genome-wide association study was in 2008, and since 2008 there have been over 2,000 such studies. In fact, there are now about 4,000 places in the genome that we believe have a variant in them that confers risk for getting a disease. The problem is that it's just a risk; it's not an absolute. With rare diseases the mutations are usually absolutes; these are not. And it's not telling you which variant it is. These GWAS studies only tell you what neighborhood the variants live in; they only tell you a region of a chromosome, and there are still lots and lots of variants there. So the good news is that
we've gone from not knowing how to decipher this complexity to having some really good ideas of where we need to hunt in greater detail. But there have been only a very few examples where we've actually gotten down to a very specific variant and know what that variant is doing, and we need to do that thousands and thousands of times for all these very important disorders. What we've learned, in the last five years in particular, is that we need bigger and bigger efforts. That's why we have now turned the attention of our largest centers to a new program, which we just formally announced a few weeks ago, called the Centers for Common Disease Genomics. These are centers that, we hope, along with other efforts in the world, are going to teach us how to do this. We now know that we need to completely sequence the genomes of lots of people with a disease like hypertension, cardiovascular disease, autism, Alzheimer's disease, or diabetes, and then lots of controls, and you probably need to do tens of thousands. We then have very large data sets to analyze, using very fancy statistical methods to tease out which of the variants are actually the ones conferring risk. So while we don't have as much to report yet with common diseases as we do with rare diseases, I would just say we are on a trajectory that I think is going to give us a lot of insight about how, strategically, we're going to do this, and hopefully we will get a lot of new insights over the next five to ten years. But I will tell you that all of these efforts, especially efforts like this where literally tens of thousands of individuals' genomes are going to be sequenced and analyzed, produce just an immense amount of data, and that's on top of all the other data from all the other projects I've been telling you about. And oh, by the way, I forgot to mention before I go on: as with everything else, a
lot more will be said about rare diseases when Dave Valle is here on April 13th, and a lot more will be said about common diseases, in a much more sophisticated way than I did, by Karen Mohlke when she's here on April 6th. But what all of them will tell you, and in fact what all of you probably realize, is that in the world we now live in, as these wonderful technologies get us more and more data, we are overwhelmed by the amount of data coming out of the sequencing instruments in particular, and out of other new technologies. This is putting us into a new circumstance where the bottleneck is not generating the information; the bottleneck is analyzing it. These fancy new methods for sequencing DNA mean that reading out a genome sequence is not the hard part. The hard part is progressing on and figuring out what to do with the information about the variants in our genomes. As you'll hear from other speakers, there are issues just around hardware and having enough capacity to store all this data, we need increasingly better software tools, and of course we need a workforce that's able to do all of this, which is why some of you are here: to help become that workforce. That also has to include taking it to the final stage of knowing how to take information about individual variants and knowing how relevant it is for individual patients. This is why Tara and Andy, in particular, dedicate three lectures to the data analysis, the data science, the bioinformatics, whatever phrase you want to use, of everything I'm describing, because data analysis is actually the big part of all of this and in many ways is the grand challenge that all of us are working on. So I've gone through five highlights of the past quarter century, and of course you'll recognize I've said nothing about
medical practice, as if maybe there weren't any highlights. Probably a few years ago I would have been very limited in what I might have been able to say, but I actually think that this is worth putting as a sixth highlight of the first quarter century of genomics, because I really do believe there are vivid examples of genomic medicine that are just starting to emerge. It is the tip of the iceberg, and there will be a lot more, but I think it is worth highlighting what some of these are, because I really believe that we are seeing genomic medicine come into focus in a fashion that is more exciting than when I spoke in the series a couple of years ago. So I thought I would just quickly go through these highlights, because Bruce Korf will talk about them in much greater detail when he is here talking holistically about genomic medicine. But I thought you might want to hear, especially from my perspective as director of the genome institute, what I think the hot areas in genomics are and what some of the programs are that we are running to facilitate advances in those hot areas. So, again going through highlights, I'm going to highlight five areas that I think are the hottest ones in genomic medicine. 
I'll start with cancer, because cancer is the hottest area in genomic medicine implementation. I probably don't need to tell a sophisticated audience like this that cancer is a disease of the genome. What happens in cancer is that mutations get picked up by normal cells and eventually make those cells grow out of control to become tumors. But those mutations are sitting in the genomes of these tumors, and those genomes can be sequenced just like a normal cell's genome can be sequenced. With better and better methods for sequencing DNA, we can open up a tumor's blueprint, its genome, read it out, and begin to catalog all the sequence changes, and efforts like The Cancer Genome Atlas that I mentioned did exactly that. What that can also do is start to give better and better information to diagnose cancer and perhaps to think about better ways to treat cancer, and there is a whole range of incredible areas there that I couldn't possibly represent. I will say that, in terms of actually changing the practice of medicine, since I'm trained as a pathologist I do recognize that the diagnostic potential is here. As one example of many, for many, many decades most cancer diagnostics involved histopathology as the major tool, and that will continue. But I am already seeing, for some kinds of cancer, that histopathology is augmented by genomic signatures, genomic profiles if you will, of tumors that come out of machines like this and other machines. And it's here and now. This is not hypothetical; it is absolutely here and now for some types of cancer. If you don't believe me, just watch television, go to websites, and so forth. You will find a website such as that shown on the top, and you may have seen advertisements (I keep seeing them all the time, and they're increasing on television) from prominent cancer treatment facilities talking about genomic this, genomic that, talking about the DNA of cancer care, and so forth. This is
mainstream. It's used for marketing because genomics is absolutely here to stay with respect to cancer diagnostics and cancer treatment; it is the lowest-hanging fruit. I think another low-hanging fruit is the world of pharmacogenomics, pharmacology meeting genomics, recognizing that there is a reason why all of us respond differently. We respond differently to everything. I'm the guy on the left, okay? That's me, and my children are like this, right? Actually, maybe it's a bad example, because that might imply genetics, and that's not what's in play here, so maybe that's not a good example. But we all respond differently to everything. What I'm really getting at is not roller coasters; what I'm really getting at is medications. Every medication in the pharmacy in this hospital, at CVS, at Walgreens, every one of those medications works; they just don't work in everyone. In fact, they quite often don't work. Nature had an article about this recently, which I thought was very interesting, talking about imprecise medicine. Here are some very commonly prescribed medicines; the person in blue in each case is the person for whom the medicine works, and the people in red show, proportionally, the number of people for whom the medicine doesn't work. Well, there are a lot of reasons why medicines don't work, but a good part of it is the different ways that we metabolize drugs or how they affect us physiologically, much of which is due to variations in our genomes. So the idea underlying all this, and we are learning more and more about it, is that we can take individuals with the same diagnosis, do genomic profiling of them, get genomic information on them, and figure out who has variants that are going to make them a good responder, a not-so-good responder, or even a bad responder, and do that before deciding what medication to give a person, what the dosage might be, and so forth. So pharmacogenomics is here and now, is widely recognized, and will increase substantially
over the next decade, which is why Andy and Tara have Howard McLeod, a regular in this series, coming here to talk exclusively about pharmacogenomics on May 7th. Third highlight, here and now, and actually this building is a great place to talk about it: the use of genomics for genetic disease diagnostics. Think of disease striking from nowhere, individuals with conditions where nobody can seem to figure out what's wrong with them, often for a long time, and in many cases these individuals have had major amounts of resources spent trying to diagnose them. The idea of just sequencing the genome as perhaps giving a clue just makes a whole lot of sense. So, as Nature pointed out in this article, disorders not readily explained by standard tests can sometimes be diagnosed by genome sequencing analysis. And 'sometimes' is about a quarter to a third of the time: by today's methods, we will find out what the diagnosis is by sequencing a genome. The notion of undiagnosed diseases, undiagnosed conditions, really has come to the fore, and a lot of the credit goes to activities taking place right here in this building. You know, patients on these long diagnostic odysseys, going from doctor to doctor, medical center to medical center, and nobody can figure out what's wrong with them. Shown here is Bill Gahl, clinical director at our institute but also the leader of something called the Undiagnosed Diseases Program, which started right here in the NIH Clinical Center and really reduced to practice the idea of bringing these patients in and giving them a rigorous clinical evaluation in addition to a genomic analysis, to try to see if you can get a diagnosis. Not that that yields a diagnosis every time, but it does quite frequently, and it has been a remarkably successful program. It is here and now, it is mainstream, and in fact it has just been expanded. Recognizing its success, it is now a nationalized program. NIH has its pivotal role right here, but we now have established, through
a Common Fund program called the Undiagnosed Diseases Network, a series of other sites in the country that are doing similar work as well. Here and now: genome analysis as part of diagnostics for rare, often undiagnosed conditions. Also here and now in one case, and in another case certainly being contemplated, is what I package under something called the genomics of pregnancy. It's actually two stories in one, but genomics plays a big role now in pregnancy. There's one that's here and now and one that I think is important to think about. Let me tell you about each of these. The here and now is this: we've been doing prenatal testing for many decades, actually, but we had to access fetal DNA, and the way you access fetal DNA traditionally is through invasive procedures like amniocentesis and chorionic villus sampling. But the new methods for sequencing DNA are so exquisitely sensitive that the community has now figured out how to access the DNA that gets shed into the maternal bloodstream from the fetus, and that cell-free DNA can be accessed by a simple blood draw, a relatively non-invasive procedure. And so the idea of doing non-invasive prenatal genome sequencing and genome analysis is here and now. There's lots of literature about this. In fact, it has been winning lots of prizes over the past few years because it is changing the face of prenatal diagnostics, and it has been written about in the popular press as well. Recent data that came out are, I think, just breathtaking. You think, oh, okay, there are a few people doing this, they get this blood draw. No, no, not a few people: millions of people worldwide. In an article last year, you can see the rise of non-invasive methods involving genome analysis by simple blood draws and cell-free DNA of the fetus. You can see that in aggregate it's well over a million; it was expected to be well over a million in 2015. I haven't seen the numbers yet. Remarkable. Here and now, it is actually reducing amniocentesis and chorionic villus sampling substantially
as a result. Here and now: prenatal testing using genome sequencing. Not so here and now is the other end of pregnancy. You get a baby, you have a newborn, and Time magazine thinks that by 2025 everyone's going to get their DNA mapped. I think they meant DNA sequenced. But I'm not so sure, and I think we want to think about that. It's an interesting notion, but it brings in a lot of logistical challenges and certainly a lot of ethical ones as well. Do we want every child sequenced, to have their genomic information carried with them, maybe in their electronic life? Not clear. So we actually are studying this. We set up a program with the child health institute to study this, to get a research foundation to think about these things. We have a series of sites that are now sequencing the genomes of newborns and asking how it changes their care, and we'll learn a lot over the next five years. I will highlight one of these studies in particular. One of these sites is dealing with newborns who are not healthy, and here Nature wrote about what that investigator is doing, which I think is remarkable. Stephen Kingsmore has basically taken acutely ill newborns in the NICU, where the doctors have basically given up, have no idea what's wrong with the child, know the child will die within a matter of days, and simply don't know what's wrong with them, and he has reduced to practice the idea of getting a small amount of blood, sequencing their genomes in less than a day, and getting information that in some cases, not always, but in some cases, gives insights about what's wrong, in some cases saving those newborns. And in fact I think this is also going to become more commonplace for acutely ill children in the NICU: where they simply don't know what's wrong with that child, quickly try to get a genome sequence. So, another here and now. The last hot area is hot not because it's solved but because it's really important, and it relates
to the development of information systems that connect our knowledge of variants to their clinical relevance. I really want to emphasize that generating a human genome sequence today is almost trivial. It's just not hard to read out the six billion G's, A's, T's, and C's of a given patient. What is really hard is then taking those six billion letters, rounding on the patient the next morning, and having any clue what those variants mean. I mean, we just don't know. For the vast, vast, vast majority of those variants, we have no idea if they're clinically relevant or not. We need to fix that. There is a disconnect even between what we know in the scientific literature and what a busy healthcare professional is actually doing in trying to manage the care of patients. So we believe that we need to create much more robust clinical genomics information systems, and that's why it's hot: we need this acutely. Probably ones that integrate nicely with electronic health records, and also ones that deliver very clear guidelines to healthcare professionals in a very simple way, because they're going to be looking this up on these kinds of devices in the busy workflow of a nurse, of a pharmacist, of a physician's assistant, of a physician, and that needs to be done in a very robust fashion. The truth, though, is that it doesn't exist. So we've put together a research network called the Clinical Genome Resource, or ClinGen, which you can read about in this paper and on this website, which is basically trying to figure out scientifically how you build a knowledge base that could then be used by busy healthcare professionals. And so we're at these stages. We're not even building it yet. We're just trying to figure out how you take this explosion of literature about variants, which ones are medically important, which ones you act on, and which ones you don't, and reduce it to something that could be looked up quickly so that you can have patient management
be done efficiently with knowledge of that genomic information. So that's something to look for. We have a long way to go, but we're trying to facilitate it. So I want to transition before I spend the last 15 minutes talking about this, and just point out that I've just described to you sort of a romp, if you will, through a progression over the last quarter century that started with scientists whose audacious goal was just getting the sequence of the human genome, but who are now actually thinking about how we are going to use this information for clinical management, like I described in the last slide. It actually is more complicated than that, because what we have found is that a community of scientists who were mostly basic scientists, thinking about genome structure and function and evolution, are now confronting the ecosystem of medicine. And the genomic medicine ecosystem is turning out to be really complicated, because anytime you go to change medical practice, you start touching lots of things that are really complicated, and that ecosystem is not healthy unless you think about all of those things. So what do I mean by this? Well, just think about healthcare delivery and reimbursements and all the aspects of changing clinical practice. It's like this world; it's really complicated. Parts of our institute are dealing with this. I'm not going to talk about it. I'm just going to point out to you that it's now becoming much bigger than we ever thought it was when we were thinking about just the genome. You know, there are other aspects of this related to education and genomic literacy. The language of genomics will be spoken as part of medical care, so we need a literate public, literate patients, going in. You can see here starting to train the next generation in what this is all about. We also have a whole profession out there of physicians and physician assistants and pharmacists and nurses. They all need to be literate in genomics, and how do
you do that when they're at mid-career, not just when they're being trained? We're thinking about that, and these have huge complexities. Oh, and of course, we never thought about this stuff when the genome project began, but there's a lot of regulatory oversight associated with any aspect of practicing medicine, and where genomics meets that regulatory oversight, there are a lot of stories and sub-stories that we're dealing with. So I'm just mentioning these, not because I'm going to talk about them, but just to point out that it really is complicated, and sometimes it's almost daunting and overwhelming. So, you know, as we draw nice graphics that get us over to actually changing the practice of medicine, there are new surprises along the way and new mountains that we have to climb. And I will tell you, at times it can get very exhausting, because all of a sudden we're dealing with the complexities of healthcare delivery on top of everything else, and you can get rather pessimistic at times. But as I transition to my last topic, I do just want to point out that I like this quote, because sometimes when I show this graphic, people say, oh my gosh, you're never going to do this. And I just think that a pessimist would have that view, because a pessimist sees the difficulty in every opportunity. I think our community of genomicists, you know, we see this progression and we see the ecosystem, and yes, it is daunting, but we don't think as pessimists. We see this as a great opportunity, and I think this is why we keep pushing forward to see this progression become a reality, even when it really gets complicated. So let me transition now. I'm going to spend the last 15 minutes on one last topic, because it will talk a little bit about the future, and the future is a reflection, I think, in many ways, of the change we have seen in genomics, not about the science that I have described to you for the last little over an
hour, but even just the relevance of genomics. It touches on even the ecosystem aspect I was just talking about, because when genomics started, you know, 25 or 30 years ago, it was a discipline just involving biomedical researchers. You know, I was one of those geeks just worrying about mapping and sequencing DNA. I think there was a pivotal transition when the genome project ended, as we recruited healthcare professionals to start to work with us to think about how genomics might change medicine, and we're starting to see the fruits of that. The hot areas are real because health professionals are getting involved in thinking about how to use genomics. But I think the real change right now is that patients are becoming relevant in this conversation, and therefore friends and relatives of patients, because genomics is becoming part of the language of cancer care, of pharmacogenomics, of rare diseases, and so forth. You see it all the time in the news and advertisements and newspapers and so forth. This is part of society now, and there are a lot of issues that become relevant. I touched on some of them, like education, but there are a lot of issues even around public health. And so we think a lot about the societal implications and societal complications of genomics, and think about some of the public health aspects of it, which is why Colleen McBride was invited to come here on March 23rd. She's going to talk more about some of these things. But because this is becoming so relevant for all of us, and seeing the great potential for this, I thought I would just use the last minutes to update you about some breaking news. It's not that breaking anymore, but there's a lot happening here. It was particularly breaking when it started in June of 2014, because it involved this guy. I hope all of you know who this man is. Besides being President of the United States, this guy really likes science, and he really likes genomics, I'm proud to say. And in June of 2014 he started some
conversations, actually with Francis Collins, our director, around the idea of maybe launching a big project near the tail end of his presidency that might involve this vision of genomics, genomic medicine, individualizing patient care, and so forth, because he thought it really could have great impact on the future, and he wanted to see what he could do in the last phases of his presidency. Those conversations evolved in the summer of 2014, eventually getting framed around the concept of precision medicine. Precision medicine really sort of goes a step beyond genomic medicine. Genomic medicine would just be the G's, A's, T's, and C's shown here. Precision medicine is being more precise by starting to account for things like environmental exposure, lifestyle, diet, and things like that, other aspects of our lives that we might be able to use as information to be more precise in medical care. It is just a broader context for individualizing medical care to advance human health. What the President saw as a great potential, and the people he spoke to began to gain a great appreciation for, is the idea that today we really do most of our medical care based on the expected results for the average patient. Almost everything we do is based on the average patient. But the world is changing, and there could be a tomorrow where we could be more precise, if we would only account for individual genomic differences, environmental differences, and lifestyle differences, and have that as a way to be more precise in preventing and treating diseases. The President really just wanted to know: how can you get from today to tomorrow? And so a series of strategic planning efforts went on, actually quite small but important, with smaller numbers of people because the President wanted this quiet, during the summer of 2014, leading to plans that emerged here at NIH and in other parts of the federal government. In the fall of 2014 the President was presented this plan. This is actually a picture from that
meeting. You can see Francis here, and other very important people I won't go through. This was a meeting that took place in October of 2014 where this plan was discussed, and then the President got fully behind it. He announced it in the State of the Union address in early 2015 and then formally announced it here, shown in these pictures from the East Room of the White House. These are photographs I got to take, sitting center and fairly close to the podium, where the President announced in January 2015 the launch of this thing called the Precision Medicine Initiative. At the exact time that the President announced this in January of 2015, the New England Journal published this paper. It's actually the only scientific paper officially coming out describing the general framing of the Precision Medicine Initiative, by Francis Collins and then-NCI Director Harold Varmus. If you haven't read this, I encourage you to read it. Harold was an author in part because there's going to be a major cancer genomics part of the Precision Medicine Initiative, along the lines of what I said. I'm not going to describe that now. I just thought I would briefly describe the other major element, which you'll be hearing a lot about in the coming months, and actually hopefully the coming years and even decades, and that relates to the Precision Medicine Initiative's launching of a U.S. national research cohort. The idea is to recruit and enlist at least a million, hopefully more, U.S.
volunteers who will agree to participate in this hopefully multi-decade project and program. Participants are going to share genomic data, lifestyle information, and biological samples, and all of this will be linked to their electronic health records. This is a big kind of project that aims not only to do incredible studies but also to forge new models for how science is done. The idea is to do this in a fashion that fully engages the participants, shares the data very openly, in a very genomics-like way of data sharing, and also, of course, makes sure that all their privacy is adequately protected. The notion of having such a big program, spanning many years and studying lots and lots of people, actually isn't brand new. None other than Francis Collins, when he had the job I currently have about a decade ago, actually called for this in a commentary that he wrote in 2004. So you may wonder why we brought back to the President, a decade later, an idea that was a decade old and never went anywhere. It turned out that when this got proposed in 2004, it just didn't get any traction, in part because it was a little too early and a little too expensive. But the world had changed in the intervening decade, which is why a new version of this was brought back to the President in 2014 and why it got traction then. Let me just briefly tell you why the world has changed in the last ten years, and it very much overlaps with some of the things I've described earlier. Compared to a decade ago, genomics has changed. I mean, think about all the things I've described: the cost of sequencing genomes, understanding the genome, understanding the variants, and so forth. So genomics has seen breathtaking changes in the last decade, and that has been very influential for the launch of the Precision Medicine Initiative. But there are other areas. Electronic health records: electronic health records are critical for what's going to be done, to capture this information electronically, but a decade ago it
was only about 20% of healthcare professionals and settings that had electronic health records. That's why you couldn't do this then: the infrastructure didn't exist in 80% of places to collect the information. Now that figure is over 95% of healthcare sites in the US. And meanwhile we have done a lot to learn how to marry genomic information with electronic record information and other information feeding electronic health records. We have a program that's been going on since 2007 called eMERGE, for Electronic Medical Records and Genomics, that has really taught us a tremendous amount about how we might be able to capitalize on genomic information and other information in electronic medical records, and that has served in some ways as a pilot for what's being envisioned for the Precision Medicine Initiative. But meanwhile, what the President, and everyone in the Congress, by the way, who is now funding this, also got very enthusiastic about was a recognition that it goes even beyond genomics, into other technologies and better and better ways of measuring physiology and environmental exposures and lifestyle. And I'll show you a paper just from last year: you know, all these new sensors, these mHealth devices that measure all sorts of things about our physiology, our various analytes, our cardiovascular system, and so forth. This is all sort of just at early stages. Many people wear Fitbits. Those are recreational devices; they're awesome. Much more robust technologies are coming. One could imagine harnessing the power of those technologies, having these individuals wear these devices, collect the data, and have that data streaming in to essential data resources for scientists to analyze. Oh, and by the way, it'll stream in through their smartphones. A decade ago, only about a million smartphones existed, and only 2% of Americans carried a smartphone, but now over 60% of Americans carry a smartphone. So one could imagine having immense
amounts of genomic data, electronic health record data, and mobile health data all streaming in from a million or more people, and that will create an amazingly rich data resource. But it also means we have a lot of data analysis to do, and data science will become a prominent feature of this. But you know what? As Tara and Andy are going to continue to tell you in subsequent lectures, we're in a new world here. Data science is front and center in biomedical research. We've gone up 160-fold in the last decade, and we're not totally ready, but we're going to have to be ready, because this is what we're going to be doing as biomedical researchers. But the last element, which I do think will be interesting and important to watch and will make this different, is the notion of how we will engage the individuals who will be part of the Precision Medicine Initiative, and the cohort in particular. These people will not be subjects. They're not going to be patients. They're going to be partners. And the reason they're going to be partners is that studies have shown, and continue to show, that most Americans actually want to participate in biomedical research, but they only want to participate if they are engaged from the beginning, if they know what's happening to their data, if they get to opt in and out of things along the way, and if they're treated as partners. And from the very beginning of the planning, the participants are being featured as partners in the scientific enterprise. I think the whole social media, Facebook era is very important here, in how we will engage them through social media, through smartphones, and so forth. It will be a new way of doing science. I can tell you there are a lot of cohorts that have been created in the United States; this one's going to be unlike any of the ones that have preceded it.
So there's a lot associated with this. If you want to read the current blueprint for the Precision Medicine Initiative cohort, there's a working group report that you can read, and the URL for you is here. It's a very convenient URL to keep in mind. This is the landing page for the initiative, which hopefully will exist for the next 10, 20, or 30 years, because a lot of information about the initiative is being put on the landing page for NIH's effort in the Precision Medicine Initiative to make this a very transparent process. So I wanted to end by sort of tying it all together, because it's actually very interesting. The Precision Medicine Initiative gives me a very interesting sense of deja vu, because there are a lot of things where we don't really know how it's going to play out. It is yet another audacious effort, and there are so many uncertainties associated with it. And it happens to be happening 25 years or so after the launch of the Human Genome Project. I was there the day the genome project started, and if you had asked me, well, how exactly are you going to map and sequence the human genome, I would have said: I have no idea, but it is a compelling goal, and I think we can do it, and we'll figure it out as we go. And that's exactly what we did in getting the Human Genome Project completed. We set audacious goals, and we were willing to change course as needed. And I'm telling you, it feels exactly the same now, 25 years later, with the Precision Medicine Initiative. We're launching it. It's audacious. It has these goals. We've never done anything like this before. And there are so many details where, if you ask me or the people who are going to be organizing it on the front line, they'll say, well, we haven't really figured that one out yet. But that's okay, because they'll figure it out as they go. I think the comparison is apt, and the Precision Medicine Initiative will go on longer, actually, than the genome project, I
predict. But it's that same audacious willingness to change course midway that I think is absolutely required in order for it to be successful. And so I gave you that bit at the end because I think it's very exciting to watch, and maybe some of you will participate in this, either as part of the cohort or as researchers analyzing the data, and I hope that you do. There'll be a lot of news coming out in the coming weeks and months about it as we stand this initiative up in the next year or two. Lastly, if the topics and programs I talked about are of interest to you, I will shamelessly plug this, because it's free. I put out a weekly (not weekly, that would kill me), a monthly newsletter that the institute staff help me put together, which highlights things along the lines of what I described here. If you want to get that, feel free to follow the link shown here, and you can subscribe to it and get a monthly newsletter from me. So I realize I am coming up exactly on the time that was allocated to me, so I will end there. I would encourage anybody who needs to leave to do so, and maybe people who have questions can just come down and see me at the podium so that we can finish the official session. Thank you.