Good morning. I'm speaking on behalf of my colleagues at the Genome Institute, Rick Wilson and Elaine Mardis, and my colleagues in our Genomics of AML PPG, but I will also give you a brief update at the end, a kind of scorecard of the AML project at TCGA. So with AML, not advancing. Okay. A few brief words about this disease and its genomics. We don't know much about it in terms of initiating mutations, except for patients who have canonical mutations that I'll show you in just a minute. For most patients with this disease, when we started this project, very, very little was known about initiating mutations. One very nice feature of the disease is that the oncologists control the samples. The tumor tissue is very easy to access, and to access repeatedly, and most of the samples are relatively free of contaminating normal cells without any additional purification. Another feature that we loved as we started sequencing whole genomes was the fact that many of these genomes are diploid. And also, riffing on lessons of the past, low-resolution genomic screening, that is, cytogenetics, has been used for 30 years to classify these patients and to make treatment decisions. All of us as clinicians who take care of these patients use this idea. It's a very important idea: favorable-risk cases, who have these canonical mutations, at least three of them, can be treated lightly up front, and they're going to do relatively well. By that, I mean five-year survival is about 50 percent. Some patients with complex cytogenetics have adverse risk. Those patients need to be transplanted in first remission or they will die. And unfortunately, about two-thirds of our cases have intermediate risk. These are the patients we don't know what to do with. We don't necessarily need new drugs for these cases, but we need to know what to do with them. So we need better classification markers, biomarkers if you will, to separate these people into good and poor risk.
In the first AML genome we sequenced, we found the major classifier of intermediate risk, which is mutations in a gene called DNMT3A. I'm not going to talk about that, because the data have now been validated by many, many centers; it is the most important single-gene classifier of intermediate risk, and it tells us, I think, who to transplant. But I will tell you about some of the conundrums we faced as we sequenced these genomes. In the first couple of genomes that were done, we encountered what I will call the founding conundrum. That is, there are hundreds of mutations per AML genome, and because we validated all of them with deep digital sequencing, we found out that all the mutations are in all the cells. That's a problem. It suggests either that they all arose simultaneously or, if they arose because they're all needed by the tumor, that you would have to have hundreds of relevant mutations per case. Both of those seem impossible. So how does it happen? This is the model that we prefer; it's experimentally tractable, and I think it's now been proven. The idea is that the hematopoietic stem cell, the cell of origin for this tumor, is a cell that lives for your entire life. But it spends most of your life in G0. It divides perhaps once a week, or once every few weeks. You only have about 100,000 of these in your body. These cells accumulate about 14 mutations per year. So as you age, a number of random, innocuous mutations accumulate in these cells until one fateful day when a true initiating mutation gives a cell an advantage, probably a very small one. That cell then begins to experiment with other mutations, progression mutations. Then once in a while, unfortunately, a progression mutation cooperates with this initiating mutation and causes the disease. This actually explains most of our data.
It explains why all the mutations have the same read count frequency and why they're all in the same cells: they're simply captured by the act of cloning. The initiating event clones that critical cell, and then all the mutations come forward. So when you sequence this genome, not only do you sequence the two relevant mutations, four and five, but also the hundreds of mutations that had previously accumulated in that cell prior to its transformation. So there's a central question, of course, in this disease and in most cancers: how many mutations does it take to cause the disease? Our approach to this was to take 24 genomes and sequence them completely. But we selected these cases to be of two different types: the M3 subtype of AML, which is initiated by a very well-known fusion protein created by a translocation, called PML-RAR alpha, versus cases of undifferentiated M1 AML that have normal karyotypes. We know little or nothing about these cases. So the idea is very simple. We have one subtype that is caused by a very well-defined initiating mutation, and we know it's initiating because you can put it in a mouse and it causes a phenocopy of the disease, and one where you know nothing. Based on the prediction I just made about how the mutations would arise, we predicted that the total mutations per genome would be the same, because most of them arose before the initiating event. They are simply background, benign mutations that are present in the stem cell. We predicted that most of the mutations would be random and irrelevant, but that the M1 cases would have novel mutations that would never be seen in M3, and that the cases would have common mutations, which would be relevant for progression. Makes sense? So what do we find? The key: how many recurring mutations per genome? That should give us the answer about how many mutations are needed to initiate M1 and how many are needed to progress.
You can think about your predictions in the next few minutes, and I'll show you the answer at the end. 24 genome pairs completely sequenced; 10,000 mutations found in total, an average of 421 per genome; 308 mutations in 286 unique genes, about 10 per genome, with translational consequences, about 10 in the exome in each case. We looked at these mutations in 66 additional M1 and 43 additional M3 cases. There are 21 recurrently mutated genes. In M3, there was only one, PML-RAR alpha. In M1, we found 10 recurrent mutations, and I'll show you what they are in a minute. And 11 mutations were common to the two subsets. The total numbers of mutations by tier fit the prediction. They're exactly the same for the two subtypes in tier 1, the coding region; exactly the same in tier 2, the conserved region of the genome with potential regulatory function; exactly the same in tier 3; and exactly the same for total numbers of mutations, which fits the prediction that they had to arise prior to initiation. If you plot them by genome space, they fit exactly as random events that occurred in genome space in tier 1, tier 2, and tier 3. The R values for the M1 and M3 cases are both exactly one. These are random mutations that occurred prior to transformation in the stem cells. One thing we learned with deep digital sequencing is that this disease is clonal. Every AML case, and I'll show you more about this at the very end, has a founding clone where all the mutations occur in every cell, and many cases have subclones that are derived from the founding clone. This is very, very important as we begin to think about studying relapsed AML. The number of clones in each of the cases is basically identical. Here are the recurrently mutated genes that you all are accustomed to seeing. This is the bookkeeping. These are the M3 cases; that's PML-RAR alpha on top, and here it is cooperating with FLT3 mutations. These are the M1 cases.
So you can see that there are a large number of mutations. These are the ones I spoke of, the 10 that occur only in the M1 cases, or very, very rarely in M3, and then these are the mutations that occur basically in both subsets, the so-called progression mutations that can cooperate with the founding or initiating mutations. One interesting hit that we got from this analysis was finding that all four members of the cohesin complex are recurrently mutated in AML. This complex is important for holding sister chromatids together as they organize during S phase, and every gene that's a member of this complex is recurrently mutated in AML, only in the M1 variety. So how many mutations? As I told you, there are the same number of tier 1 mutations in these two kinds of genomes. If you look at the recurrent mutations with translational consequences in the 24 fully sequenced cases, the number is 0 to 6 for M1 and 1 or 2 for PML-RAR alpha, and when we extend this to an additional 107 cases, the number stays 0 to 7 and 1 to 3. That's how many mutations it takes to cause these diseases. So in summary, for this part of the talk: PML-RAR alpha is the initiating mutation for all of the M3 cases that were sequenced. There is a cluster of mutations that tend to occur together, NPM1, DNMT3A, this classifying mutation I told you about, and IDH1, and then these are the other mutations that appear to contribute to initiation of M1. There are 10 mutations that are held in common in these two subtypes, and these are clearly mutations that are important, not for defining the subtype, but for progression.
So I just wanted to give a brief scorecard on the AML project that is being done by TCGA. This is just bookkeeping, but there are some very interesting things that are about to come forward, and there's an incredibly rich database that is about to be explored experimentally.
So we have 50 whole-genome-sequenced cases that are complete and completely validated: the ones I just told you about, and another 26 with normal karyotype AML from any FAB subtype of the disease. Another 150 cases had exome pairs sequenced at the Broad, transcriptomes were sequenced from these same cases in British Columbia by Marco Marra's group, and 192 methylation arrays have been done from this set by Peter Laird and Tim Triche at USC. We've also just finished sequencing, from among this set, the 50 cases that have primary refractory or early relapse disease, which will add richly to our understanding of this worst-of-the-worst subset of patients. The cases that we chose represent AML as a disease. As I told you, about half the cases have these intermediate-risk findings. About 20% have translocations associated with good risk, and about 20% have these poor-risk cytogenetic abnormalities. So it's a fair sample of this disease as it occurs in the real world. These are just the data on tier 1, 2, and 3 mutations in the patients with normal karyotype AML versus the APL, or M3, subtype. You can see, as I told you, the numbers of mutations are about the same. The thing that determines the total number of mutations per genome you should be able to predict: age. There are a number of new recurrently mutated genes. This is a partial list of the recurrent mutations that are present in up to 3% of cases. Many of the names of these genes are new. Not all of them are completely validated. This work will be done within about a month. And then you can see that there are patterns that begin to develop in terms of mutual exclusivity, which are being explored by Ben Raphael. And now I'll put in a plug for his talk this afternoon; he will be showing you data about patterns of exclusivity. There are beautiful data that have come from Vancouver, where they've used RNA-seq to find a number of well-known translocations for this disease.
And remarkably, a number of private translocations that create novel fusion proteins in many genes that are well known to us, but have not been previously identified as translocation partners in AML. Many of these are in-frame fusions that create novel proteins with novel functions, and many of them are in genes that have never before been seen mutated in AML, like, for example, DNMT3B. Finally, in work that Tim Triche and Peter Laird have done, they've done a beautiful job of assembling the methylome data for these cases on the Illumina 450K array. With 192 cases, you can see these gorgeous patterns that seem to privatize individual groups of AML cases. I think all of us predicted that there would be very, very significant mutations that would predict these individual patterns of methylation, and, in fact, that DNMT3A would be the primary predictor, along with the mutations Peter spoke about in IDH1 and 2 and TET2, which occur in about a quarter of cases of AML. As we went and looked at the classifying mutations, basically none of them classify these methylation phenotypes. There's one cluster of mutations that occurs together with DNMT3A, FLT3, IDH1 and 2, and TET2. These are the common mutations here, with a hypomethylation phenotype, but if you look up this line, all of these mutations occur in combinations, and none of them predicts this phenotype. So there's much to learn. The last thing I want to say is that there's much more to the digital bookkeeping when you look at this kind of data. Deep digital sequencing is a clinical tool. It tells us a great amount about the biology of this disease. By looking at deep digital data, we've been able to deduce the clonal evolution of AML at relapse. As I told you, many of these mutations occur in the cell that is transformed. Mutations in many genes contribute to initiation and progression. And then subclones arise from these founding clones in most cases of AML.
These subclones have different mutations and different behaviors after therapy. Some of them completely disappear with therapy. Others come forward, acquiring quantal bursts of mutations that clearly contribute to relapse. Understanding this clonal behavior at relapse will be extremely important in terms of predicting responses to drugs in patients and, of course, defining new therapeutic approaches, because what we have to do in this disease is remove the founding clone to cure the patients; every time we look, the founding clone reemerges. I just want to thank our patients. Without them, there is no study. Our funders, including Al Siteman, who funded the sequencing of the first cancer genome when no one else was very interested. And finally, Rick, Elaine, and Li Ding, who lead the work in the Genome Institute at Wash U, and my colleagues on the Genomics of AML program project grant, most particularly John DiPersio, who leads our oncology group. Thank you.
Thank you, Tim. We have time for a few questions. Eric.
Great, great talk. Just fantastic. I had two specific technical questions. When you say you know how many mutations there are, you mean you have a lower bound. Yes, exactly. There could be more. There could be more. The number that have been looked at so far. But this sets the floor. This sets the floor. Good. I wanted to check. And the other was just a tiny thing. I couldn't help noticing that on the list, the CUB and Sushi domain protein and the mucins come up. That's right. And both of those are in this class of late replicating probably fish and cheese. Absolutely. Yep. Absolutely correct. But there are many more that aren't. Right. Yeah, absolutely. Yeah. They're all over the place. And so is titin, right? What is odd, though, is that if you simply apply a significantly mutated gene test for these genes, which is very rigorous in terms of taking size into account, they don't go away.
So I think the reason I leave them on these slides, even though my informatics colleagues tell me to take them off because they're not significant, is that I have an open mind. They don't occur in every case. They should, based on their size. They sure as hell don't. A lot of small genes are recurrently mutated and they show up again and again, and a lot of big genes, like olfactory receptors, never show up in these cases. So it's hard to know what it all means. A lot of big families, that's what I meant to say, big, big gene families that you might expect to be recurrently mutated don't show up at all.
Matthew?
Great talk. I was struck by the PML-RAR alpha co-mutation with FLT3, and I'm wondering about the FLT3 wild-type set: whether, from gene expression or other evidence, you can get any clue as to what might be a parallel driver mutation to FLT3?
Critical question. We've looked very carefully, thinking that there must be other tyrosine kinases that substitute for FLT3 in these cases, since the combination is so common. That does not exist. There are clearly other cooperating mutations in these cases that aren't members of this class, so there's distinct heterogeneity even among PML-RAR alpha cases, but it does not explain outcomes. We don't have the answer, but it's an important question.
Okay. Thank you, Tim. We have 15 short minutes for coffee break. Thank you. Thank you.