Thank you, Linda, and thank you also, Elaine, for inviting me to chair this session and to share some of our research as well. It's really terrific to see such a great turnout. Eric gave a tremendously insightful and exciting introduction to TCGA, and one of the messages from his talk is that the torrent of data coming out of TCGA greatly exceeds our analytical capacity within the narrow TCGA community. So we really need help from the entire scientific community. I'd like to start by saying that although I'm the one up here giving the talk today, I really represent a team of people at the USC Epigenome Center and also at Johns Hopkins University, where we have a very close collaboration, as one of the GCCs within TCGA, with Steve Baylin, Leslie Cope, and Jim Herman. So I'm really speaking on behalf of them as well as the broader TCGA community. During my talk, I'm going to give you a narrower and deeper look into the epigenome that Eric mentioned several times during his presentation. I'm going to be focusing mostly on DNA methylation, keeping in mind that, of course, the epigenome is more than just DNA methylation. There are histone modifications, the positioning of nucleosomes, and various other proteins associated with the DNA, and collectively this structure has marks that influence the accessibility of that DNA to the transcriptional machinery. DNA methylation is the mark that we focus on primarily within TCGA because it's the one mark that survives processing to naked DNA, since it's a covalent modification of the DNA itself, a 5-methylcytosine modification. In many talks in the epigenetics field, we use this lollipop depiction of the potential targets of DNA methylation, CpG dinucleotides, indicated here by these little lollipops scattered throughout a schematic of the genome.
Now if you look at which of these lollipops are actually methylated and which are unmethylated in a normal cell, you'll see some striking features. First, the lollipops themselves tend to be clustered in islands called CpG islands, and there are also vast parts of the genome that have a depletion of CpG dinucleotides, these low-CpG areas. If you look at which are methylated, the CpG islands, which tend to be close to promoter regions, are unmethylated. That DNA is generally accessible to the transcriptional machinery. The low-density regions tend to be methylated, indicated by red in this figure. And if you then look at what happens in cancer, you'll notice two things. First, CpG islands can acquire DNA methylation throughout the entire island that's normally not present in that island in the normal cell, the so-called focal hypermethylation. And when that happens at a promoter region of a gene that's supposed to be expressed, you get silencing of the gene. The gene's transcription gets turned off. It's no longer activatable, at least not very easily. Now contrast that with a more global loss of methylation in the CpG-depleted, less dense regions, where you often see this erosion of DNA methylation, so global loss. So there are two important features of this slide that I'd like you to keep in mind as we go through my presentation: focal hypermethylation in the context of global hypomethylation. This has really been known for decades, and today I'd like to shed some light on the relationship between these two events. So Eric mentioned BRCA1 and CDKN2A (p16), the former in ovarian cancer and the latter in lung cancer. This is a different way of showing the data for BRCA1. This is a plot that Hui Shen put together. It's showing the expression level of BRCA1 in each tumor versus the DNA methylation at the promoter. And you'll see an interesting relationship.
On the left-hand side, you'll see that most of the tumors have a moderate level of expression and very low levels of DNA methylation. On the bottom right, you see a group of tumors that have relatively high DNA methylation and much lower expression levels. These are the so-called epigenetically silenced cases. And on the left, in red, you'll also see the fallopian tubes, which we use as one of our controls. You'll see eight fallopian tubes there, and they have decent expression and low DNA methylation. Now one of the things that was easier to see in Eric's presentation was the fact that the germline mutations in green and the somatic mutations in purple are actually mutually exclusive with the DNA methylation events, the epigenetic silencing events. So really, we have here a beautiful example of this CpG island hypermethylation, this focal methylation that I was talking about, that is clearly functionally significant, given this statistically significant mutual exclusivity with mutation of a gene that is known to be mutated in hereditary predisposition to ovarian and breast cancer. Interestingly, though, you may think that the two are equivalent, but if you actually look at the survival curves, and we really need even bigger numbers to confirm this statistically, there's certainly a trend in which the mutated cases shown in red appear to have better survival than the epigenetically silenced cases in blue, and this is even after adjustment for differences in age of onset between the two. So we'll have to see how they respond to PARP inhibitors and whether they are functionally indistinguishable. I'm now going to briefly go over what Eric has already presented. That's the glioma CpG island methylator phenotype. It's been published. We discovered this when we looked at the first 91 glioblastomas in TCGA and compared the samples against each other, shown here in a heat map.
And on the left-hand side is a group of glioblastomas that clearly has a very distinct profile of DNA methylation changes. It's very easy to recognize, so we called that G-CIMP, for glioma CpG island methylator phenotype, following the example that Jean-Pierre Issa and Minoru Toyota created for colorectal cancer. Now in the meantime, our colleagues within TCGA, Roel Verhaak and others, had been working on profiling, characterizing, and classifying glioblastomas using expression data, and they came up with four subtypes shown here, proneural, mesenchymal, classical, and neural, shown in these four different colors. And you can see that the G-CIMPs all the way on the left are almost exclusively contained within the proneural subset, but there are some proneural tumors all the way on the right as well. So it seems as if G-CIMP is a further substratification of the proneural expression subtype. When we looked at survival, first we looked at the four different expression subtypes, and you don't see very striking differences in survival. The only hint is that the proneural subtype seems to have a slightly better survival, a slightly better clinical outcome, than the other three subtypes. But remember, the proneural subtype was composed of both G-CIMP tumors and non-G-CIMP tumors. So the question becomes, well, what happens when we split out the G-CIMP tumors from the other, non-G-CIMP proneural tumors? And lo and behold, what you find is that all of the survival advantage of the proneural subtype is really represented by these G-CIMP tumors, shown in red on the right-hand panel. And actually the rest of the proneural tumors, the non-G-CIMP ones, have a worse outcome, a worse survival, than the G-CIMPs. So this really indicates that there's some biology here to this profile and that it's not just a collection of passenger events. There might be something here that is indicative of a different clinical behavior.
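As an editorial illustration of the survival comparison just described, here is a minimal sketch of a Kaplan-Meier estimator in plain Python. The function name `kaplan_meier` and the follow-up times are hypothetical toy values, not TCGA data, and the talk does not say which software produced the curves on the slide.

```python
from collections import Counter

def kaplan_meier(times, events):
    """Return [(time, survival_probability)] at each time where a death occurred.

    times  -- follow-up time for each patient
    events -- 1 if the patient died at that time, 0 if censored
    """
    deaths = Counter(t for t, e in zip(times, events) if e)
    exits = Counter(times)      # everyone who leaves the risk set at time t
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(exits):
        d = deaths.get(t, 0)
        if d:
            # Multiply by the fraction of at-risk patients surviving this time.
            survival *= 1 - d / at_risk
            curve.append((t, survival))
        at_risk -= exits[t]
    return curve

# Toy follow-up data in months (hypothetical, for illustration only).
gcimp_curve = kaplan_meier([30, 42, 55, 60, 71], [1, 0, 1, 0, 1])
non_gcimp_curve = kaplan_meier([6, 9, 12, 14, 20], [1, 1, 1, 0, 1])
for label, curve in (("G-CIMP", gcimp_curve), ("non-G-CIMP", non_gcimp_curve)):
    print(label, [(t, round(s, 2)) for t, s in curve])
```

Running the estimator once per group gives one curve per group, which is the form of the comparison on the slide; a formal comparison would also need a test such as the log-rank test, which is not shown here.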
One of the other really fascinating findings came when we started to take advantage of the comprehensive nature of TCGA data and compared the G-CIMPs to the non-G-CIMPs in terms of the distribution of point mutations and other genetic alterations in tumor suppressor genes. And you'll see here that all of the mutations in IDH1, shown near the bottom of the bar on the top there, fell within the G-CIMP tumors. So there was a perfect correlation between mutation in IDH1 and being a member of the G-CIMP category. You see the 2x2 table there on the lower right. So this is really a fascinating finding. We don't know for sure what the reason is for this amazingly significant association. But the model that has emerged is that the IDH1 mutation may actually be responsible for the aberrant CpG island methylation, by the following hypothetical scheme. We know that isocitrate dehydrogenase normally converts isocitrate to alpha-ketoglutarate, shown here. And we also know that alpha-ketoglutarate is used as a substrate by an enzyme called TET, there are actually several TETs, that converts methylcytosine to hydroxymethylcytosine. It's thought that this conversion is one of the first steps on the pathway to active demethylation. Well, it turns out that the mutant IDH1 actually takes the product of the wild-type IDH1, alpha-ketoglutarate, and instead of simply reversing the reaction creates a novel oncometabolite called 2-hydroxyglutarate. And 2-hydroxyglutarate actually inhibits TET. So now you have this wonderful hypothesis that it's the accumulation of 2-hydroxyglutarate that inhibits the TETs, blocks this vacuum cleaner that normally keeps CpG islands clean, and you get accumulation of methylation at CpG islands. Now it doesn't explain some of the wild-type IDH1 G-CIMP cases, and it also doesn't really explain the very characteristic gene specificity.
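As an editorial aside, the kind of 2x2 association between IDH1 mutation and G-CIMP membership mentioned above is commonly assessed with Fisher's exact test. The sketch below implements the two-sided test from the hypergeometric distribution in plain Python. The function name and the counts are hypothetical; the actual counts from the slide are not given in this transcript, and the talk does not state which test was used.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(k):
        # Hypergeometric probability of k in the top-left cell, margins fixed.
        return comb(row1, k) * comb(row2, col1 - k) / denom

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_observed = prob(a)
    # Sum the probabilities of all tables at least as unlikely as the observed one
    # (the small tolerance guards against floating-point ties).
    return sum(
        p for p in (prob(k) for k in range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-9)
    )

# Hypothetical counts, NOT the values from the slide:
# rows = IDH1 mutant / IDH1 wild type, columns = G-CIMP / non-G-CIMP.
p_value = fisher_exact_two_sided(8, 0, 2, 40)
print(f"p = {p_value:.3g}")
```

A table in which every mutant case falls in one column, as described in the talk, yields a very small p-value even with modest sample sizes.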
Only a subset of CpG islands in the genome is targeted in this G-CIMP phenomenon, but you could imagine that there are accessory proteins that help recruit TET enzymes to certain CpG islands, and that when this regulatory process that is normally there to keep CpG islands clean becomes defective because of this oncometabolite, you then get accumulation of methylation specifically in those CpG islands. I'm now going to briefly show you a few slides from ongoing work by Hui Shen on cross-tumor comparisons. One of the great things about being a member of TCGA is the huge amount of data and the large numbers of tumors of different types that we can start to look at. So here's a bird's-eye view of 2,275 different TCGA cancer specimens with 409 normal tissues, and this is just a simple hierarchical clustering across all tumors. On the left you see the matching normals, which have a fair overrepresentation of kidney samples, but all tissue types are in there at different levels. And you'll see a few interesting things. First, these probes were selected to represent cancer-specific methylation, and you'll see that in the normal tissues there's relatively little difference among the different tissues for this subset of probes. So the cancer-specific methylation actually does not seem to target the same sites that are differentially methylated in a tissue-specific way. And on the right-hand side you can see that these cancer-specific methylation profiles mostly break down by cancer type, by tissue type, by cell of origin of the cancer. Now you may think that that's obvious, that they're different organs with different expression profiles and that that is why we see differences in DNA methylation patterns among the tumors. But remember, this is looking only at the cancer-specific accumulation of methylation.
Again, on the left there's not much difference among the normals. So even though all of this methylation, or most of it, happens in the process of a normal cell turning into a cancer cell, you get these characteristic differences in DNA methylation profiles. Now one of the things you can do with this data is to look at correlations among different tumor types. And Hui did this for most of the probes, about 18,000, on the 27k Illumina Infinium platform. This shows a broad overview of all the pairwise correlations, scatter plots of every combination of tumor types that we had enough samples for. Hui also split some of the cancer types into, say, CIMP versus non-CIMP, G-CIMP, etc., and in breast cancer she split out basal-like carcinoma. So what you see is several interesting things. First, the gastrointestinal malignancies, and I apologize for the jargon of the TCGA tumor types, STAD is stomach adenocarcinoma, then colorectal adenocarcinoma and rectal adenocarcinoma, those all cluster sort of in the middle together, but the gastrointestinal CIMPs are even more similar to each other than they are to their respective tissues. So there's probably a common molecular defect that arises in the lower GI tract that can cause this very characteristic CIMP phenomenon. I didn't have time to show it, but we found that this is highly significantly associated with BRAF mutation. Secondly, you see at the bottom several female hormone-driven malignancies. You see breast cancer, uterine corpus endometrial cancer, and ovarian serous cancer. And there you see some interesting things. First of all, they cluster together, which is of interest. And secondly, basal-like carcinoma of the breast is more similar to endometrial cancer and ovarian serous cancer. There are other things in here; it's fun to explore this, looking at germ-layer origins and things like that. It's really a fascinating set of data to work with.
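As an editorial illustration of the pairwise comparison of tumor types described above, here is a minimal sketch that computes a Pearson correlation for every pair of per-tumor-type methylation profiles. It assumes each tumor type has already been summarized as one mean beta value per probe. The probe values are hypothetical toy numbers, and the helper names are illustrative rather than part of any TCGA pipeline; the talk does not specify which correlation measure was used.

```python
from itertools import combinations
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)

def pairwise_correlations(profiles):
    """Map each unordered pair of tumor types to the correlation of their profiles."""
    return {
        (a, b): pearson(profiles[a], profiles[b])
        for a, b in combinations(sorted(profiles), 2)
    }

# Hypothetical mean beta values per probe (same probe order in every list).
profiles = {
    "STAD": [0.10, 0.80, 0.30, 0.60, 0.20],
    "COAD": [0.20, 0.90, 0.30, 0.70, 0.10],
    "OV":   [0.70, 0.10, 0.60, 0.20, 0.50],
}
for pair, r in pairwise_correlations(profiles).items():
    print(pair, round(r, 2))
```

Clustering the resulting correlation matrix is one way to arrive at a grouping like the one on the slide; with about 18,000 probes a vectorized library would be the practical choice.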
But I need to move on to discuss going whole genome, and I'm going to show you some examples of whole-genome bisulfite sequencing and comment on what we think that might tell us about some epigenetic origins of cancer. So today I'm going to share with you four TCGA tumors that have been fully sequenced by whole-genome shotgun bisulfite sequencing. We aim for about 20-fold coverage; that's enough, given the high local correlation of CpG methylation, to get a very good picture of the DNA methylation profiles genome-wide. So we tend to cover about 90% of CpGs at least five or ten times, which is really quite significant. We're talking about 25 million CpG dinucleotides in the haploid genome. And we've also sequenced the normal adjacent counterparts for each of these. So we have a grade one endometrioid uterine cancer, a squamous cell lung cancer, a basal-like subtype of breast cancer, which is one where we knew from our Infinium data that there was relatively little CpG island hypermethylation, I'll come back to that, and a colorectal adenocarcinoma. So this is a genome-wide view of all windows of five adjacent CpGs, showing on the horizontal axis the methylation level of that window in the normal, from low to high, left to right, and on the vertical axis the level in the corresponding tumor. This is really a map of the read density. So you can see this as what fraction of the genome is methylated in both normal and tumor, which would be at the top right, or unmethylated in normal and unmethylated in tumor, which would be the bottom left. Anything off the diagonal is either hypo- or hypermethylated in cancer. Well, let me start with the hypomethylation. Anything below the diagonal has less methylation in cancer than in normal. And you can see this sag under the diagonal on the right-hand side. So the vast majority of the genome is actually fairly heavily methylated.
These are a lot of intergenic regions and intragenic regions, but excluding the CpG islands. And you see that that sort of loses methylation in the tumor. But there's also this focal hypermethylation that I told you about in the beginning, shown on the top left. You see that little yellow blip there. Those are the CpG islands that are acquiring abnormal methylation. Let's look at some other cancers. It's not advancing. There we go. So here's a breast cancer. And you see something similar with respect to the top right, although we clearly don't see as much hypomethylation. And we also are really lacking most of the CpG island hypermethylation. On the bottom left here, we see the endometrioid cancer. You see some CpG island hypermethylation, but not as much as in the colon. But you see quite severe hypomethylation. And finally, lung cancer; we actually didn't have the normal sequenced yet, so I'm showing it here compared to the normal endometrium. But again, severe hypomethylation, and on the top left, some hypermethylation. Now, I want to focus in on this CpG island hypermethylation on the top left. These are what Ben Berman describes as methylation-prone regions in the genome. What Ben did was look at various chromatin and genomic features of these methylation-prone elements, and he found that they were heavily enriched for what are called polycomb repressive marks and for poised promoters, bivalent promoters, in embryonic stem cells. This is something that we had found with simpler technologies a few years ago. So genome-wide, if you look at the very specific CpG island hypermethylation, there's a huge enrichment, more than 100-fold, for these poised promoters. So let me say just one or two things about this. The trithorax and the polycomb systems were first discovered in Drosophila, and they really counterbalance each other in early stem cells in development. The trithorax marks are activating marks.
And these are modifications of the histone H3 tails. The polycomb mark is a repressive mark, histone H3 lysine 27 trimethylation (H3K27me3). And there's a subset of about 1,000 genes present in most multicellular organisms, including Drosophila and humans, that are targeted by both of these marks in stem cells, such that they're not actually transcriptionally active, but they're kept in a poised state, ready to be turned on when the cell needs to start differentiation. These genes, targeted by these two marks, are mostly transcription factors and differentiation factors. So they're very important for lineage differentiation and development. Now this is an overview of about 1,000 of these targets compared to 16,000 non-polycomb targets. What I'm showing you here is the methylation level in embryonic stem cells on the horizontal axis, and in the normal colon on the vertical axis. In yellow are the polycomb targets. And you can see that the polycomb targets are mostly to the left, which indicates that they don't have DNA methylation in embryonic stem cells. But there's already some acquisition of methylation in normal cells. So these poised promoters that are needed for differentiation and development are unmethylated in stem cells. They don't have DNA methylation. They just have this poised state. And some of these targets have acquired methylation in normal colon. Now what happens when we look at colorectal cancer? I'm going to transition to the cancer, and watch what happens to the yellow dots. What you see is that the yellow dots move up. So a process that has started in normal cells becomes exacerbated in cancer cells. What should normally be a reversible repression in stem cells acquires some irreversible silencing by methylation in normal cells, and that becomes exacerbated in cancer. So what does that mean?
Well, the model that we put forward a few years ago is that stem cells have these very important genes kept in a poised state. The stem cell renews itself, of course, but also spins off differentiated cells. And if you look at where the polycomb group proteins are, in yellow, these are actually removed in the differentiated cell. The genes that are required in that cell are turned on and the repressive marks are gone. What we believe happens over time, over our lifetime, as we get older, is that slowly there's crosstalk between the polycomb, the yellow polycomb, and the DNA methylation machinery, which turns a transient repressive state into a permanently silent state. And once that happens at enough different genes that are required for differentiation, the cell loses its capability to differentiate. It becomes stuck as a self-renewing cell unable to properly differentiate. And that is then a sitting duck, really, to turn into a cancer cell with additional mutations in oncogenes and tumor suppressor genes. So this model explains the methylation behavior of about half of cancer-specifically methylated genes. It is consistent with the observation of epigenetic field effects adjacent to tumors. It's consistent with the stem cell-like behavior of cancer cells and with the evidence for tumor-initiating cells. It suggests that therapeutic cloning strategies using human ES cells or iPS cells should incorporate screening for PRC2 or these polycomb DNA methylation abnormalities. And it suggests that in some cases, cancer may start out as a differentiation defect due to epigenetic defects, as opposed to necessarily a gatekeeping change in a tumor suppressor gene. This is somewhat provocative, and I'm obviously not arguing this for hereditary cases, which are quite different, but I think that in many sporadic cases this may well be what happens.
I'm going to briefly, in the last few minutes, go over some really exciting data, some of which is in press. This has to do with looking more closely at where the CpG island hypermethylation and the hypomethylation happen across the genome. This just shows you a blow-up of one CpG island that is unmethylated. Each one of these little rectangles is a sequencing read; red is methylated and green is unmethylated. Here you see the normal colonic mucosa, and down here you see that this segregation between CpG islands and non-CpG islands has fallen apart, and you get encroachment of methylation into the CpG island. At the bottom, you see the differential between tumor and normal. So one thing that Ben found, which was very exciting, was that these regions of hypomethylation appeared in contiguous stretches of megabase size, indicated at the bottom here by these stretches that undergo hypomethylation, interspersed with regions of relative epigenetic stability. And if you now plot 20 kb windows in the same kind of plot that I showed previously, so instead of these short focal regions we're looking at longer-range changes, you'll see that the genome is really segregated into two compartments: one part that is called partially methylated domains, a nomenclature started by Joe Ecker, and another part that is relatively stable. And if you look at where the CpG island hypermethylation happens, we find that this focal hypermethylation happens primarily within these regions of long-range hypomethylation. So what we really have is areas of epigenetic instability that encompass both the hypomethylation and the hypermethylation. And the most interesting finding, we think, shown here for the TCGA tumors with the partially methylated domains all the way at the bottom, is that these regions coincide with regions that are late replicating and in the nuclear periphery.
So Eric talked about how regions of the genome that are near the periphery of the nucleus and are late replicating have higher mutation rates. Well, we now find that they're epigenetically unstable as well. So we really have this spatial view now of the genome. It's not a long linear molecule with all the chromosomes lined up either front to back or in a circle. It's actually a spatial structure, and the regions that are near the periphery of the nucleus are lamin-attached. They have late replication and they have epigenetic instability, and the more actively transcribed regions in the interior of the nucleus are much more stable. So that's it in summary. I'm running late, so I'm going to just keep this up for a minute and then jump on to the acknowledgments. I really want to point out Dan Weisenberger and Ben Berman, who are two faculty members at the USC Epigenome Center who really helped drive this project, so I'm particularly grateful for their contributions. But obviously this is a team effort with many people participating in the data production and analysis. And on the top right, again, my collaborators at Hopkins, Steve, Jim, and Leslie. Thank you.