First of all, thank you for the opportunity to share some of the work that we've done in my lab with you. My lab is interested in the genomic and chromatin consequences of specific genetic mutations in cancer. So because most of you probably don't think about chromatin and epigenetics on a daily basis, let me just introduce that to you. Chromatin essentially is the complex of DNA and its associated proteins as they exist in the nucleus of a cell. And that really involves the modification of DNA, and DNA as it's wrapped around octameric histones and then formed into higher-order structures. And the generation of these higher-order structures essentially leads to specific regulation of the genome. So we know there are a number of ways that chromatin is regulated, by classes of molecules that have been termed writers, editors (or sometimes erasers), readers, and then remodelers. And these exist as enzymes in the nucleus for both DNA and for nucleosomes, or histones. So in the writer component, the canonical example for DNA would be DNA methylation. And then for the nucleosomes, there's a large number of enzymes that are involved in post-translational modification of the histone tails, including things like acetylation and methylation. The editors essentially can change those marks or remove those marks. Readers read those marks, essentially interpreting them for the transcriptional machinery. And then of course the position of the nucleosomes themselves has significance, and the remodelers' role is essentially to move them around the genome. So over the last several years, beginning with cytogenetics and now in the era of high-throughput sequencing, mutations in chromatin regulatory molecules have been identified across a wide range of tumors across the age spectrum.
In clear cell renal cell carcinoma, a number of publications have identified a class of frequently mutated genes, including the remodeler PBRM1, which somebody mentioned a minute ago, as well as histone modifiers and other nuclear enzymes. So interestingly, three of these, PBRM1, SETD2, and BAP1, which are the most commonly mutated in clear cell renal cell carcinoma, essentially coexist on the short arm of chromosome 3, which has VHL at the very distal end of it. So there is often loss of the short arm of chromosome 3. So several years ago we embarked on a project to study the consequences of mutations in these chromatin regulatory proteins across clear cell renal cell carcinoma. We essentially genotyped for about 250 mutations in 36 tumors and eight matched uninvolved tissues, as well as performing FAIRE. FAIRE is a simple biochemical method for identifying nucleosome-depleted regions of chromatin, which essentially are regions like promoters, enhancers, and early replication origins. So it's an unbiased way of sort of filtering the genome. We also performed RNA-seq for all these samples, and then we performed an integrative analysis. This work was primarily performed by Jeremy Simon and Kate Hacker, and actually the paper just came out yesterday in Genome Research. So the first thing we asked is, does applying chromatin accessibility studies with FAIRE do anything? Can we actually make heads or tails of the results? So we essentially binned across the genome in 500-base-pair windows and looked at FAIRE signal. We identified those windows that most discriminated between tumor and non-tumor. And here is a heat map where every row is a window in the genome and every column is a tumor or normal sample.
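As an illustration of the windowing step just described, here is a minimal Python sketch. It is not the analysis code from the study. The 500-base-pair window size comes from the talk; the function names (`bin_signal`, `welch_t`, `rank_windows`) are hypothetical, and the Welch t-statistic is only an assumed stand-in, since the talk does not say which statistic was used to pick the most discriminating windows.

```python
from statistics import mean, stdev

WINDOW = 500  # window size in base pairs, as stated in the talk


def bin_signal(per_base, window=WINDOW):
    """Average a per-base signal track into consecutive fixed-size windows.

    A trailing partial window is dropped.
    """
    return [mean(per_base[i:i + window])
            for i in range(0, len(per_base) - window + 1, window)]


def welch_t(a, b):
    """Welch t-statistic between two groups (assumed choice of statistic)."""
    var_a = stdev(a) ** 2 / len(a)
    var_b = stdev(b) ** 2 / len(b)
    denom = (var_a + var_b) ** 0.5
    return 0.0 if denom == 0 else (mean(a) - mean(b)) / denom


def rank_windows(tumor, normal):
    """Rank windows by how strongly they separate tumor from normal.

    tumor, normal: lists of per-sample window vectors of equal length,
    with at least two samples per group.
    Returns (window indices sorted by |t| descending, per-window t values).
    """
    n_windows = len(tumor[0])
    scores = [welch_t([s[w] for s in tumor], [s[w] for s in normal])
              for w in range(n_windows)]
    order = sorted(range(n_windows), key=lambda w: abs(scores[w]), reverse=True)
    return order, scores
```

The top-ranked windows from a function like `rank_windows` would be the rows of a heat map such as the one shown in the talk, with samples as columns.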
You can see that cluster 3 identified sort of a normal-specific cluster, whereas cluster 1 was sort of the most diversely spread through all the tumors. So we focused our attention initially on cluster 1. We looked at the genes that were nearby and associated them with regulatory ontologies. And what we found, I guess I shouldn't say to our great surprise, is that it identified, with very high statistical significance, essentially HIF regulation across multiple data sets. Cluster 2 did not identify any similar pathway, actually. We then looked at the DNA elements in here, and we found that the HIF regulatory element was highly prevalent. And then using ChIP-seq data from another group that had studied this in breast cancer, we identified a strong association with HIF-1 and HIF-2 binding in cluster 1, but not in cluster 2 or cluster 3. So again, for this group, it's going to seem pretty predictable that we identified HIF. But remember, had we known absolutely nothing about HIF or VHL going into the study and just looked at chromatin accessibility, we would have landed right on HIF. So we thought that was actually pretty neat. So now we turned to the chromatin regulatory molecules that we were really most interested in, and we focused on SETD2. SETD2 is a large protein. It interacts, through its carboxyl-terminal tail, with the CTD of RNA polymerase, and it essentially is dragged along by RNA polymerase. The SET domain is the enzymatic machine, essentially, that places a very specific mark: it marks lysine 36 of histone H3 and puts a trimethyl mark on it, essentially trimethylates it. So we looked at mutations in our data set and we identified a number of mutations of high and moderate severity that spanned the SETD2 molecule. And this really matched what other people had found in other deep sequencing efforts. Then we turned to the tumors themselves, and we made a tissue microarray and studied that specifically for H3K36 status.
And you can see here, compared to matched kidney, where you can see nice, strong nuclear staining for H3K36 as you'd predict, there were some tumors with similar nuclear staining and many with a deficiency of nuclear staining for H3K36, indicating K36 loss. So we quantified that and saw that, again, the normals sort of clustered here, but we saw a fairly wide distribution of tumors. And then if we look at the ones that are below normal, we identified a class of H3K36-deficient tumors. And again, it was reassuring to see that those tumors with a mutation, specifically a high-severity mutation, had the greatest loss in H3K36. But we did identify other ones that did not have a detectable mutation. Some of these, we found, had low protein, and others had no expression of the RNA. So now we went back to the method that I explained to you earlier, and we asked, can we now use H3K36 status, or SETD2 status, to drive the identification of regions of the genome that have a different chromatin status? So again, we made windows across the genome and we allowed them to self-segregate based on their SETD2 status. And we identified those regions that were essentially the most different. As you can see, about 7,000 regions across the genome met our criteria. Interestingly, the largest group here demonstrated an increase in chromatin accessibility that was associated with the loss of SETD2. So essentially, the loss of this mark came along with nucleosomes that were less stable, where the chromatin was essentially more open, if you will. So then we turned our attention to the RNA. We had performed RNA-seq using ribosomal RNA-depleted RNA, and we did that intentionally so we could look across the whole spectrum of RNA and not focus exclusively on mature, processed RNA. So for those of you who aren't familiar with looking at RNA-seq data, essentially, when you line up the tags, what you usually see are these peaks here.
And the peaks correspond to exons, because that's what's preserved in the mature message as the intronic regions are spliced out. In polyadenylated messages, you should see virtually no intronic signal. But here in the ribosomal RNA-depleted RNA, you can see some signal. What caught our attention was that when we looked, really just visually, at the H3K36-normal tumors, we identified clear signal over the exons. But that was gone in many of the genes in the H3K36-deficient tumors. And then we noticed that you'd have these regions, like peaks of signal, outside of the exons, basically in intronic sequence. So we essentially derived a score that we called an intron retention score, which ran from 0 to 1, where a score of 0 meant all the signal came from exons, and a score of 1 meant that introns and exons were essentially represented evenly. And what we found is that for the H3K36-deficient tumors, there was a very strong enrichment in genes for which there was a high intron retention score. And this extended across thousands of genes, of course with various degrees of significance, but it really affects a very significant fraction of the transcriptome of ccRCC in the H3K36-deficient tumors. When we flipped our analysis and looked at it based on PBRM1 status, this association was completely gone. So it was very specific to H3K36 status, or SETD2 status. Now, as I mentioned, this was done in ribosomal RNA-depleted RNA. So it gave us a view of RNA from the time it was synthesized, or in the process of synthesis, all the way through to the end. So for validation, we looked at the TCGA dataset. This is a much larger dataset with, at that point, about 400 tumors in it. Now that's different in that it's poly(A)-selected, so this is looking at mature message. And we asked the question, is there sort of a shadow of this that persists all the way through into mature messages?
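To make the intron retention score described above concrete, here is a minimal Python sketch of one formulation that matches the verbal description: 0 when all signal comes from exons, 1 when introns and exons are covered evenly. This is an assumption for illustration, not the formula from the study. The function name `intron_retention_score`, its parameters, and the choice to clip the intron-to-exon read-density ratio at 1 are all hypothetical.

```python
def intron_retention_score(exon_reads, exon_bp, intron_reads, intron_bp):
    """Ratio of intronic to exonic read density for one gene, clipped to [0, 1].

    Assumed formulation (not taken from the study):
      0.0  -> all signal comes from exons
      1.0  -> introns are covered at least as densely as exons
      None -> the gene has no signal at all, so the score is undefined
    """
    if exon_bp <= 0 or intron_bp <= 0:
        raise ValueError("exon_bp and intron_bp must be positive")
    exon_density = exon_reads / exon_bp        # reads per exonic base
    intron_density = intron_reads / intron_bp  # reads per intronic base
    if exon_density == 0:
        return 1.0 if intron_density > 0 else None
    return min(1.0, intron_density / exon_density)
```

Using densities rather than raw counts keeps a gene with very long introns from scoring high simply because it has more intronic bases to collect reads.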
So we had to use a different tool, because our tool, which really looked at introns, is essentially not useful for polyadenylated RNAs. So we used a tool called FDM, which was developed by the computer science group at the University of North Carolina, that looks at the break points at the end of every exon and the beginning of every intron and looks for statistically significant changes in the utilization of each of those points. And then what we did is we essentially compared the SETD2-mutant tumors versus the non-mutant tumors, randomized a control set, and looked for the manifestation as alternative splicing, intron retention, or alternative TSS and TTS, that is, transcriptional start site or transcriptional termination site utilization. And what we found is that, again, for a large number of transcripts here, probably near 4,000 transcripts, we saw a statistically significant change in the RNA. And it's dominated in this case by alternative splicing. Now that makes sense, because really aberrant messages should be filtered out during RNA processing. So what you'd expect to see when you purify mature message would essentially be alternative splicing that sort of persists through. So we saw that as the major phenotype in this experiment. So then we turned our attention back to the chromatin, to wrap this back around to the beginning. And we asked, what's going on at the level of chromatin, and is there any evidence there of what we're seeing? So here we're looking at essentially raw FAIRE signal. So again, nucleosomal signal. And I've overlaid H3K36 trimethylation ChIP over it. These are just the H3K36-normal tumors. And what I've done is I've aligned every exon that was misspliced together here. So really, mainly focus on this box here. And what you see, as you'd predict, is that H3K36 trimethylation essentially is enriched right over the exon, as it should be.
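The alignment of signal across many exons described above can be sketched in Python as an aggregate profile. This is a simplified illustration under stated assumptions, not the method used in the study: the function name `aggregate_profile` and the `flank` parameter are hypothetical, the signal is a single per-base track on one chromosome, and strand is ignored.

```python
def aggregate_profile(signal, exon_starts, flank=200):
    """Mean signal at each offset from -flank to +flank around exon starts.

    signal: per-base values for one chromosome (for example FAIRE or ChIP signal).
    exon_starts: 0-based positions of exon starts on that chromosome.
    Exon starts whose flanking region runs off the track are skipped.
    Strand is ignored here, which is a simplification.
    """
    width = 2 * flank + 1
    totals = [0.0] * width
    used = 0
    for start in exon_starts:
        lo, hi = start - flank, start + flank + 1
        if lo < 0 or hi > len(signal):
            continue  # flanking region would fall outside the track
        for offset, value in enumerate(signal[lo:hi]):
            totals[offset] += value
        used += 1
    if used == 0:
        raise ValueError("no exon start had a complete flanking region")
    return [total / used for total in totals]
```

Computing one such profile per tumor group and plotting them on the same axes gives the kind of comparison around exon boundaries that the talk goes on to describe.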
And this correlates with a nucleosome that sits right over the exon. And then in FAIRE space, what that looks like is a dip right here. And you could actually even see that at some of these random internal exon start sites. So the phenotype was evident in lots of exons. Okay, so now what we did is we compared just FAIRE signal between the deficient tumors and the normal tumors. The bottom here is essentially the same plot as this one, and you can see this drop. But what we were amazed to see is that in the H3K36-deficient tumors, there was now a strong peak that immediately preceded the exons here. And what this signifies is essentially a weak nucleosome, or a deficient nucleosome, that just precedes the exon position, okay? So we were surprised to see that. And the other thing that surprised us, remember, is that this is looking at primary human tissue. Applying this to primary human tissue and seeing these kinds of differences was really remarkable to us. So in conclusion, what I hope I've been able to show you today is that chromatin accessibility offers an approach to study epigenetic or chromatin changes in cancer, and really in a non-biased way. I've shown you that SETD2 mutation, and more globally H3K36 trimethylation loss, leads to large changes in transcription globally. That SETD2 mutational status actually underestimates H3K36 trimethylation loss. And that's important because the TCGA doesn't have FAIRE data in it, so all we could look at for our TCGA analysis was SETD2 mutational status. And finally, that exonic chromatin accessibility is altered in the absence of H3K36 trimethylation. And what I didn't write down here is that I hope this also tells you that we think chromatin and epigenetics may be a new biological target, essentially, for therapeutic intervention.
And with that, I just want to acknowledge the people who did the work, primarily Jeremy Simon in my lab, as well as Kate Hacker in Kim Rathmell's lab.