I have a series of three project updates, the first of which is from Brad Ozenberger, Program Director at NHGRI, and it's on The Cancer Genome Atlas project.
Yeah, hi everybody. In preparation for this, I went back and looked to see when we last gave you an update on TCGA, and it's been almost three years. I went back and pulled the first slide from last time, which was this: President Obama visiting NIH in September 2009, just three and a half years ago, to announce biomedical research investment through the American Recovery and Reinvestment Act. That included naming TCGA an NIH signature project, and a big bolus of investment, $175 million. I guess I'm here now, three years later, to say I think that was a pretty good investment. I think we've used it well, and I'll try to describe that. So, just a quick reminder of what TCGA does. The Cancer Genome Atlas: our goal is to do comprehensive genomic analysis on all the major tumor types in the U.S., and a real key to this is to take each tumor type, each specimen from each research participant, and do all the analyses we can on that tumor specimen. So we're doing exome sequencing, RNA-seq, microRNA, methylation, and even more recently we've added some protein analyses, all on the same tissue samples from these participants. To do this requires quite a pipeline that's been built up over the years: a biospecimen core resource that processes all the samples and collects the clinical data, six genome characterization centers, the three genome sequencing centers supported by NHGRI, the genome data analysis centers we've added, and then a large effort to coordinate these data. This is what it looks like on a map. Just to remind you, NHGRI's investment in TCGA in terms of funds is through the large-scale sequencing program, the three big centers: the Broad, Baylor, and Rick's center at WashU. Our financial commitment to TCGA is through their genomic sequencing and analysis.
Let me just go through a bit of history and where we're going. The genesis of TCGA was a report from the NCI advisory board back in 2005 that predicted a number of technological advances that were in the works. Part of this report was a proposal to design and develop a large cancer genome effort, it was suggested, with NHGRI. We set off on this program really in 2007 to pilot this, starting with glioblastoma and ovarian serous carcinoma. We knew we had to establish the infrastructure, a pipeline, and feasibility. We started this, Rick's here, with capillary electrophoresis sequencing, but we all knew that we were going to hit this point: next-generation sequencing was going to come on and really make this project feasible. Of course, with the reduction in costs, and you've probably seen this graph before, it's on our website, there's also a great increase in capabilities. So although this was coming, we started before next-gen kicked in, and actually this first GBM report was with just a small gene list, with capillary electrophoresis sequencing and analysis, and still using gene arrays. But shortly after this, the project expanded. We're now just past the midpoint of the main TCGA program. We're now up to 25 tumor types in TCGA. Sample acquisition was really beefed up at the NCI at the beginning of the expansion. We added these genome data analysis centers, which weren't part of the pilot phase. It was recognized we needed both a lot more horsepower in terms of analysis, to do the integrative analyses of all the data, and a lead on a lot of innovation in genomic analysis methods. And a major product of TCGA that really only started recently is the large benchmark papers, which I'll go into in a little more detail.
But the goal here for TCGA has been to achieve greater than 10,000 cases examined, and we fully expect to meet this goal by the end of 2014, in more than 20 different tumor types. In fact, we are now beginning to think about what happens after this major phase of TCGA, beginning to look at what happens after 2014, and I'll touch on that a little bit at the end. So let me just point out the TCGA network papers. I really think of each of these as historical data sets. They go deep into each one of these tumor types. Our goal is to describe the mutations found in each tumor type down to a frequency of 2 to 3%, which requires approximately 500 tumors of each type. Again, we started early on with glioblastoma. That was the pilot phase, which did not involve next-gen sequencing, but it was followed, actually just in the summer of 2011, by the integrated analysis of ovarian carcinoma. And this was the big shift. Instead of a small gene list, this involved full exome sequencing and RNA-seq of hundreds of samples of ovarian cancer, and it kind of set the standard for where TCGA has gone since. Last summer we published the colorectal cancer work, the colorectal analysis. In September, the genomic characterization of squamous cell lung cancer. And then the next week, actually, the comprehensive molecular portrait of human breast tumors, the one that Eric mentioned in his director's report. Each of these papers really gets a lot of deserved attention. There are, in each case, novel discoveries. And what I particularly want to go into is some of the clinical ramifications of each of these papers. So here's where we are today in terms of a project-by-project view. On the y-axis is the number of qualified tumor cases that have been analyzed or are in the pipelines. Again, our goal is 500.
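The sample-size reasoning above (mutations at a 2 to 3% frequency, roughly 500 tumors per type) can be illustrated with a simple binomial calculation. This is only a sketch under stated assumptions: the function name `prob_at_least` and the threshold of five mutated tumors are illustrative choices, and the model ignores the background mutation rate and the other factors a real power analysis would include. It is not the method TCGA used to set its targets.

```python
from math import comb


def prob_at_least(k: int, n: int, f: float) -> float:
    """Probability that at least k of n tumors carry a mutation
    whose true frequency in the tumor type is f (simple binomial model)."""
    # P(X >= k) = 1 - P(X <= k - 1) for X ~ Binomial(n, f)
    below = sum(comb(n, i) * f**i * (1 - f) ** (n - i) for i in range(k))
    return 1.0 - below


if __name__ == "__main__":
    # Hypothetical threshold: a gene must be mutated in at least
    # 5 tumors before it is considered worth a closer look.
    threshold = 5
    for n in (100, 250, 500):
        for f in (0.02, 0.03):
            expected = n * f  # expected number of mutated tumors
            p = prob_at_least(threshold, n, f)
            print(f"n={n:4d} f={f:.2f} expected={expected:5.1f} "
                  f"P(at least {threshold})={p:.3f}")
```

Under this simplified model, the chance of seeing a rare mutation in several tumors rises with cohort size, which is the intuition behind collecting hundreds of tumors of each type.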
For breast, we put all the subtypes into a single project, and our goal there is 1,000 for the breast project. The ones in red have gone through a full data freeze. Either they've been published or, for a number of them, the papers have been written and are currently under review. Then there are a number of other projects now in the pipelines where we have full-fledged analysis groups working, and papers should be coming out later this year. And then there's this tail where accrual is still going on, and these will come, probably some of them, in 2014. A number of projects, the ones starting here, have closed. The accrual has closed because we've exceeded the 500 goal, although we are still accepting African-American specimens to fill out some of the diversity. All of this can be found on a project dashboard, and I would point you to it on the TCGA website. We call it the Project Case Overview Dashboard, and it gives a snapshot of all the data available for each project. Each project is listed. You can see here the number of samples that have been accrued and the number that have qualified and entered analysis. Then for every single data type, and this is just a small corner of the dashboard, every row represents a different data type, and you can see how much data of each type is available for that tumor project. It's quite a handy overview of TCGA. Just the top-line numbers: we now have about 7,500 cases, on our way to that greater-than-10,000 goal, in the bank and qualified, and most of these are at the centers at this point. There are greater than 6,000 cases with a full genomic data set, so that is 6,000 cases with full exome sequencing, RNA-seq, microRNA-seq, methylation, and clinical data, as much clinical data as we can get.
I'd also point out the hundreds of whole-genome data files. This number continues to grow, and of course these numbers are always in cases: for every case, the genomic sequencing covers both the tumor genome and the normal genome. And right now on this dashboard, you'll find there's data available in our database on 25 different tumor types. So TCGA was set up with the goal of creating a community resource data set. Data would be released very quickly and then used by the community, as it is, of course, but we also have a large TCGA network that has worked very hard to integrate the data and provide a first look. So there are things like cancer stratification by gene expression or methylation patterns. For every tumor type there's a list of significantly mutated genes and how those mutations are distributed across the cohort. There are whole-genome looks at individual genomes; this is a particularly scrambled lung squamous cell carcinoma genome. And then all this is integrated into a look at the pathways involved in each of these tumor types, and across them together. So that's the goal, and certainly thousands of people each day are digging into the TCGA data sets, and our own network is doing a lot of work as well. But I think we didn't fully anticipate, when we started the program, how quickly data would translate to potential clinical utility, and I just want to briefly go through a few examples. There's just such rich data, and as the group learned to integrate across all these data types and really build a picture of the foundation, the genesis, of these cancers, it really revealed things that can translate right to the clinic in many cases.
So just to go through a few of these quickly. In GBM, even in that very first paper early on, there was an interesting example: many GBMs show hypermethylation of the MGMT locus, and these tumors acquire resistance to standard-of-care therapeutics. The TCGA data explained how this occurred, through a shutdown of mismatch repair pathways, and immediately suggested changes to the treatment regimen for patients with recurrent GBM tumors. In the ovarian work, it was known that in ovarian cancer the FOXM1 transcription factor network was frequently altered, but now, with the full TCGA and hundreds of cases, this turned out to be a very high percentage. 87% of tumors showed some alteration in this pathway, not always in the FOXM1 gene itself, but in all these additional nodes that feed into it, suggesting perhaps a common target for ovarian cancer. On the flip side, the TCGA group also delineated the full spectrum of frequently amplified genes. These number in the dozens, but of course each individual tumor has a different gene, or two genes, or three genes that are amplified and would be predicted to help drive the disease, and that really points to the fact that we need to customize treatment for each individual tumor. Colorectal started as two projects, colon carcinoma and rectal carcinoma, but it was quickly confirmed that the molecular genomic underpinnings of these diseases show that it's a single disease, so we immediately merged these into the colorectal project. It's just one disease. And the integrative analyses showed, again similar to FOXM1 in ovarian, the prominence of Wnt signaling alterations and the promise of inhibitors of this pathway. Breast tumors of the basal subtype were found to have the same genomic signatures, in a large sense, as the ovarian serous tumors.
These are poor-prognosis, aggressive tumors. This shows copy number data, ovarian versus the basal over here, and you can see the similarities in copy number, but not just in copy number. You could see this similarity in other genomic analyses as well. And already, ovarian clinical trials are being adjusted to test these compounds for efficacy in breast basal-type tumors as well. Importantly, also in this paper, there are the clinically defined HER2-positive tumors. It was known that there's always a substantial proportion of HER2-positive tumors that don't respond to the normal EGFR inhibitors. In closer analysis of the TCGA data, they could easily divide the HER2-positive tumors into two different genomic subtypes, one that is predicted to respond to the EGFR inhibitors and one that wouldn't, and they showed an important marker that would adjust the therapy for patients with that marker. Lung squamous. Lung squamous cell carcinomas are over 25% of lung tumors in the U.S., but in fact they've been rather poorly described genomically. So this was the first real hard look at the genome of lung squamous, and it identified a number of interesting targets. Importantly, it also identified markers that showed similar underpinnings to lung adenocarcinoma. Speaking with a clinical trialist in lung cancer, I heard that they were immediately going to test some of their compounds from a lung adeno trial in lung squamous tumors that have the appropriate mutations, where the data suggest it might work. A couple of papers aren't out yet but will be shortly. Kidney clear cell carcinoma: again, this is one of these cases where it was known that the SWI/SNF chromatin remodeling complex is sometimes mutated in these tumors, but with the TCGA data we can now show that this is in fact the majority of these tumors, and there's a lot of interest in therapeutic compounds that modulate this pathway and potentially modulate this disease.
And then endometrial, a bit of an unexpected finding: 25% of endometrial tumors, again, share these hallmarks, these markers of ovarian serous carcinoma. Here now, just like the previous slide, are the serous ovarian tumors in copy number, the basal breast we mentioned, and also this endometrial subtype, which we now call serous-like. These are associated with an increased risk of recurrence, and now we have a better handle from this work on what the genetic mutations are that drive this. So I just wanted to give those few examples. Again, it wasn't our first goal to get these data right to patients where they might make a difference, but certainly it's happening on a more rapid timeframe than I might have expected. So clearly, what is TCGA driving at? Cancer diagnostics is now mostly a path from pathologists reviewing slides. Of course this will still be important, but certainly we're starting to see an increased emphasis on genomic analysis in oncology, with companies like Foundation and New York Genome Center and others, and TCGA is really providing a lot of the foundation to drive this personalized therapy. I don't like that word, but individualized therapy in cancer. So looking forward, again, we've had a number of papers that came out last fall: colorectal, breast, lung squamous. Coming soon, and these are under review, are kidney clear cell, endometrial, and AML, and a number of other projects that I would hope would be out before the end of the year, followed by some big ones such as prostate and melanoma. TCGA has created this atlas of mutations. I think it's really been successful in beginning to understand the biology of cancer through this project, this compendium, this atlas of the mutations that drive these cancers. New drivers have been identified, and like I said, it's already changing clinical practice in some of these diseases.
Also, I don't think anybody would argue that it's not now firmly established that we need to think about each patient's tumor as a unique disease, and I'm happy to say all the major pharmaceutical companies have pipelines into the TCGA data now and are using these data on a continual basis to drive therapeutic advances as well. I want to point out that it hasn't just been the biology of cancer that I think is part of TCGA's success, but also the driving of technology. The pull of the TCGA program has driven the development of cancer genome analysis methods. This is a real flagship project, but many new analysis and informatics tools are being adopted across all fields of genomic research, of course; they're not applicable only to cancer. So for the next phase, we are just a couple of years out and on a good trajectory, but we do think TCGA will wind down. There'll be some final analysis, certainly, for a year or two afterwards. Eric mentioned a workshop that we had at the end of November, I think, and NCI and NHGRI are working closely together, and separately, to develop some new initiatives. Certainly we want to continue this approach. Even with TCGA, there are still many mutations out there, as we go deeper into these tumors, that aren't fully explained, and certainly there still needs to be some atlas development in cancer. And then, more importantly, I think we're looking hard at moving more towards the clinical trial area, to begin to investigate the genetic underpinnings of, for example, metastasis and response to therapy. That's going to require us to really get a little closer to the clinical trial areas to get these specimens and get these data. All right, so with that I'm going to close. I'll just acknowledge Heidi Sofia and Lindsey Lund, who work with me every day on TCGA. Mark Guyer is still a real key part of the team, and I want to acknowledge Jane Peterson and Peter Good, who were involved for many years in the early stage. And this is the NCI team.
They have a full office for TCGA, led by the dynamo Kenna Shaw, if you've encountered her, and then they have the new Center for Cancer Genomics at NCI that we're working with, co-directed by Stephen Chanock and Lou Staudt. With that I'll stop. Any questions? Yeah, Jill.
Brad, do you want to say anything about how the ICGC project complements TCGA and what they've done so far?
Yeah, I neglected to mention that. TCGA is a major player and a major part of ICGC. It's the bulk of the data in ICGC. We were kind of in front of them, of course, but we've been very pleased to see that a lot of large projects in Spain and Italy, and of course in the UK, have been catching up and contributing greatly. We meet at least once a year, and there have been a number of coordinated efforts in certain tumors. Prostate is an example, where one group looked at tumors that only occur in young men and somebody else is looking at tumors that are refractory to therapy. So we've done a good job of synergizing across that consortium, and I think it continues to be very important. They have their own database, run out of the Ontario Institute for Cancer Research in Toronto with Tom Hudson, and we work very closely with them on getting TCGA data into there. Yeah, Mark.
Yeah, I just wanted to add, on your point about community involvement, that the analysis groups have become much bigger than the TCGA-funded groups. The project has been really good about bringing in wider participation by the community in the analyses, so I don't know if you want to amplify on that.
Yeah, so around each of these tumors, of course, a big analysis group forms. We designate a PI within TCGA to kind of be a leader, and then usually there's a specific disease expert too, and they kind of co-chair the analysis. But then we invite experts in each disease to come in and contribute. So yeah, if you know of people who are interested in a particular tumor on the list, please have them contact us and we can get them involved. Yeah, Pilar.
This is just an informational question from somebody who hasn't been keeping up with TCGA. What's the difference in the work between what the analysis centers do and what the genome characterization centers do? Are the genome characterization centers mostly about structure?
No, they're data generators. The genome characterization centers are data generators, so they're doing the RNA analyses, SNP chip arrays, the things that aren't done by bulk genomic sequencing. And the genome data analysis centers are strictly computational, yeah.
I know that TCGA has had methylation, for example, an epigenetic mark, as part of the program, but I wonder whether there were plans or discussion about including histone modifications. I mean, a number of the genes that were identified have turned out to be epigenetic modifiers in some way, and I'm wondering if there's any plan. Certainly AACR has been talking about this. There was a workshop report on trying to gather groups together with an interest in supporting epigenetic analysis of those same tumors, which I think would add another dimension to the data.
There are residual tissues that remain in the bank, and we actually want to try to make those available, although there isn't a spec sheet on the website on how to do that yet. We don't really know yet. But we have begun some protein analyses, mostly in, you know, phosphoprotein chips and that sort of thing. But yeah, right now the histone modifications are really not part of the project.
I'm intrigued by these immediate clinical translation findings, and I'm wondering, as those are discovered and as they're going presumably to clinical trials, with maybe existing therapies but new indications, are you using the infrastructure that you've got for TCGA to analyze pre- and post-treatment tumors today, or is that an opportunity that one could grasp?
Yeah, I think it's an opportunity. Certainly NCI is making a lot of movement towards making all their clinical trials genomically enabled. But really that's more in NCI's court, and they certainly see the value of that. It's kind of making incremental steps towards that. But in TCGA itself, those samples are all de-identified.
I understand. I meant more the infrastructure. You've got the teams for analysis, for data collection, for standards, I presume.
Yeah, there's a lot of that. That's actually a point for looking at the future. We realize we've got this big infrastructure built, and so those are some of the sorts of things we're looking at now, to see if we can build on that and take advantage of it. Yeah, David.
Another forward-looking question. You mentioned metastasis and treatment resistance as possible themes for future phases. Could you just clarify for me what the degree of stratification is of the tumors that have already been analyzed? If you say 500 for a particular cancer type, is that all primary, or does that already include a mixture of primary, metastatic, and failed to respond to treatment?
TCGA's are all primary tumors. There are a few cases where we have additional samples from the same patient, but these are all primary, so we did not design it in a way to really go after those questions.
And how about the issue of tumor heterogeneity within a primary? What's the level of multiple analysis of what might be a tumor looking like many tumors?
Again, we have a few things. We're actually talking about doing a pilot in that, because we have tissue cases where we know the samples are at least millimeters apart, maybe a centimeter. But no, we really didn't do the accrual in such a way that we can take samples that are far apart geographically or anything like that. So we actually do the heterogeneity simply through one sample, going deep into the sequence to try to understand it, but that's all.
Yeah, there's more and more of that popping into TCGA as we figure out how to do it. So in the AML data set, for example, there's extensive analysis of heterogeneity in all of those primary tumors. There are also a number of samples, in breast I think, where there are trios: there's a primary tumor, an adjacent non-malignant sample, and a blood normal, to get some idea of what we see in the adjacent tissue in terms of new mutations. The so-called field effects, right? So it's in there, and I think it's maturing along with our ability to really do those kinds of analyses.
But some of this will have to wait till the next phase.
As you're going towards that, I think both of the last two questions are heading towards some of the technical challenges. You mentioned there was some technical development. But, for example, over the last 11 years we've been sampling blood, and when possible tumor, from all of the NCI clinical trial studies for the cooperative group I'm involved in. We have blood on almost 90 percent of the patients, about 40,000 patients' worth. We have fixed tumor on about 30 percent, and we have fresh tumor on almost none. And it's not for lack of trying. It's because of the culture of the way tissue is handled, which is not necessarily a bad thing, but it's just that. So you could either play Don Quixote and try to get people to freeze the tumor, and that'll come eventually, or you can really push on the technology for handling the fixed stuff and all that.
I know every center has their magic way of doing it, but I'm not sure that I believe any of them, including the ones from our own center.
TCGA took an attitude of no platform left behind, so if we don't get good quality RNA from a tissue, that sample doesn't qualify for TCGA. But, for example, we've now done a lot of work with FFPE tissue, and of course the sequencers can do a pretty good job of getting exome from those. Sometimes the RNA is much more difficult. So we're looking now, you know, at the next iteration, where maybe sometimes we don't have the RNA data or it's not as good quality, and we try to do it anyway. But yeah, we really are looking at FFPE tissue as being very important for the future.
I probably need to cut this off. Sorry.
No, it's okay.
It's a very interesting topic, fabulous work. Brad will be around if people want to follow up. So, Simona, could you please come forward?