 Great. Thank you very much. First of all, thank you for the invitation to speak. Before starting, I did want to preface the talk with two things. One, clearly, listening to the talks over the past two days, you guys really are data users of TCGA. I think you have to look at me more as a data abuser. So that's one way to take this into account. The other thing I'd like to say is that we're trying to use the TCGA data in a very different way from most of the presentations we've heard. The problem that we've focused on is ovarian and breast cancer. I'm showing here what all of you guys know: the number of cases, the number of deaths, the prevalence in the population. At Mount Sinai, focusing primarily on gynecologic cancers, we've set up a whole bioinformatic platform and biorepository to take samples in as they come from the OR, bring them directly into the lab, start blood samples, start cell lines, start animal models, collectives, with the idea that as we talk about patients, and clearly so many of the talks are geared this way, we really do think about the time of patients' diagnosis, their original surgery, chemotherapy, their progression, their overall survival. But what I wanted to talk about today was not this paradigm, but really more this paradigm, thinking about risk genes. Being a pediatrician and a geneticist, my clinic thinks about risk much more frequently. So what we wanted to do was take a different approach to finding cancer susceptibility genes, using the families that we'd collected. I'll show you a couple of pedigrees. The rationale for the approach is family-based studies to identify ovarian and breast cancer susceptibility genes.
As most people will know, family history is the strongest single predictor of a woman's chance of developing breast and ovarian cancers. While BRCA1/2 mutations still represent the strongest known genetic predictors, they are responsible for less than 50% of all families. And I can tell you from the clinical side, clinicians are faced with this every day: a patient comes in knowing that they have a strong familial risk of developing cancer, and Myriad testing, or whatever testing is used now, comes back negative. These women are really faced with loneliness and confusion about what to do next, as is the physician. So any kind of study that can identify other susceptibility genes would be a real help in the clinic. So again, what I'm talking about is more of a patient-centric or family-centric bias. But from a gene discovery standpoint, some things that are not evident when viewed in isolation become evident at the population level. I say this because families are always a great way to start doing genetic studies, but the problem is that with a single family, it becomes a little tricky to determine whether you have a rare private allele or something that's more prevalent in the population.
So, materials and methods. To start these studies, we worked in collaboration with two groups. Dr. Lilian Jara, from the University of Chile, gave us 70 families with three or more affected. We also included a number of families with affected males. And Kunle Odunsi from Roswell Park gave us 72 families with more than 72 affected. As a preliminary test, we started by sequencing 21 exomes. We did this level of coverage, and then we went through all the data, again, looking for potential susceptibility genes. And here's a representative ovarian cancer family. This is family 311.
You see that there are three affected women, with ages of onset of 48, 43, and 25. They all have parents or grandparents. They're affecting their clue. They're all related. Here's a breast cancer family, again, with a number of individuals. These were actually screened by Myriad as being BRCA negative. These are other individuals in the family. So here's just the average coverage per base for BRCA1/2, because, again, the idea that we started with is that we should be looking at families that are BRCA1/2 wild type to find other susceptibility genes. So we tested the BRCA coverage. Indeed, one of our families, which had been screened as negative, actually turned out to be positive. These are families that had males in them.
Then, to start the analysis, this is where our collaboration with GenePool came in, and there's a poster outside, poster 55, that describes some of this work. So you can see we took these three individuals, and as we did the sequencing and the analysis, we had a couple of filters that we wanted to apply that rationally made sense for gene discovery in families. So we looked at the number of variants in genes, and then at the filters that we started applying to the data. And this is actually kind of nice. GenePool allowed us to do this kind of alluvian filtering. So if we ever wanted to go backwards, say because the filter was too tight or there was something that we wanted to change, we could go back to that position. So we removed all the variants with an allele frequency greater than 1%. We kept variants that had either a high or moderate impact. We kept variants that were present in all samples. In other words, all three individuals in this particular family had to have the same variants, if we believe it to be a Mendelian trait. And then we kept genes containing only one variant. In other words, if there was a particular gene that had seven variants in it, we excluded that from the analysis.
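The four filters just described can be sketched as a small staged pipeline. This is only an illustration under stated assumptions, not the GenePool implementation: the `Variant` record, the `filter_family_variants` function, the gene names, and the impact labels `HIGH` and `MODERATE` are all hypothetical, and the sketch assumes the one-variant-per-gene rule is applied to whatever survives the earlier filters. Each stage is kept so that an earlier step can be revisited if a filter turns out to be too tight.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    gene: str
    variant_id: str
    allele_freq: float    # population allele frequency, between 0 and 1
    impact: str           # hypothetical labels: "HIGH", "MODERATE", "LOW"
    carriers: frozenset   # IDs of the sequenced samples carrying the variant


def filter_family_variants(variants, family_samples, max_af=0.01):
    """Apply the four filters in order and return every intermediate stage."""
    family = frozenset(family_samples)
    stages = [("all variants", list(variants))]

    # 1. Remove variants with allele frequency greater than 1%.
    rare = [v for v in stages[-1][1] if v.allele_freq <= max_af]
    stages.append(("allele frequency <= 1%", rare))

    # 2. Keep variants with high or moderate predicted impact.
    impactful = [v for v in rare if v.impact in {"HIGH", "MODERATE"}]
    stages.append(("high or moderate impact", impactful))

    # 3. Keep variants present in every sequenced family member.
    shared = [v for v in impactful if family <= v.carriers]
    stages.append(("present in all family samples", shared))

    # 4. Keep genes that contain exactly one remaining variant.
    per_gene = Counter(v.gene for v in shared)
    single = [v for v in shared if per_gene[v.gene] == 1]
    stages.append(("one variant per gene", single))
    return stages


# Made-up data for a three-person family.
family = {"s1", "s2", "s3"}
variants = [
    Variant("GENE_A", "a1", 0.002, "HIGH", frozenset(family)),
    Variant("GENE_B", "b1", 0.150, "HIGH", frozenset(family)),      # too common
    Variant("GENE_C", "c1", 0.001, "LOW", frozenset(family)),       # low impact
    Variant("GENE_D", "d1", 0.001, "MODERATE", frozenset({"s1"})),  # not shared
    Variant("GENE_E", "e1", 0.001, "HIGH", frozenset(family)),      # two variants in gene
    Variant("GENE_E", "e2", 0.003, "MODERATE", frozenset(family)),
]
stages = filter_family_variants(variants, family)
for name, kept in stages:
    print(f"{name}: {len(kept)} variants")
```

Returning the whole list of stages, rather than only the final candidates, mirrors the point about being able to go backwards when a filter needs to change.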
So, starting from a large number, we came down to a more reasonable number, but still too many genes for us to do any kind of functional validation. And again, we're coming from a different perspective in the laboratory than having, you know, hundreds of samples to look at. So what we did was manually curate for the variants that were likely to validate, setting aside the ones that were unlikely to validate, either because of low coverage or because of where they were in the gene. Doing that, we ended up with 24 candidate genes. These genes were shared between all three individuals. All of them were validated by Sanger sequencing. So now we had a little bit of a problem, right? Because each of these was equally likely to be the susceptibility gene. And this is where I'm going to make my first apology to you, other than hopefully for going over time, which I won't do. I cannot include the names here because this is being broadcast. I can't show the names, but what I would say to everyone here is, please, if you stop by poster 55, I am more than happy to share the candidate genes, because I think that's important. I just can't show them on the slide here.
What we did, in collaboration with the TCGA and Sandeep Sanga from Station X, was get the TCGA data for ovarian cancer. And what we asked for was not the mutation frequency in the tumors, but the mutation frequency in the germlines of these individuals. We had 240 samples that we could look at, and we split those by BRCA status. Again, using another filter, we essentially asked which of those samples we believed to have a BRCA mutation, BRCA1 or BRCA2, and which were wild type. And then for the last filter we applied, we looked, for each of the individual candidate genes, at the fraction of samples that had mutations in that gene in the germline.
So again, the idea was that if a very high fraction of samples has lots of mutations in a gene, it's probably not a highly conserved gene, whereas if you're on this end of this trimodal distribution, you're more likely to have something functional, something that's being conserved. And so, using the TCGA data, we went from 24 candidates down to eight candidates. Again, using the BRCA wild-type group, these 160 samples, we could look at the number of samples that actually had the same mutations as in our family, family 311. Five of these eight genes actually had the same specific variant in the TCGA germline data set, and these were present in about 1% to 2% of this population. For two of these, when we then looked at the 1000 Genomes database, the frequency was actually increased in TCGA, with a relatively low p-value, 0.02. So again, if we had more samples, we could probably increase the confidence in that.
Another parallel line of support was that we went back to that list of 24 genes that we had here and assessed the functional impact of the mutations. We used MutationAssessor. Boris Reva, who was one of the developers of the program, actually helped us look at all of our 24 genes and, independently of the TCGA data, went through and tried to score the functional impact. Seven of the variants were assessed as functional. One of them had a switch of function, which is this one here, so this is actually the DNA-binding domain. It changed a proline to a serine. Three of the genes actually had an involvement in cancer. When we looked at the overlap of our eight genes from TCGA and the seven genes here, four of them actually overlapped and stayed together in the final analysis. So we then looked at the pan-cancer data. We took all those 15 genes between MutationAssessor and the TCGA, and we looked at the pan-cancer data to see the distribution of mutations there.
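The talk does not say which test produced the p-value for the comparison against 1000 Genomes. As one possible sketch, a one-sided exact binomial test asks how likely it would be to see at least the observed number of carriers if the true frequency matched the reference. The function name `binomial_upper_tail` and the example counts are hypothetical, and treating the reference frequency as a fixed number (rather than an estimate with its own sample size) is a simplifying assumption.

```python
from math import comb


def binomial_upper_tail(k, n, p):
    """Return P(X >= k) for X ~ Binomial(n, p), summed exactly."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Hypothetical illustration: 3 carriers among 160 BRCA wild-type germline
# samples, compared against an assumed reference carrier frequency of 0.5%.
p_value = binomial_upper_tail(k=3, n=160, p=0.005)
print(f"one-sided p-value: {p_value:.3f}")
```

With only a handful of carriers the test has little power, which is consistent with the remark that more samples would increase confidence.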
We then broke it down into the ovarian cancer data, so we could see that one of these genes actually had about 6%, which we think is too high, looking at the function. Again, I'm more than happy to share the gene name and its function later. And a couple of these actually had some interesting distributions; it shows 0% here just because it's below the threshold of 1%. That one gene that we showed here that has the three mutations, this is actually the gene here. These are the mutations that are found. It's also mutated in breast. Again, interesting that BRCA has mutations in both breast and ovarian cancer. And this is another paper that had come out earlier in the year, in February, doing the same kind of analysis: taking the germline TCGA data from the ovarian cancer group, essentially going through functional filters, and coming up with a couple of new candidates that kind of bundled into these functional pathways of significance. What's interesting, I think, is that our final list of candidates doesn't overlap at all with these. I think the value of having some informative families is that they can still lead to new gene identifications.
Here, just quickly running through this New York breast cancer family: we selected three of the women for sequencing. We did the same kind of flow-through using the GenePool software. We ended up with 22 genes. We then sequenced those 22 genes in six of the individuals here. Six of six women shared one gene. Five of six shared the same mutation in five of the genes. Then there was a distribution: four, three, two, one. Again, given that this is breast cancer and the population prevalence is that one in eight women, one in seven women, will develop breast cancer in their lifetimes, we assume that these could all be familial, or there could even be a mix of familial and sporadic forms.
What was interesting was when we looked at one of the top candidates and its potential clinical significance for overall survival in breast cancer. If you take BRCA1 and you plot the overall survival, this is what the distribution looks like. If you take one of those candidate genes, and again, I'm more than happy to share that, we get a very similar survival curve. So again, it's not functional proof, but it is interesting that that happens.
So in conclusion, we think we've identified a number of high-interest candidate ovarian and breast cancer susceptibility mutations in genes. This came through a personalized approach, both within families and between families. The use of the germline TCGA data allowed refinement of the candidate list, and really now the next step is validation. And if you hear a little tremor in my voice, it really is there because the functional studies are now going to take up a lot of time. So, you know, one of the thoughts we had is that if there's anyone who's interested in looking through more of the breast or the ovarian families, or if there's additional data to share, we're more than willing to share, because if we can trim down that list of candidates, it'll allow us to do more of the functional studies: generating mice, generating animal models, generating these mutations in cell lines to try to understand the actual impact on the biology.
So with that being said, the acknowledgments. I just wanted again to acknowledge all the families that participated in these studies, and a number of individuals: Peter Dottino, who is my collaborator at Mount Sinai; Slav Kendall, who's the postdoc in the lab, and I guess in the TCGA, as you say, the manuscript committee, so I guess Slav is the manuscript committee by himself; Boris Reva, who looked at these with MutationAssessor; individuals here at Roswell Park and in Chile; and also Sandeep Sanga from Station X, who helped design and modify the software so we could do these studies. With that being said, thank you very much for your attention.
Have you looked at the mutation? Is there any SNP associated with a specific mutation? Can you use GWAS to validate your findings, which would be much faster?
Yeah, so there are. A couple of the genes actually are rare. They do not have SNPs. Of, let's say, the eight that we pulled out, I believe three have no associated SNPs; the others, again, are at less than 1 percent in a European population. Again, that was part of our filter, but...
Okay. Do we have the next speaker? Chai Bandlamoudi. Okay, here we go. Okay, the next talk is discovery and functional characterization of recurrent gene fusions from primary tumor transcriptomes across 19 human cancers.