Hi, thanks Dr. Sivit for the introduction, and I would like to thank the TCGA committee for inviting me here to give this talk about our work on profiling long intergenic non-coding RNA interactions in cancer. The primary focus of this work is to catalog lincRNA expression among TCGA cancer types. Basically we are extending the initial profiling efforts made by Dr. Sanders' lab, as well as Eric Larson there, in ovarian cancer lincRNA profiling, as well as the MD Anderson group, Dr. Han Liang there, who did the pseudogene expression profiling on the non-coding RNA part. And more importantly, using this kind of data we would like to leverage the TCGA data types, from mRNA to mutation and copy number, to facilitate an integrative analysis that can help us understand the emerging gene-regulatory role of these lincRNAs in cancer, and to ascertain whether these transcribed non-coding RNAs have any regulatory role or are merely transcriptional noise.
LincRNAs in general are non-coding RNAs more than 200 base pairs in length, without any coding potential. Most of the annotated lincRNAs do have a poly-A tail, and they do show epigenetic marks consistent with those of transcribed genes. Depending on the annotation used, more than 10,000 computationally predicted transcripts have been identified, and 3,000 of them show conserved patches. In terms of function in cancer, I think it is very unclear as of now how a mutant protein, like mutant TP53 or BRAF, can interact at its target regulatory site with its co-regulated proteins. LincRNAs have been known to facilitate such oncogene-driven downstream gene regulation: basically they can act as a molecular scaffold and allow such proteins to interact together at the gene-regulatory sites.
And this is evident from recent publications on the role of HOTAIR and other lincRNAs in EZH2 chromatin remodeling, as well as the role of ANRIL and p21 in tumor suppressor signaling. However, what is not known is how exactly these interactions happen. One emerging theory is that there could be sequence-specific interactions of lincRNA with DNA and adjacent RNAs. This could involve transposable elements, microRNA sequences, or G-quadruplexes within the RNA sequence that help them form secondary structures, stem-loops, that can help them form a scaffold. And if there are such sequence-specific structures, do we find an enrichment of lincRNAs harboring such motifs to drive downstream gene regulation in cancer?
So with those questions in mind we are proposing this analysis outline. First, we will quantify lincRNA expression in TCGA tumor sets; as of now we are profiling melanoma and prostate cancer. Second is to correlate this expression with existing expression, mutation and methylation phenotypes. And third is to identify enrichment of sequence-specific lincRNA-DNA interactions at regulatory domains of cancer genes.
Moving to the first part, the quantification: there are two primary sources for lincRNA annotations. One is from Broad and ENCODE, which is what we use, and we have also included a few lincRNAs that were not in these databases but were part of the earlier analysis by the Michigan group on prostate cancer. To avoid quantification bias in non-strand-specific RNA-seq experiments, we are excluding intragenic as well as overlapping annotated transcripts, which leaves us with about 75% of the transcripts in these annotations, the intergenic ones. We have used Cufflinks as well as HTSeq, following Eric Larson and group, for prostate cancer and ovarian cancer lincRNA profiling.
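The overlap filter described here can be sketched as follows. This is a minimal illustration, not the pipeline used in the talk: the `Interval` type, the function names and the toy coordinates are all assumptions, and strand is ignored on purpose because the RNA-seq data are described as non-strand-specific.

```python
from typing import NamedTuple


class Interval(NamedTuple):
    chrom: str
    start: int  # 0-based, inclusive
    end: int    # exclusive
    name: str


def overlaps(a: Interval, b: Interval) -> bool:
    """True if two half-open intervals on the same chromosome share a base."""
    return a.chrom == b.chrom and a.start < b.end and b.start < a.end


def keep_intergenic(lincrnas, genes):
    """Drop every lincRNA that overlaps any annotated gene (strand ignored)."""
    return [t for t in lincrnas if not any(overlaps(t, g) for g in genes)]


# Hypothetical toy annotation, for illustration only.
genes = [Interval("chr1", 1000, 5000, "GENE_A")]
lincrnas = [
    Interval("chr1", 200, 800, "linc_1"),    # upstream of GENE_A, kept
    Interval("chr1", 4500, 6000, "linc_2"),  # overlaps GENE_A, dropped
    Interval("chr1", 2000, 3000, "linc_3"),  # inside GENE_A, dropped
    Interval("chr2", 2000, 3000, "linc_4"),  # other chromosome, kept
]

kept = keep_intergenic(lincrnas, genes)
print([t.name for t in kept])
```

The pairwise scan is quadratic, which is fine for a toy example; on a full annotation an interval tree or a sorted sweep would be the usual choice.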
So coming to the first-pass analysis, what we observe, as seen in this figure from Eric Larson's earlier paper, is that the majority of the lincRNAs that are expressed are poly-A enriched compared to the coding RNAs. And, comparable to other studies, lincRNAs also show very low expression compared to the coding genes. This is in ovarian, and this is in melanoma, where the lincRNAs are expressed substantially lower than the messenger RNAs.
Next I took the 189 lincRNAs that were most variably expressed across 327 melanoma samples. In this unsupervised clustering, these are the known mutation subtypes, methylation subtypes and other characteristics from the TCGA melanoma AWG analysis, and these are the lincRNA expression profiles, and what we see is three distinct clusters. In terms of the subtypes based on methylation and mutation signature, the samples in the first cluster are predominantly methylation normal-like, and within it there is this small cluster with high expression of these lincRNAs where most of the samples are triple wild-type, that is BRAF, RAS and NF1 wild-type, as well as TERT wild-type. In contrast, the second cluster here has predominantly TERT-mutant samples; 30 of the identified TERT-mutant samples are here. And the last cluster is predominantly CpG island hypermethylated samples as well as beta wild type. So we looked further into these 189 lincRNAs to see if we could identify any that were previously experimentally validated.
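Picking the most variably expressed lincRNAs before clustering can be sketched like this. It is only an illustration under stated assumptions: the talk does not say which variability measure was used, so plain sample variance is assumed here, and the function name and expression values are made up. The clustering step itself is not shown.

```python
from statistics import variance


def most_variable(expr, k):
    """Return the k feature names with the largest sample variance.

    `expr` maps a feature name to its expression values across samples.
    """
    ranked = sorted(expr, key=lambda name: variance(expr[name]), reverse=True)
    return ranked[:k]


# Hypothetical expression values for four lincRNAs across four samples.
expr = {
    "linc_a": [0.1, 0.1, 0.2, 0.1],
    "linc_b": [0.0, 3.0, 0.1, 2.9],
    "linc_c": [1.0, 1.2, 0.9, 1.1],
    "linc_d": [0.0, 0.0, 5.0, 0.0],
}

top = most_variable(expr, 2)
print(top)
```

In the setting described in the talk, `k` would be 189 and each list would hold 327 values, one per melanoma sample.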
In fact we find a couple of lincRNAs that tend to have a role in cancer-specific regulation, either modulating dog signaling or, CD49 is interesting: it is in the 5p15 region, and although it is very far from the TERT promoter region it is at the same locus, and it is upregulated in the same cluster where the TERT-mutant samples are predominant, and it has a role in a signaling inactivation pathway.
So next we took these lincRNA regions and asked whether they are enriched in the regulatory regions of cancer target genes, like oncogene-driven genes. We did the GREAT, or genomic regions enrichment, analysis, and what we find is that the lincRNAs in fact show significant enrichment for proximity to gene annotations regulated by HOXA9, ERCC, FOXO and other cancer-specific genes. In particular these lincRNAs are more enriched in distal regulatory regions, either 5 kb upstream or, most of the time, between 5 and 20 kb upstream, possibly indicating enhancer activity. However, we are still working on this to see whether it is real or not.
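The upstream distance bins mentioned here (within 5 kb, and 5 to 20 kb) can be illustrated with a small helper. This is a simplified sketch and not the association rule that GREAT itself applies; the function names, the bin labels and the coordinates are assumptions chosen for illustration.

```python
def upstream_distance(tss, strand, pos):
    """Distance from a gene's TSS to `pos`, positive when `pos` is upstream
    of the TSS relative to the gene's strand."""
    return tss - pos if strand == "+" else pos - tss


def distance_bin(d):
    """Assign an upstream distance (in bp) to one of the bins from the talk."""
    if d < 0:
        return "downstream"
    if d <= 5_000:
        return "0-5 kb upstream"
    if d <= 20_000:
        return "5-20 kb upstream"
    return ">20 kb upstream"


# Hypothetical positions: a lincRNA 8 kb upstream of a plus-strand gene,
# and one 3 kb upstream of a minus-strand gene.
print(distance_bin(upstream_distance(100_000, "+", 92_000)))
print(distance_bin(upstream_distance(100_000, "-", 103_000)))
```

Counting lincRNAs per bin and comparing against a background set would be the next step; that comparison is left out here.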
Next we did the identification of sequence-specific interactions, if any, within the regulatory domains of these cancer genes. The first part, de novo motif discovery, reveals that these lincRNAs may have a potential role in transcriptional regulation and cancer growth signaling pathways, as evident from these top predicted motifs and their binding, mediating growth signaling pathways; among them there are RNA-complementary motifs binding to EGR, ZEB1 and others.
Besides these motifs, we were also interested in whether transposable elements might play some role, because they are abundant throughout the genome, and we were curious whether or not they are enriched in lincRNAs compared to the rest of the genome. What we find is that in fact 23% of the lincRNA transcripts have at least one Alu sequence in the exonic region of the lincRNA. This was further corroborated by a recent publication showing that lincRNA exonic regions are in fact significantly enriched for transposable elements, including Alu elements, compared to their intronic counterparts as well as other non-coding RNAs or protein-coding genes. This possibly hints that these sequences might play some role in RNA interaction, which has only been shown in one or two classical papers on Alu-mediated RNA interactions.
We also looked at whether particular Alu elements are enriched. Alu has several subfamilies depending on primate evolution, and what we find is preferential hits within only specific subfamilies, like Jb, Sx and Yc, specifically Sx, which is the most recent expansion of Alu sequences along with Y and is specific to humans, and that might hint again towards some regulatory role.
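A figure like "23% of transcripts carry at least one Alu" comes down to an interval-overlap count, which can be sketched as below. This is an assumed, simplified version: the data layout, the function names and the coordinates are hypothetical, and a real analysis would read exons and repeat annotations from files rather than from literals.

```python
def intervals_overlap(a_start, a_end, b_start, b_end):
    """True if two half-open intervals [start, end) share at least one base."""
    return a_start < b_end and b_start < a_end


def fraction_with_repeat(transcripts, repeats_by_chrom):
    """Fraction of transcripts with at least one exon overlapping a repeat.

    `transcripts` maps name -> (chrom, [(exon_start, exon_end), ...]);
    `repeats_by_chrom` maps chrom -> [(repeat_start, repeat_end), ...].
    """
    if not transcripts:
        return 0.0
    hits = 0
    for chrom, exons in transcripts.values():
        repeats = repeats_by_chrom.get(chrom, [])
        if any(
            intervals_overlap(es, ee, rs, re_)
            for es, ee in exons
            for rs, re_ in repeats
        ):
            hits += 1
    return hits / len(transcripts)


# Hypothetical lincRNA exons and Alu coordinates, for illustration only.
transcripts = {
    "linc_1": ("chr1", [(100, 400), (900, 1200)]),  # second exon hits an Alu
    "linc_2": ("chr1", [(5000, 5600)]),             # no Alu nearby
    "linc_3": ("chr2", [(100, 400)]),               # Alu is adjacent, not overlapping
    "linc_4": ("chr3", [(700, 900)]),               # no Alus on this chromosome
}
alus = {"chr1": [(950, 1250), (3000, 3300)], "chr2": [(400, 700)]}

frac = fraction_with_repeat(transcripts, alus)
print(frac)
```

Restricting `alus` to one subfamily at a time would give the per-subfamily breakdown mentioned in the talk, again as an assumption about how such a count could be set up.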
Again, we are working on this kind of interaction to see whether it makes sense in terms of expression correlation. So in summary, we did the expression profiling for lincRNAs in melanoma and prostate. I have shown only melanoma here, but we plan to put all of the analysis in the Synapse portal. We have also shown that differential lincRNA expression is based on methylation phenotype as well as mutation signature, and that lincRNA exonic regions harbor Alu elements in abundance, which might hint at a possible role in sequence-specific interactions.
The ongoing task right now is to outline the functional relevance of these differentially expressed lincRNAs by creating a co-expression network using the predicted lincRNA-messenger RNA and lincRNA-microRNA partners, and to overlay the copy number alteration data to see if anything significant comes out of the copy-number amplified or deleted regions. Simultaneously we are making this data available in Synapse, and hopefully once the pipeline is ready we will be happy to work in Firehose and see if we can have a catalog of lincRNA expression across TCGA tumors.
So with that I would like to conclude, and I would like to thank my mentor Dr. Linda Chen as well as my committee for their valuable insight, the Firehose team, and the TCGA working groups for all the analysis support, and Eric Larson in particular for working with me on the prostate cancer data. Thank you so much.
We have time for some questions.
Alus are all over our genome. Do we know these Alus were hot? Like, if they were LINE-1 or something, we know that they're active, but if they're Alu, how do we know they are hot and have significance in your data set?
So far I have only looked at the Alu regions in exons, but the previous paper, I think, showed that the LINE elements are substantially enriched, and then after them the Alu elements. That could possibly indicate that they are active there, because LINEs are retroelements and can jump in and jump out, but for Alu we haven't checked that part yet.
Yep, partly because Alus are not active on their own, right? They need to retrotranspose; they don't have the endonuclease activity or the reverse transcriptase, right? They are L1-dependent.
Right, that's true. Okay, but no, we haven't checked that.
I had a question, Amir. Did you guys have a way to start looking at mutations and their correlation with expression? We ran into this MALAT1 gene, which is a lincRNA, in squamous cell cancers.
So for melanoma there are 39 whole-genome samples. We don't have MAF files yet, but I think there are too many false positives, so what I did is take only the lincRNAs that are annotated and run the variant calling within, let's say, 20 kb upstream and downstream of the lincRNA regions. There are variant hits across those that are expressed, but again, I am not sure about the quality of the variants. It is ongoing work, yeah.
Is there one more question? Okay, well, let's thank Amir one more time. Okay, so the last speaker in our session is Andrew Gross from UC San Diego. He'll be talking about "Multi-omics classification of head and neck cancer ties p53 mutation to 3p loss".