First, I'd like to thank the organizers for inviting me to speak here today. I'll talk about the project we've been working on for predicting time to ovarian carcinoma recurrence using protein markers. In this study, we mainly focused on high-grade serous ovarian cancer.

Let me start with a little background. The standard treatment of ovarian cancer is surgery followed by platinum-based chemotherapy. The initial response rate is pretty high, but 25 to 30 percent of the patients relapse within 12 months, and their tumors don't respond to platinum therapy. In the TCGA data set, roughly 30 percent of the patients were classified as platinum-resistant. So there is a need for novel predictors of platinum resistance that can lead to a change in therapeutic approach.

Recently, we developed a patient classification system named Classification of Ovarian Cancer, CLOVAR. Using TCGA data, we identified subtype and survival gene signatures, and then we built a prognostic model. We found that CLOVAR significantly predicted overall survival. These Kaplan-Meier curves were generated after classifying patients into three risk groups using CLOVAR, and they were obtained from the validation set. The log-rank test was performed to compare overall survival across the risk groups, and the p-value is very small for overall survival. But the prediction of progression-free survival is not as accurate as the prediction of overall survival. We found a very similar pattern in several other gene signatures from other studies. Those gene signatures were significantly associated with overall survival, but not with progression-free survival or treatment response. That motivated our work.

You may wonder why the sample size is different for these two plots. Well, not all the samples had progression-free survival information. That's why we have a smaller sample size for this. In this study, we aim to develop a predictor of platinum resistance using protein markers.
For this, we used reverse-phase protein array, RPPA, technology. I think most of you are already familiar with RPPA. It is a high-throughput technology for measuring protein expression levels in a large number of samples. Using RPPA, we measured 172 cancer-related proteins and phosphoproteins in 412 TCGA samples with serous ovarian cancer. Of the 412, 220 samples were included in the model construction. Those samples had non-missing values for progression-free survival, and they all had advanced-stage disease.

In this study, we developed a protein-driven index of ovarian cancer. We named it PROVAR, in line with CLOVAR. Let me show you a brief flow chart of how we constructed PROVAR. First we used 222 TCGA samples as a training set. In the next step, we applied the LASSO, a statistical regularization method, to the TCGA set. I'll talk about this on the next slide, in just a minute. Using the LASSO, we identified the nine protein markers that are most associated with progression-free survival, and we estimated the coefficients from the Cox regression. After that, we defined PROVAR as a linear combination of the nine protein markers weighted by the Cox regression coefficients.

A statistical challenge of high dimensionality arises whenever you predict survival using genomics or proteomics data. For the feature selection and coefficient estimation, we used the LASSO. The LASSO estimator is obtained by maximizing the likelihood function with an L1-norm constraint. The main advantage of the LASSO is that it provides a parsimonious, very simple model by shrinking unnecessary coefficients to exactly zero instead of to small values. By applying the LASSO to the TCGA set, we identified these nine protein markers and estimated the coefficients. You may notice that the coefficients are very small. Those are shrunk by the LASSO, and it is known that shrinkage may help avoid the overfitting problem.
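
For readers following along without the slides, the L1-constrained estimator described above can be written out as a sketch. The notation is an assumption and not taken from the talk: x_i is the vector of protein expression values for patient i, t_i the observed progression-free time, delta_i the event indicator, R(t_i) the set of patients still at risk at t_i, and s (equivalently lambda) the tuning parameter. The talk says "likelihood function"; for Cox regression this is usually taken to be the partial likelihood, written here without tie handling.

```latex
% Constrained form: maximize the Cox partial log-likelihood under an L1 bound
\hat{\beta} \;=\; \arg\max_{\beta}\; \ell(\beta)
\quad \text{subject to} \quad \sum_{j=1}^{p} |\beta_j| \le s,
\qquad
\ell(\beta) \;=\; \sum_{i:\,\delta_i = 1}
\Big[ x_i^{\top}\beta \;-\; \log \sum_{k \in R(t_i)} \exp\!\big(x_k^{\top}\beta\big) \Big]

% Equivalent penalized (Lagrangian) form
\hat{\beta} \;=\; \arg\max_{\beta}\; \Big\{ \ell(\beta) \;-\; \lambda \sum_{j=1}^{p} |\beta_j| \Big\}
```

Because the constraint region has corners on the coordinate axes, the maximizer typically sets some coefficients to exactly zero, which is the feature-selection behavior described in the talk.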
These are the nine protein markers we identified. These five proteins had negative Cox regression coefficients, which means they were associated with better outcome. And indeed, they were overexpressed in the low-risk group. The other way around, these were associated with poor outcome and overexpressed in the high-risk group. These little bars indicate which samples are platinum-resistant, and we see more platinum-resistant patients in the high-risk group.

So we identified the nine protein markers. We computed PROVAR for each patient in this way, as a weighted sum, and then we classified each patient into one of two risk groups according to PROVAR, using the median as the cutoff.

Here are the results with PROVAR. This is based on progression-free survival, and this is overall survival. PROVAR was constructed using progression-free survival, but it is also predictive of overall survival. And again, not all the samples had progression-free survival data available, so that's why we have a smaller sample size for this.

These were obtained from the training set, the TCGA training set, and we need to validate PROVAR in an independent data set. We were able to obtain 229 high-grade serous validation samples from Japan and Philadelphia. They were treated uniformly. Using RPPA, we measured the expression levels of 144 proteins and phosphoproteins for those samples. And as you see, PROVAR is significantly predictive of both time to tumor recurrence and overall survival in the independent data set. We also tried a three-group stratification, and we saw a similar result.

For the purpose of comparison with a gene-driven model, we implemented CLOVAR in our validation samples with gene expression data available. The sample size is small because only a subset of the validation samples had gene expression data available, so we are using just 130.
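
The scoring step described earlier in this part of the talk (a coefficient-weighted sum of marker expression, followed by a median split into two risk groups) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the marker names, coefficient values, and patient data below are made up, the function names are hypothetical, and placing patients exactly at the median in the low-risk group is an arbitrary choice the talk does not specify.

```python
from statistics import median


def risk_score(expression, coefficients):
    """Weighted sum of marker expression, one weight per selected marker."""
    return sum(coefficients[m] * expression[m] for m in coefficients)


def split_by_median(scores):
    """Label each patient 'high' or 'low' risk using the median score as cutoff.

    Assumption: scores exactly equal to the median are labeled 'low'.
    """
    cutoff = median(scores.values())
    return {pid: ("high" if s > cutoff else "low") for pid, s in scores.items()}


# Hypothetical coefficients (negative = associated with better outcome).
coefficients = {"marker_a": -0.10, "marker_b": 0.05, "marker_c": 0.20}

# Hypothetical normalized expression values for four patients.
patients = {
    "p1": {"marker_a": 1.0, "marker_b": 0.0, "marker_c": -1.0},
    "p2": {"marker_a": -1.0, "marker_b": 1.0, "marker_c": 1.0},
    "p3": {"marker_a": 0.5, "marker_b": 0.5, "marker_c": 0.0},
    "p4": {"marker_a": 0.0, "marker_b": 2.0, "marker_c": 0.5},
}

scores = {pid: risk_score(expr, coefficients) for pid, expr in patients.items()}
groups = split_by_median(scores)

for pid in sorted(scores):
    print(pid, round(scores[pid], 3), groups[pid])
```

With a fixed cutoff learned on the training set, the same two functions could be applied unchanged to an independent cohort; whether the validation analysis in the talk reused the training median or recomputed it is not stated.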
Let me first show you the result with CLOVAR. No significant difference in progression-free survival between groups. But PROVAR clearly improved the prediction of progression-free survival.

Finally, we tried to test the robustness of the nine protein markers that we identified using the TCGA samples. To do that, in a sort of reverse way, we used the validation samples to identify protein markers. And then we compared those proteins with the original nine protein markers to see how similar they are. Using the validation samples, we identified these five proteins. And let's compare them with the original ones. Unfortunately, AR is the only overlap. But when we clustered the proteins using hierarchical clustering, we found that some proteins shared very similar expression profiles. Let me explain this. These little bars on the first row indicate the location of the nine protein markers that we identified using the TCGA samples. And this one, on the second row, is the overlap. And these are the protein markers we identified using the validation samples. Overall, our protein markers were spread out over the clusters. And interestingly, the nine- and five-protein signatures included representative proteins from each cluster.

So in conclusion, we developed a protein-driven index of ovarian cancer, PROVAR, using progression-free survival. And we showed that it is predictive of both progression-free survival and overall survival in high-grade serous ovarian cancers. Many gene signatures in other studies contain a large number of genes. Unlike those, PROVAR is simple, but it's still predictive, making it useful in clinical practice.

I'd like to thank my collaborators, especially Kosuke Yoshihara and Roel Verhaak, who is my mentor. I'd also like to thank Gordon Mills, Yiling Lu, and all the other collaborators. Thank you.

Sure.

I'm curious that one of the common prognostic protein factors was AR, the androgen receptor.
And I think that kind of goes back to the question that one of our colleagues was asking about ER and lung adenocarcinoma. What's known about the role of androgens in the growth of ovarian cancers or other women's cancers?

You mean, ER or?

I thought you- no, AR.

AR. Which person? Well, I'm a statistician. You don't expect that. That kind of answer. Well-

Thank you for that great answer.

Okay. Well, at least it is significant statistically, and we found that in many studies AR is a significant predictor in ovarian cancer, yes.

I have a question that's hopefully more up your alley. So I was wondering, you used an L1 regularization. There's a lot of evidence that by mixing L1 and L2 regularization, such as when using elastic net, you could also capture redundant predictors. Did you try anything like that, A? And B, if you did not, another trick is to remove those first nine predictors that you got with your L1 regularization in PROVAR, and then see what's the next best group of predictors for predicting survival. Did you try any of those things?

Are you talking about relaxed lasso stuff?

Yes.

We haven't, but probably we can do that later.

I mean, I would suggest either doing elastic net or seeing what's the next group of predictors, because that would be interesting from a biological standpoint as well.

Yeah, we'll look into it.

Thanks.

Anderson Baylor. I was wondering if you had looked at the patterns of androgen receptor co-activators or co-repressors in ovarian cancer that may correlate with your patterns of expression. Androgen receptor antagonists like flutamide have a notoriously low response rate in ovarian cancer, in the range of five to eight percent, despite the fact that there are studies out there that suggest that the androgen receptor itself is expressed in about 70 percent of ovarian cancers. So I was wondering if you had looked at other factors that may be playing into your observations.
We considered clinical factors, such as age, stage, grade, surgery status, and also BRCA1 and BRCA2 mutations. And those are significant, actually. But when we considered multivariate, multiple Cox regression, PROVAR was the most significant one across the data sets.

Four of your nine antibodies are phospho-antibodies. So could you please comment on the potential technical bias related to the different ischemic times?

We haven't done that functional analysis. We should look at the pathway functions in the four clusters. Probably we'll investigate further.

Thanks.

Okay, thanks.