Dr. Henry Rodriguez has provided a broad and comprehensive overview of the need for proteogenomics research. In today's lecture, he will discuss two more recent initiatives: the International Cancer Proteogenome Consortium (ICPC) and the Apollo Network. The Cancer Moonshot program is a very ambitious initiative from the US government, and it is now expanding to other countries to build networks through which all participating countries can share data and so accelerate cancer research. These initiatives led by Dr. Rodriguez are not only accelerating cancer research but also helping worldwide data sharing. Dr. Rodriguez will talk about the Apollo Network, which stands for Applied Proteogenomics OrganizationaL Learning and Outcomes. The emerging field of proteogenomics aims to better predict how patients will respond to a given therapy by screening their tumors for both genetic abnormalities and protein information, an approach that has been made possible only in recent years by advances in proteogenomic analysis. Dr. Rodriguez will show how the Cancer Moonshot can accelerate cancer research and make more therapies available to more patients, while also improving our ability to prevent cancer and detect it at an early stage. Lastly, Dr. Rodriguez will talk about the NCI Genomic Data Commons, whose mission is to provide the cancer research community with a unified data repository that enables data sharing across cancer genomic studies in support of precision medicine. So let us welcome Dr. Rodriguez for today's lecture.
So CPTAC today is not the only program that begins to blend the worlds of genomics and proteomics. Two other programs have come into play, and a lot of that is attributed to the Cancer Moonshot. One of them is referred to as Apollo and the other is referred to as ICPC, the International Cancer Proteogenome Consortium.
And again, the part that is quite nice about all three is that they blend these worlds together, and everything that we produce genomically, transcriptomically, proteomically, and from imaging, both from the pathology suite and from the radiology suite, we place in the public domain. So that said, what are these other two programs, Apollo and ICPC, that come out of the Cancer Moonshot, and what is their relevance? The Cancer Moonshot is an interesting program. It was launched in January of 2016, and at the time it was being led, in fact inspired, by the former vice president of the United States, Joe Biden. The part that struck me the most was the simplicity of the overarching goals that the Cancer Moonshot wants to achieve. The first was to accelerate progress in cancer. In other words, if something would take ten years, can you condense it down to five years? There are many ways that you could achieve that, to be quite candid. But the last two are things that the CPTAC program within the US has been doing for years. One is greater cooperation and collaboration. To be very clear, that is not simply collaboration within your own laboratory, with your colleague next door on your lab bench, or with another person in another lab within your institution; it was really implied on an international scope, which is what they were trying to achieve. The other one was data sharing: take all the data that is deemed pre-competitive and place it in the public domain as a way of accelerating progress in cancer research. Now, when this came out, I was asked by the White House Cancer Moonshot Task Force to come up with some ideas behind it. The Cancer Moonshot is much larger than these two programs, but one of the things they asked individuals was: hey, can you come up with ideas that might be interesting to develop?
So what started going through my head was basically taking this idea of what CPTAC has been doing at the National Cancer Institute, blending proteomics with genomics, and beginning to expand it. So one of the questions was: can you develop additional efforts and research groups that now have an interest in blending these two avenues? At the same time, the different organizations would be the best ones to determine which cancers are most relevant to them, whether the organization is in the United States or outside of it. Furthermore, because CPTAC has spent, quite frankly, a lot of time and money developing a lot of these metrics, people will begin to adopt them as appropriate. They're not forced to do it, because I think that's wrong, but if it's appropriate we want that to happen. But the part that was really nice is that everybody would sign what we now refer to as a data sharing pledge. The pledge is basically a document, and it says that if you wish to partner with our organization, the information you produce from this research collaboration will be placed in the public domain. We'll host it at the NCI, or you can host it anywhere you want, but we want to see it in the public domain; that was the key thing for us. So for the very first one, we decided to keep it within the US. Right across from our hospital in Washington, DC happens to be the naval hospital, and that was the first one that we decided to deal with. That program is now affectionately referred to as Apollo. Apollo involves the National Cancer Institute, the Department of Defense, and the Veterans Administration. Apollo basically takes the CPTAC model, and we're rolling it out across all the VA hospitals and all the military hospitals within the US.
The ultimate goal is to conduct research that blends the existing genomics-based methodologies, which have been driving a lot of patient care, with the proteomics landscape. Now, when this one got completed, I thought my job, quite frankly, was done. I could give myself a nice pat on the back. I could go to my wife: hey, you won't believe what I just ended up doing. My daughter would be like, wow, that's amazing what you ended up doing. It turns out it wasn't that easy, because when this got done, I got another phone call. And the phone call was: well, we love what you ended up doing here, but can you bring outside countries into this mix? I thought it was an interesting call, so we decided to take on the challenge. So in the summer of 2016 we decided to take the program to an international level, partly because we also had small collaborations with some institutions across the US, but we decided to formalize it through the Cancer Moonshot activity. The very first country that signed on to this idea of developing a partnership with the National Cancer Institute was Australia. We had Australia on board and we brought in four institutions. Now, again, I thought my job was done. I had satisfied a phone call. Nope, it's never that easy, it turns out. When you deliver, people typically want more. So, as a rule of thumb, you want to underpromise and overdeliver, which is what I've learned, because the minute you start delivering, everybody expects even more. So I ended up getting that second phone call. The second phone call was: okay, we love what you ended up doing with Australia. And in fact it was great, because the former vice president Joe Biden flew to Australia and did his big opening. Then another phone call came down the road, and the question was: well, can you bring in other countries? And by the way, you have eight weeks to do this. I had no idea why eight weeks was relevant.
I found out later on that it was going to be announced at the United Nations. But nevertheless, I decided to take on the challenge again. So in July of 2016 we started from one country and four institutions, and in a span of eight weeks we expanded this to eight countries and brought in 16 institutions. That was September of 2016. Obviously now we're in December of 2018. So the question is, whatever came of this program? To my surprise, but also to my pleasure, I have to admit this has now taken on a life of its own. It is now officially known as ICPC, the International Cancer Proteogenome Consortium. It now involves 12 countries spanning 31 institutions, collectively all working together on just over a dozen cancer types. Some of these cancer types do overlap, but that's actually fine, because there is something I've wanted to know for years. Why, for example, do women in the United States, who are predominantly going to be of European descent, develop breast cancer? For these individuals, you find out, a lot of it is going to be smoking based. But then you go to Asia, where a lot of women really don't smoke, and they're developing breast cancer. For me the goal of ICPC is very simple. Ultimately, what we want to do is develop a database, a resource, that is finally going to be representative of the diversity of individuals, along with their cancers, across the globe, and give all the information back into the public domain. So what has the program done in the past 12 months, not since the time it was created? Here's an example of what we've done. Start with the very first data set, because at the time people said, oh, you'll never get other countries to make the data public; no one's ever going to do that. I have no idea why people say this, because if you sign a paper and that's part of what you sign on to, it's like a marriage contract to me. You said "I will, I do," and you expect something to happen.
And we're not having difficulty thus far. So the very first data set that got released was in September of 2017, by our colleagues in Taiwan: a very unique study that they did on oral cancer, which is very prevalent there, especially within the rural population, because of the betel nut that they chew, along with all the components that they add to it. At the same time, last year we held local Cancer Moonshot workshops as a way of raising greater awareness within the local municipalities. The same thing is going to be happening here with the Cancer Moonshot activities: raising awareness helps those individuals, those universities in those countries, raise their own capital funding to launch larger initiatives of their own. One workshop was held in Australia and the other one was held in Sweden. We also officially welcomed, last calendar year, two or three additional institutions spanning two countries. The very first one was Korea University, which joined us in October of 2017, and of course India joined in May of 2018. We also piloted an international student exchange program between Australia and one of our laboratories based in the US. The other thing we're starting to do, because all of this is pretty much for research use only, is to convert some of these laboratories at an international level to become CLIA certified, and these are the targeted assays. Why would it be advantageous to be CLIA certified? Because that means you could take the information that comes out of your instrument and take it directly back to a tumor board to give it back to a patient. So we're starting to build the infrastructure more and more on an international scope.
This is a hugely fun program, I have to admit. We get together now at least once per year; the very last time we got together was in the United States, in the state of Florida. As you can see down on the right, these are the other times that we've gotten together. Now, as a point of pride, at the meetings we all hold the flag of the country we represent, because it's really multiple nations recognizing that cancer is not something that's locked to one nation; it's something we need to resolve through an international effort. So in the last five minutes let me talk about big data, because I mentioned that we give away all these data sets. First of all, I love the terminology "big data." I have to admit I have no idea what that means anymore, and being here the past couple of days, people are talking about precision-based measurements. What exactly is big? That's a very subjective term. Is this big, or is this big, or is that big? But people love to use phrases, so I'll use them in my talk. When people talk about genomics within the National Cancer Institute, one of the things that we've started to do is make it easier for people to get access to our data sets.
So what we've done within the NCI is develop data commons. In the genomics landscape, just last calendar year, we launched what's now known as the Genomic Data Commons. This is predominantly the data sets that come out of The Cancer Genome Atlas. The part that's quite nice is that the ultimate goal is for everything to be based in the cloud; you don't even have to download the data packets anymore, and a lot of the software tools are all dockerized within the cloud itself. So that's the genomics landscape. The question becomes: well, NCI, what else are you doing, because there's more than genomics? And I remember, Rodriguez, you just said that you love the proteomics landscape, mixing it with genomics. So here's what NCI is now doing. The ultimate goal for the Institute is no longer to have just this Genomic Data Commons; it's to have a Cancer Research Data Commons, and that's going to involve multiple modalities of the different types of omics. Obviously the one that I want to point out, which is why we're here, is that we're going to be building a Proteomic Data Commons, and you'll be hearing lectures tomorrow on exactly how the proteomics information is slowly being rolled into this landscape. But if people want to play with it today, they did have a soft launch this calendar year. Here's the website. Please go to it, look at it, play with it, critique it, and provide your comments, because those are the things we're trying to gather. This is basically like when you open a restaurant: it's a soft launch. So this is our alpha launch equivalent, but it's out there now. It's predominantly populated with the CPTAC data sets, but it all links directly back to TCGA, and obviously the goal for us in the future is to have everything based in the cloud, along with the computational capability that we want to put into it. But ultimately the goal is simply to have a Cancer Research Data Commons. Now, lastly, there is this one: these programs produce a lot
of data sets, I have to say. So what do you do with the data? Well, for our investigators, we do give grants, and they try to analyze the data the best they can. The question becomes: have you really extracted all the knowledge out of the data that you possibly could? Quite frankly, I have to admit my answer for many years was, of course, we've looked at everything you can look at in the data sets. Then, about five years ago, I ended up meeting an individual named Gustavo Stolovitzky, who came up with something called DREAM Challenges. He has an academic appointment in New York, but he predominantly works for IBM, and his comment to me, quite frankly, was, "You're an idiot." I actually liked the way he phrased it, because my comment was, so why do you say this? And he explained to me that he created something called challenges, where he basically takes existing data sets that are out there and challenges the community, just like you were doing with these questionnaires: can somebody come up with a better way or a better algorithm to go after information that you either could not extract or did extract, but not as efficiently as is possible now that better tools are being developed? So what I came up with to show the goal is this easy cartoon. There's a cool little website where you can take a picture of Einstein and type in questions, and it appears that Einstein is writing them. For me, this is exactly what I used to do when I was in a university setting. Typically, what you do when you run a laboratory is you do your experiments, you collect a lot of information, you find very nice correlations, and you try to develop these fancy graphs, like the volcano plots I saw being talked about. Yes, very attractive. They're quite complicated; I think people just nod their heads and say, yes, I understand. I don't know if they do. But at the end of the day, what you want to do is basically publish a paper. It's a very
effective model, because what you have is these individual laboratories, which are very elaborate, like artisans. That's where a lot of the creativity is, and I love that landscape. But the question becomes: what if you take that information, put it in the public domain, and begin to crowdsource a question? That's what we wanted to explore. The very first one that we did was about two years ago, and it was the very first proteogenomic computational challenge that was crowdsourced. We teamed up at this point with the DREAM organization. We also brought in NVIDIA, the graphics chip manufacturer. We partnered with them; they gave us some server frames for what they wanted to do, and they also contributed a monetary award. And we also partnered with Nature Methods. You're not guaranteed to get your findings published, but you're pretty much guaranteed that they will be sent out for review, which is quite nice. At the end of the day, we thought we would get maybe 50 individuals applying to the challenge. To our surprise, we got over 500 individuals, and they spanned 20 countries. Now, this challenge was predominantly a biologically driven challenge. We took our data sets and asked simple questions. Challenge one, as you can see, is basically: if we give you DNA and RNA, how good are your predictors at determining protein abundance? If we give you DNA, RNA, and abundance, how good are your predictors at predicting phosphorylation? The way a lot of us now like to phrase it is: the good news is that you have winners. The very good news is that it turns out the computational tools are still not as good as a physical measurement, so people who are going into proteomics will clearly have jobs for years to come, which is what I would like to say. But here's the part that really struck me the most: the universities or the institutions
that we thought would have won were not the ones that won at all. In fact, there were groups that we had never heard of within my own program. Two of these challenges were won by colleagues at the University of Michigan, and another challenge was won by a group at Korea University. Now, this was a biologically driven challenge. It caught the attention of the Food and Drug Administration back in the United States, and we decided to do a new challenge with them. So now the FDA has launched the very first regulatory proteogenomic challenge, and here it is; it's still technically ongoing. This one again looks at crowdsourcing. It is no longer run as a DREAM Challenge, although it is still a partnership with them; it is pretty much with the US FDA. What we're looking at here is something that can technically happen: you take an individual sample, and ultimately the sample goes to multiple laboratories, one for genomics, one for maybe metabolomics, one for proteomics; in our case, genomics, transcriptomics, and proteomics. What if one of these samples becomes misplaced or mislabeled, and the information comes back? So we asked that question: using genomics, or taking genomics and adding in proteomics, how easily are you able to identify where the mix-up occurred and, most likely, put the information back where it belongs? This is the very first one with the FDA, and it's still open if you quickly want to join it. Hopefully what comes out of this is that it shows that when you put the information in the public domain, there are other uses for it. So lastly, what I want to say is that I hope I've been able to demonstrate what we've been doing at the NCI and now with these Moonshot-based efforts, whether in the US or at an international level. At least for me, I'm one of these converts. I actually come from a genomics background, if people ever look at my old history, but I'm actually
convinced that if people talk about precision medicine or precision oncology, for me that's really what I like to define as a team sport. It's not genomics in isolation, it's not metabolomics in isolation, and it's not proteomics in isolation. If these technologies are robust, mature, and quantitative, and there's the ability to combine them, to me that's really what fulfills the underlying story of biology and could really push precision oncology even further. So with that, I want to thank everyone for your attention. I'll be more than glad to address questions at this point. Thank you, everyone.
Sir, though it might be more ambitious: with the NCI Cancer Research Data Commons, how do you include aging and diabetes in it?
Okay, so if I understand it correctly, the question is: for the data commons of NCI, can you include aging and diabetes? Aging is part of it, because that's part of the metadata that we want, right? All the electronic health records come along with it. But diabetes is a very difficult one, because, the way that I phrase it, if you look at NCI, the C is cancer. But you know, it's interesting, because within our assay portal two years ago, people were asking me, we would like to deposit our assays into your portal, and they were actually coming from the diabetes institute. I'll be honest, my first reaction was no, because ours is oncology. But the more I started to think about it, really almost all biology plays a role in all these diseases. So I'd say, as things evolve, there are always these possibilities.
Yeah, so within the US there are a couple of sites where you're allowed to download certain types of information. But when you apply, obviously you have to identify what the information is going to be used for, because there is a higher level of criteria. So typically, treatment-naive
information is simpler to get access to, but the more you start moving into that space, even if it's de-identified and it is made available, there's just a higher rigor to get access to it.
NCI is now also investing in alternative medicines. Is there any kind of future thinking on that?
So when you say alternative medicine, I'm assuming you're talking more about natural products?
Yeah.
So that one, I actually don't know if that sits in the population set. I do know NCI does have a big repository of natural products, and they're trying to identify more applications that they could be applied to. Quite frankly, I don't see why that wouldn't be part of it, but at least I'm not aware of any activity at the moment.
I'm not from a cancer background, but I'm curious to know about this. It's fine, whatever you are doing to treat cancer, but is there anything on what kind of people are most prone to cancer?
So let me make sure that I understand the question: are you able to identify individuals who could be more susceptible to the development of cancer? Those are epidemiology-based studies, right, where you're trying to understand the environment, the food, all those components. Those are separate components within the NCI; those things do exist as part of the organization, and they go after that.
That is the number one question that I get: how come you guys don't do metabolites? My god, you guys need metabolites. So, yeah, I personally like to simplify things, and from what I've seen, it is already so complex, quite frankly, just to mix proteomics with genomics and transcriptomics; adding another layer makes it even more complex. So what I always tend to want to know is a very simple question. You ask the people who already understand that disease: do you already use metabolites, first and foremost, to screen something in that
individual for the disease of interest? So, for example, GBM, right, brain cancer: people develop panels now. So we do have a GBM project, and metabolites are part of the formula. But what we try to avoid, because it just adds additional time and cost, is doing something just for the sake of doing it; we want to make certain you have some logic for why you would like to do it. The reality is we are playing with it, but not at the scale at which we mix DNA, RNA, and proteins. It's just a side project at the moment.
In conclusion, we have seen how much initiative has already been taken by the NIH to manage multimodal data in the form of repositories and global databases. Genomics, transcriptomics, proteomics, and imaging data, when brought under the same roof and analyzed properly, could reveal many new facets and many previously unknown facts. The success story of precision medicine can only be written when a large number of data sets from all the omics fields are analyzed together thoroughly; only then can meaningful conclusions be drawn. We are generating large amounts of data from NGS platforms and mass spectrometry technologies, but whether these big data sets are fully analyzed, and whether proper QC checks have been performed, is something we have to look into very carefully. So let us thank Dr. Rodriguez for his wonderful lecture, which has really set a good stage for this course and shown why there is a need to look at new approaches to proteogenomic analysis. Now we will move on to the modules. The very first module is on genomic technologies, and its first lecture will be given by Dr. Kelly Ruggles next week, introducing genomics. Thank you.