Welcome to the MOOC course on Introduction to Proteogenomics. Today we have the last lecture of the whole course. In this MOOC course you have been introduced to the concepts of genomics, proteomics and proteogenomics. An effort has been made to help you understand the various steps of data generation, analysis and interpretation. Though the field of proteogenomics is still evolving, its contribution to the development of science, particularly in precision medicine, cannot be underestimated. There are many tools currently being used for proteogenomics. I hope you have gained a good understanding of these tools and of the publicly available resources, which you can start using for your own research. The National Cancer Institute (NCI), USA, has constantly made an effort to bring research communities together to fight the common evil of a dreadful disease like cancer. In this last lecture, you will be introduced to the various initiatives of the NCI toward the development of a cancer-free world. This lecture is essentially a brainstorming meeting of cancer clinicians, researchers and industry experts, which we conducted to mark the Cancer Moonshot India program at IIT Bombay. So let us have this interactive session about Cancer Moonshot India and a perspective shared by Dr. Henry Rodriguez.

It is a great honor to be here. About a year and a half ago I reached out to Sanjeeva and we talked about the program that he was developing here in India. One of the things I ended up doing was contacting a lot of my colleagues who knew his work, and the one thing they did indicate is that the work being produced in his laboratory is exceptionally good. So it is just an honor to be here now, knowing that India has joined this international effort, which is one of the offshoots of the United States government. So what I thought I would do is give a simplistic overview of how we ended up doing what we do now within the National Cancer Institute.
More specifically, I will cover how, two years ago, the Cancer Moonshot actually got established. So let me give a little history here at a simplistic level. People may not know why the National Cancer Institute enjoys these large-scale programs. I actually do not come from a proteomics background. You can look me up. I actually come from a genomics background, and I was classically trained in California and in drug development. So one of the things that attracted me to the National Cancer Institute when they recruited me was its history. And to follow that history, you need to understand where genomics fits with proteomics. If people talk about large-scale genomics and you mention the National Cancer Institute, the one program everybody is going to recognize is The Cancer Genome Atlas (TCGA). That program was created in 2006, when it went public; that is the key date. Over a 10-year window, the program has done an incredible job, from a cataloging perspective, of developing great resources for the public to use. In the span of 10 years, they went through about 35 different cancer types, cataloging over 14,000 individuals. But the part that a lot of people might not be familiar with is that when they were developing The Cancer Genome Atlas, they did not simply want to go after the genes. When the National Cancer Institute was formulating this program, they also wanted to go after proteins. So at the very same time that we launched the TCGA program for the genomics landscape, we went after proteins. That program is the one affectionately referred to as CPTAC. Now, they wanted to go after proteins at the same time as genomics for two basic reasons, which have come up in the various sessions held here. One of them is that you absolutely want to figure out the biology of cancer.
I am one of those individuals who thinks biomarker discovery is great, but unless you understand the biology of the disease, it is very difficult to take a novel discovery, which is an anecdotal observation, and make that discovery clinically actionable on a wide scale. That is very difficult and, quite honestly, very rare. So understanding the biology is extremely important. The other reason it is very important to understand proteins is that they are exactly what your therapeutics go after; the key word is therapy. While immunotherapy is very promising, the vast majority of drugs that we still give to patients are typically chemical based. And among those chemicals, there are very few that target DNA, for example by binding its strands. The vast majority of drugs target a protein. So from those perspectives you really need to understand the quality of these proteins, and exactly the efficiency and the binding constant of the target you are trying to go after, and not by inference, which is what is commonly done. But here is what happened. Before The Cancer Genome Atlas got launched, we were starting to think about it in the early 2000s. That is when the first draft of the human genome was created, and that really led to this great interest in looking at the molecular biology of cancer. But at the same time, a publication was released that looked at early-stage ovarian cancer using what was then an emerging technology, mass spectrometry. The authors claimed that they were able to use omics without even identifying the protein, simply recognizing a pattern from an instrument and using that pattern as a predictor of early-stage ovarian cancer. It was very promising, and it raised a lot of interest among a lot of cancer directors back in the U.S. But unfortunately, it was found that the way the study was designed and the way things were interpreted were not correct.
So what we ended up doing at the NCI was quite interesting. When it came to CPTAC, we did not go after biology in 2006. Unlike in the genomics landscape, they felt at the time that the technology was premature and that you could not trust the data. So CPTAC first had to show that you could take these emerging technologies and do your best to understand the analytics: standardize where you can, and if you cannot standardize, try to harmonize the measurements in the analytical workflows. Once you are able to show that you can come back with measurements that are representative of biology, and not representative of an artifact of the way you take a sample, process a sample, or run your instruments, then the board would give us permission to work with the biology. So this is what we ended up doing, which is quite interesting because it is very rare for the National Institutes of Health to run a standards initiative. But we ended up doing that to bring proteomics to the state of genomics. So for the very first five years, CPTAC basically went after the analytics of mass spectrometry, and we looked at it on two fronts, which were discussed over the past two days. One, we looked at the discovery space, which a lot of people refer to as shotgun. I quite frankly am not a fan of the term shotgun, so I tend to compare it to genomics: this is a deep-dive, comprehensive measurement, trying to look at everything we can at this level, exactly as is done in genomics. And in that space, we basically showed what happens if you have standard operating procedures. We did elaborate round-robin studies: we had eight laboratories throughout the U.S., and we did an international run, and we showed very good concordance of the measurements across laboratories. But the one that we really wanted to put our mark on is the one that mirrors exactly what we do in genomics.
Once you do a deep dive in genomics, you then develop gene panels. These gene panels are what actually drive trials today. So we wanted to develop that same space when it came to proteomics. Now it turns out this measurement technology is what a lot of people refer to today as multiple reaction monitoring (MRM). You have different ways of phrasing it, but this is not something that CPTAC developed by any stretch of the imagination. It has been used in clinical labs for over 30 years. They were simply using it for the measurement of small molecules. What we wanted to show is whether what other laboratories were already using, this targeted mass spectrometry measurement, could be applied to measuring the amount of a peptide, with informatics then stitching that information back up to the measurement of a protein. So what we ended up doing there was to look at MRM, which was already being used in laboratories, but for which the real due diligence on accuracy and precision across multiple labs had not yet been demonstrated. It is very important to have that proof before you go back to a tumor board. So again, we developed a round-robin study. We had laboratories distributed around the U.S. Then we did an international round-robin study, and we showed that this is a very good, quantitative, reproducible measurement tool. The other thing that we wanted to do at the time was explore the clinical space: if you find an interesting biology, and if the biology is best measured using these technologies, what would it take to get the technology approved by the regulatory agency? One of the things we did was quite nice. In the US, to get a diagnostic device approved, such as an IVDMIA, you need to get regulatory clearance, and there are two types. The first one you need to go after is what is referred to as the 510(k). So we actually worked with the regulatory agency in the United States. We worked with the clinical chemistry community.
What we ended up doing was quite novel. Typically, manufacturers will submit a 510(k) to the regulatory agency. Then the FDA will mark up the document, putting in all their comments and concerns on what was just submitted. But typically, when that goes back to a company, they will never release it to the public. We decided we wanted to make that very transparent. So we worked with them. We put on a workshop and we actually submitted an official filing to the regulatory agency using this type of measurement technique. Because we made up all the data, but did not make up our analytical workflows, it allowed the FDA to mark up the document. And then once we got the documents back, because we worked with the clinical chemistry community, we published all their markings. So it is a great way of making very transparent exactly the kind of questions you would get if you were to submit these measurement techniques to get them approved by the FDA. The other thing we recognized is that a lot of the reagents were being commercially sold. We felt that the quality was not at the level of standards that we wanted to see within the research world and ultimately within the clinical-grade world. So we worked with various manufacturers to elevate the standards for when they sell these reagents in the public domain. The other one was that we started going to meetings and people were always saying, I have an assay, I have an assay. In fact, at one meeting that I went to, one person stood up and said quite openly, I already have an assay for every human protein that is out there. I was quite surprised when I heard that. After the meeting, I approached this individual and said, explain to me how you developed an assay for every human protein. It turns out that what they were talking about was a theoretical assay, where an assay is basically run in a buffer. In a clinic, that is not considered an assay in that sense of the term.
So what we decided to do within CPTAC was to start developing fit-for-purpose criteria that begin to define exactly what an assay is. What is nice now is that this has been accepted by the international community, and specifically by one of the prominent journals, MCP. They have now adopted those criteria within the journal itself. And this was also done with the pharmaceutical industry, with clinical reference laboratories, with the regulatory agency and with clinical labs in the United States, to develop these analytical criteria. So this now represents five years' worth of history for the CPTAC effort. Again, we had to go back and show that the measurement you are able to obtain is trustworthy, that you can actually believe the measurement is representative of biology. Once we did that, we went back to our board and then we got the program reissued. What we decided to do next was quite interesting, because obviously at the NCI we had The Cancer Genome Atlas, which by then had the same five years' worth of history as our CPTAC. They were generating a lot of interesting information, and our proposal to them was that we wanted to take the exact tumor that had just been genomically sequenced within The Cancer Genome Atlas program, and at the same time go after the proteins within that sample. We believed that if you were to layer a comprehensive protein measurement above a comprehensive genomics measurement, you would be able to obtain additional biology that is either difficult to obtain or simply not feasible through genomics itself. Now think about that. Because at the end of the day, and I saw it today, a lot of people say, well, proteomics is always much better than genomics. The reality is, when you move into a clinic, two things are going to drive your decision. The first is whether the test is clinically relevant for the disease you are trying to go after.
And the other thing they are going to ask is, well, how much is the test and what is your throughput? Because the reality is, if transcriptomics is able to predict the same thing as proteomics can, somebody at the hospital is going to say, well, why do I need to measure my proteins? It is lower throughput and it is a higher cost. So that was the gamble that we took. Could we find additional biology? So in the next five years, we decided to go after three cancer types from The Cancer Genome Atlas: breast cancer, ovarian cancer, and colorectal cancer. I am not going to go through all the details, but suffice it to say, in each one of these cancer types we were able to identify additional biology that had been missed, either because you simply cannot obtain it from genomics itself, or because there is a better way of integrating the data sets between, at least, the genomics and proteomics landscapes. Furthermore, what we learned from this was that if you simply go after one type of omics, whether it be genomics or transcriptomics or proteomics, most likely you are going to be missing key biology that could be inferred from one of the other omics. So integrating those worlds became very important for our program. With that in mind, I heard that question today, so I quickly put this next slide in. One of the things that people were asking me, before I ended up going back to my board to come up with the next version of CPTAC, was the exact question that somebody just asked at today's conference. And that is, well, you ended up doing proteomics and you layered it above genomics, and you found all this additional biology, so why don't you just do proteomics? Because it is much better. Well, my philosophy has been: should you do genomics or proteomics, and which one is better? Well, no one really knows the answer yet. And here is why. I will break it down into two components for you.
One, let us look at biology itself, because that is the part that I tend to love the most. If you look at The Cancer Genome Atlas, even though TCGA had been there for 10 years, it really was not clinically deployable yet. You were basically trying to figure out biology, because those samples were not collected with a clinical question in mind. It was basically cataloging samples. TCGA, as I said, went after 34 cancer types in a 10-year window and genomically characterized over 14,000 individuals. In the process they identified all these actionable mutations for which we now have small molecules, and these small molecules are driving a lot of our precision oncology trials. So that is the good news. Now you can look at the other side of the story, which is what we are learning now that we are about three to four years into this sort of science driving our clinical trials. More and more we are learning that, of the individuals in whom we identified these actionable mutations, a lot really are not responding that well to the treatment being administered within the treatment arms of whatever clinical trial they are enrolled in. Furthermore, if they do respond, we are finding out that those responses are actually short-term. This includes toxicity that can occur, so that you have to put them on another arm within that trial itself. So the question becomes, why? We do not know. But what we do know with absolute certainty is that there is still a tremendous amount of biology missing from that picture. Now, this is the biological version. Let us flip it to a clinical way of thinking. A great little paper came out about three years ago by Tito Fojo, who used to be at the Cancer Institute and moved to New York City. What Tito ended up doing was quite interesting.
He basically said, let me go through the regulatory agency back in the United States and ask the question over about a 12-year window, from when people started using the terms precision medicine and precision oncology. The reality is you are talking about targeted therapy, and if you look at all the targeted therapy drugs that have now been approved by the FDA, there are just over seventy. The window spans about 12 to 15 years, and what he ended up doing was to take two common criteria that are used all the time in our trials. When you look at the individuals you ask, well, what is going to be their overall survival? And at the same time, what is the progression-free survival of these individuals? He excluded the exceptional responders, which is what a lot of people love to go after, and again, to be very fair, this is solid tumors and typically mid-stage. What he found was that for all those drugs that have been approved from a targeted perspective, on average, on those two criteria, it is less than three months. So that is really not that good. Again, precision oncology is very promising, but we could still do a lot better. So these two arguments, the argument of missing biology and the argument of whether you can begin to play in the sandbox of clinical trials, directly influenced the next iteration of the CPTAC program. So this is CPTAC today. Again, these are five-year programs, and we know that the next iteration will be contingent on the science that comes out of these programs. So we still have a biological arm, which is exactly what we did before. Within that component we are being held responsible to go after at least five more cancer types, hopefully more, but at the minimum five. The samples that we get from our patients are all treatment-naive samples. For every sample, we partner directly with The Cancer Genome Atlas, and they do comprehensive genomic characterization. Then the samples also go to our
characterization centers, along with our data analysis centers. At the same time that we run a biological arm, we now have an official translational arm. For the very first time, the Cancer Institute is partnering proteomics laboratories with NCI-sponsored clinical trials. We have three cancer types that we are going after for that component, which involves a series of drug trials. But again, the part that I think is quite nice about CPTAC is that, from the day we were born about 12 years ago, everything that we produce we put in the public domain, which is listed at the bottom of the slide. Everything from genomic information to proteomic information, the reagents that we develop, which are typically antibodies, any of the assays, and all the SOPs that we produce our assays against: all of that is placed in the public domain. The argument is that we know it is certainly being used by the community, and we believe that by giving it back to the people it drives the science, and hopefully you are able to accelerate patient care, not just within your country but across the globe, because that is ultimately our goal. And so once this program got launched, here is something else that started to happen at the Cancer Institute. While CPTAC was one of the first to start mixing these two worlds together at an official programmatic level, it is not the only one doing that now. There are two others that were launched just over two years ago because of the Cancer Moonshot. One of them is referred to as the APOLLO program, and the other is referred to as ICPC, the International Cancer Proteogenome Consortium. These programs actually got started as follows. I think a lot of us, and especially myself, were extremely moved in early 2016 by the inspirational call of the former vice president of our country, Joe Biden, when he launched the Cancer Moonshot. Now you can
go through all the details of what the Cancer Moonshot is. What I enjoyed the most was trying to simplify it into three simple objectives that we want to achieve. One of them is that we want to accelerate progress in cancer research. There are many ways you can do that: you can do technology development and other components. But the other two were the ones that I have always had my heart in, which is what CPTAC has been doing for the past 10 years. One is that they wanted to see greater cooperation, but the way this was phrased, it was not within your own university and not within your own country; they were hoping to explore international collaboration. And the third one is the one that I was very happy with: they wanted to see a lot more sharing of your data. The reality is that, if you look at the genomics landscape, a lot of the information you are developing is pretty much pre-competitive, because it consists of observations, and a lot of those observations do not yet have clinical relevance behind them. So releasing the data is not detrimental. In fact, it is actually beneficial, because other individuals are able to take your data sets and recognize more in them, and again you are able to drive the science a lot further. So using that as a backdrop, the very first one that we wanted to pilot, taking the CPTAC model and trying to bring other organizations into it, became one that involved the US federal government, specifically three of our agencies. That program is now referred to as APOLLO. Right across from my hospital happens to be one of the largest hospitals for the Department of Defense, which houses the Murtha Cancer Center. In fact, that is the one where you see a lot of the presidential helicopters, and where members of Congress and other representatives get treated. That said, we basically walked across our street and met the cancer center director. There was an opportunity to pair up the National Cancer Institute, the Department of
Defense and the Veterans Administration, and the deal was very simple. They would begin to adopt a lot of the metrics and a lot of the standards developed by CPTAC, and they would also begin to implement this sort of proteogenomic approach to look at the science of their veterans and of their family members. And the part that is quite nice, again, is that when we signed this partnership, the one main criterion was that all the data they produce would be placed in the public domain. So that is what involves the US federal government. The next thing I find out is that I am being called by the White House, by representatives of the Cancer Moonshot program. They said, we kind of like what you ended up doing from this federal government perspective; is there an opportunity to take this to an international level? I started to think about it, and I have to admit the idea was quite appealing. So this is what we ended up doing. I started to ask myself the following: what if you could take this proteogenomic model and begin to scale it to an international level? If you were to do that, then each country, which is best placed to make these decisions along with its government representatives, would decide which cancers are of most significance to its own nation. Furthermore, they would adopt a lot of the metrics and standards, if applicable, developed by the US CPTAC program. But again, a part that becomes very critical for me is that any data you produce from any of these official partnerships, regardless of what country you are from, we want to see placed in the public domain. You can host that in your country, or the United States would host those data sets for you. That becomes a key criterion when we decide on partnerships. So that said, here is what transpired from early 2016 to mid-summer. In January the Cancer Moonshot was launched, then we quickly launched APOLLO, and then
I get this call: we would like to see this scaled to an international level. The very first country that signs on, in July, is Australia. So at that point we have four institutions within Australia that hold this partnership. I thought my job was done and I could give myself a nice pat on the back. I got one down at the White House. I basically go home and tell my daughter, oh, look what great stuff I am doing. Next thing, I get another phone call a week later: we loved what you did; is there an opportunity to expand this? And by the way, we want to see it expanded in 8 weeks. I had no idea why 8 weeks was important. I found out later on that the reason was that something was happening in New York City at the United Nations, but they were not telling me at the time. But I felt it would be something very interesting to go after. After all, it is very rare that you ever get to work for leaders within your own country, so I thought it would be interesting. So we did. In the span of 8 weeks we went from one country with four institutions to 8 countries involving 16 institutions. Pretty impressive, but it is amazing what happens when you send an email that says, on behalf of the Vice President of our country, there is something we would love to see happen. It is amazing, because my name does not carry much weight; other people's names do. Now, this is September of 2016; keep that in mind. Right now we are pretty much at the end of 2018. So the question is, whatever happened to this program? Well, this program has taken on a beautiful life of its own. Today it has an official name: it is now referred to as ICPC. The program now involves 12 countries, spanning 31 institutions. Collectively they are going after 13 cancer types. They are not all different cancer types; some do overlap, but that is fine, because the vision I have always had for this is that, while the US has produced its database, to ultimately understand cancer
you really want to make it representative of the diversity of individuals and of the diversity of their cancers themselves. It is with that culmination, I think, that you are able to better understand the disease on a global scale. So what has the program done in the past 12 months? In the last 12 months alone, these are some of the activities that the program achieved. The very first data set that we released to the public actually comes out of Taiwan. It was a cool little study on the cancer type they happened to choose, and it was put into the public domain. We also welcomed at that point three institutions spanning two countries: early in the calendar year we brought in Korea University, and of course, as Sanjeeva mentioned, the other country that we brought on board was India, in May of 2018. We also held a series of local cancer round-table sessions, basically raising awareness within each country and within their governments, which helps them raise money to do the research. That is one of the good things that we are doing here today, and I think we should be having a lot more of these to raise more awareness and funding. The other thing we ended up doing is that we launched a training program for students, which we piloted with Australia and Korea University. And the other thing that we are starting to do, which is quite nice, is to take some of our laboratories and convert them to become CLIA certified, so that when they develop a targeted assay, they can take the actual test directly back to their own networks and potentially begin to fuse together the genomic panels with proteomic panels in influencing how best to treat the individual. The last time we all got together was in the United States, in Orlando, Florida, just a couple of months ago, and as you can see it is a great family event. I have to say there is a big tradition now that
we have started: the representatives from each country all hold flags as a sense of pride. But again, it shows one thing that I have learned from this. Two and a half years ago when we thought about this idea, and then two years ago when we actually launched it, everyone would say, you will never get the data placed in the public domain. It is happening. We are starting to release it, and there are other cancer data sets that will be released within the next six months, once those manuscripts get accepted. There is really no barrier, which is what I am learning: if you simply ask, and you are very careful and very clear on what you expect, things can happen. So let me leave you with this final thought. I think a lot of progress has been made when it comes to oncology, and I am the first one to admit that, because I see how it definitely benefits patients. However, here is the reality of the statistics today. Within the United States, on a yearly basis, on average just shy of two million individuals will hear the words you never want to hear: that you have been diagnosed with cancer. And I tell everyone who is in a research lab, go spend time or train in a cancer hospital, because then you really understand the impact of what patients go through when they are being treated, and of the ones who do not survive the disease. Furthermore, on average within the United States, just over half a million individuals pass away from one of its many diseases. And it is not just one disease; it is a series of diseases that defines cancer. But this is not an issue just for the United States. This is why we developed ICPC, and I think it is great that we have India on board. This is a global issue. On a global basis, on average on a yearly basis, 14 million individuals are diagnosed with cancer, and these are the ones that we are able to report. Furthermore, just shy of 8.5 million individuals will die, again from its many diseases. So while I think a lot of progress has been made, I think
there is a tremendous amount of work that we still need to go forward with. And quite frankly, I think the key is the work that people are doing here, and the idea of combining what we have learned from genomics and fusing it now with the measurement of proteomics, and in the future, I think, fusing it even with metabolomics. Where technologies have become very mature and there is an opportunity to combine them, that is the opportunity to take, because you are able to get more biology out of the disease itself. So with that, I want to say thank you very much.

In today's overview session, you were provided with knowledge of the various programs run by the National Cancer Institute in the United States. It was clear from Dr. Henry Rodriguez's lecture and discussion that genomics and proteomics are complementary, and that both are indispensable for understanding disease pathobiology. You were also introduced to the importance of generating high-quality data and to the various efforts that CPTAC undertook to make proteomics more reliable within the research community. The Cancer Moonshot program aims at collaborating with international labs to gather comprehensive proteogenomic information on various cancers. India has recently joined this initiative, and we have now become the 12th country to participate in ICPC, the International Cancer Proteogenome Consortium, to look specifically at breast, cervical and oral cancer. We are sure that you will be able to appreciate the importance of international collaborations, data sharing and proper quality controls if we are to understand the disease biology and find drug targets against cancer in the future. The field of proteogenomics is still emerging, and every day new software and new tools are being used. It is not possible to cover all of them in this course. However, we hope that with this course we have been able to lay a foundation and instill in you the enthusiasm needed to take proteogenomics research forward. Thank you and all the best.