Welcome to the MOOC course on Introduction to Proteogenomics. What is more powerful, genomics or proteomics? This has been a long debate: what is more robust and more powerful, proteomics-based investigations or genomics-based information? Today's distinguished scientist, Dr. Henry Rodriguez, is going to offer a new answer: the emerging field of proteogenomics can provide us with much more meaningful and more powerful information.

Dr. Henry Rodriguez is Director of the Office of Cancer Clinical Proteomics Research at the National Cancer Institute (NCI), National Institutes of Health, USA. His research has focused on understanding mechanisms of cancer and age-related diseases, including the development of molecular-based technologies in basic, translational, and clinical sciences. Dr. Rodriguez has led the development of NCI's clinical proteomic and proteogenomic research programs, which today include the world's largest public repository of proteogenomic sequence data and targeted, fit-for-purpose assays. His efforts have led to the formation of two Cancer Moonshot initiatives that he developed and co-developed: the International Cancer Proteogenome Consortium (ICPC) and the Applied Proteogenomics Organizational Learning and Outcomes (APOLLO) network. Dr. Rodriguez has been very supportive of including India in the ICPC initiative, and India has now become the 12th country to join this consortium, pursuing cancer proteogenomics research for cervical, breast, and oral cancer.

Dr. Rodriguez will give an overview of the Clinical Proteomic Tumor Analysis Consortium (CPTAC), one of NCI's efforts to accelerate our understanding of the molecular basis of cancer through the application of large-scale genome, proteome, and proteogenome analysis. He will also describe how NCI is taking translational cancer research to the next step. Dr. Rodriguez will talk about how genomics and proteomics together, in the area of proteogenomics, can make a much more meaningful impact. He will address the importance of proteogenomics and how this robust field can reveal answers to different biological questions. He will then present reasons why laboratories worldwide should follow standardized workflows to obtain reproducible data sets, and he will talk about how proteogenomics is providing new prospects in recent CPTAC projects such as ovarian cancer. So, let us welcome our distinguished colleague Dr. Henry Rodriguez for his lecture.

Welcome everyone. My name is Henry Rodriguez. I am the Director of the National Cancer Institute's Office of Cancer Clinical Proteomics Research, and I have to admit it has been extremely exciting and flattering, watching over the past several days, this idea of taking proteomics-based information and trying to blend it more and more with the rich history that has come out over the past 15 years in the genomics landscape. So what I thought I would do is give you an overview of what we have been doing at the National Cancer Institute for about 12 years now, where we see it going over roughly the next 10 years, and at the same time talk about how we have taken what we developed in the US through this program called CPTAC and expanded it internationally; it is really nice to see how India has become the latest partner within this international effort. So let me do this.
The first slide I am going to show is sort of a cartoon, because in one presentation I saw two days ago people were very nice and very scientific: they explained genotype, they explained genes, and then they talked about phenotype. Here is my simplistic perspective on understanding a genotype and how it ultimately rolls up to a phenotype, which is what you really want to get your hands on. Imagine you are at the gym. If you want to picture what a genotype is, which represents your genome, it is in a way your blueprint, and a blueprint says, "this is what I could potentially become." That is your genotype; all the potential is there. But as people know, we all aspire to do certain things, and sometimes those things do not come to fruition. So the reality is that your genotype is what you wish to become, like this individual, Arnold Schwarzenegger, whereas the phenotype, which is your functional space and which you can roughly look at as your proteome, is what you actually become. You do not always get what you want, to put it kindly. That is actually okay, because to understand the different states between the genotype and the phenotype, it becomes really important to blend these worlds together. Quite frankly, I think if you study only the genome and ignore the proteome, or if you look only at the proteome and completely ignore the genome, you are going to miss a tremendous amount of biology, and I hope that in the next 40 minutes I can give you an example of how, in the space of oncology, we have seen that to be the case. More and more, as the technologies become very mature, we are starting to blend these worlds together.
This is the history of the genome as I see it from the perspective of the National Cancer Institute. I was recruited to NCI about 12 years ago, and one of the things I liked about it is that I love organizations that enjoy taking risks. For a very conservative organization, they did take a risk, politically and financially, in the space of omics-based research. If anyone talks about genomics, a lot of people will talk about The Cancer Genome Atlas (TCGA). TCGA was officially launched in 2006, so the dates become very important here. In a span of 10 years, and of course they had a lot of capital to do this, TCGA did an amazing thing: they cataloged about 34 different cancer types, all solid tumors, going through a little over 14,000 individuals to achieve that goal, and all the information was placed in the public domain. That is the good part. Here is the part of the history a lot of people do not know. When NCI was coming up with this idea, they never wanted to do genomics in isolation; they wanted to go after the proteomics space as well. So at the same time that they launched The Cancer Genome Atlas, which was mandated to go after biology, they launched a proteomics-based effort that fewer people knew about; that program is today referred to as CPTAC.

The reason they wanted to do it was quite simple. In the early 2000s, the first draft of the Human Genome Project was released. Again, it was a draft, but it really raised the interest of a lot of oncologists in the US, especially our cancer center directors, and they held a series of workshops. One of the things that came out of those workshops was that we need to begin exploring omics-based technologies for cataloging different cancer types, and they made it very clear: we want to go after genomics and proteomics. Now, why proteomics? The first reason you would understand from what was discussed in the prior days: you need an understanding of the underlying biology of the disease, and that means understanding the different pathways, not just taking your RNA-seq data and computationally predicting, from a bioinformatics perspective, what the abundances of proteins, or more specifically their modifications, would be; it is never a one-to-one correlation. So they knew they had to understand the underlying biology before any of that biology could move toward patient care. The other reason was exactly that: patient care. If you set aside the space of IO, immuno-oncology, the vast majority of our patients are still being treated with compounds that are typically chemo-based, and the vast majority of those compounds do not target DNA; there are very few that play on the idea of intercalating DNA. The vast majority go after a protein. So you need to understand not just "my target binds here," but also off-target sites, all the wiring of the biology, and all the off-roads you can take.

But here is what happened around 2003. Using a mass spectrometer, a publication was released that looked at early-stage ovarian cancer.
The authors did not actually identify the proteins they were measuring; they looked at pattern recognition, and based on that they argued that simply by looking at protein patterns, ignoring the genomic information, they were able to identify early-stage ovarian cancer. I think they claimed something like 90% sensitivity with 99% specificity built in, which is incredible if you think about it, because not even the DNA diagnostic space had that. Well, it turned out to be too good to be true; there were errors at all levels of that study. So what NCI decided to do, when it came to proteomics, was not to move forward with biology in 2006 when this program was created; that was taken off the table. Instead they wanted to go after the analytics and determine: can you standardize these powerful next-generation methodologies, predominantly mass spectrometry? If you can, then you come back to our board and give us the confidence that we can trust the measurements; in other words, what you are measuring represents the biology you are going after and is not attributable to an artifact of the way the sample was collected or processed. That is very important. Everybody is going to measure something, but you have to ask yourself whether what you are measuring represents the biology of the disease state or is an artifact, because if it is an artifact, it will most likely never go toward patient care. And if you could do that, then you could go after biology.

So for the very first five years we had to standardize as much as we could. I am not going to go through all the science, but here is what we ended up doing. We carved up the space of proteomics exactly as you do in genomics: in genomics, you first do a comprehensive characterization, and once you identify what you want, you develop targeted panels, the kind of panels that today drive a lot of our patient care, especially within the clinical trial space. When it came to proteomics, we took a very similar approach. If you do a deep dive, a lot of people call that shotgun proteomics; I am not a fan of that terminology, so I call it deep comprehensive coverage: you are trying to measure as many things as you can. There we showed that if you distribute a standard operating procedure among laboratories, you get very good concordance, with CVs typically less than 15% and sometimes even less than 10%, which is very good. Then we wanted to explore the space of targeted mass spectrometry, because once you have identified the large landscape you do not want to do the comprehensive approach all the time: it is high cost, low throughput, and requires a lot of sample. You want something very locked down, and for that you typically want a targeted assay. In that space we looked at what is now referred to as multiple reaction monitoring; people have different ways of phrasing this, and we never invented it. It has existed in clinical labs for 30 years, where it is used for the measurement of small molecules. We wanted to ask: if you roll it up to a peptide, can you use it in that space, is it reproducible, and more importantly, can you transfer the technology across laboratories and get very good measurements? So here is what we ended up doing: we looked at multiple reaction monitoring.
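To make the reproducibility metric concrete, here is a minimal sketch, assuming hypothetical peak-area values rather than any actual CPTAC data, of how within-lab and across-lab coefficients of variation (CVs) might be computed for a peptide measured under a shared standard operating procedure.

```python
import statistics

# Hypothetical peak areas (arbitrary units) for one peptide measured in
# replicate at three laboratories following the same SOP. Values are
# illustrative only, not CPTAC data.
measurements = {
    "lab_A": [10.2, 9.8, 10.5, 10.1],
    "lab_B": [9.6, 10.0, 9.9, 10.3],
    "lab_C": [10.4, 10.7, 9.9, 10.2],
}

def cv_percent(values):
    """Coefficient of variation = stdev / mean, expressed as a percentage."""
    return 100.0 * statistics.stdev(values) / statistics.mean(values)

# Within-lab CVs: how reproducible is each site on its own?
for lab, values in measurements.items():
    print(f"{lab}: CV = {cv_percent(values):.1f}%")

# Across-lab CV: pool all replicates to ask whether the SOP transfers
# between sites (the studies described above looked for CVs under ~15%).
pooled = [v for values in measurements.values() for v in values]
print(f"pooled across labs: CV = {cv_percent(pooled):.1f}%")
```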
We did a series of round-robin studies. One involved laboratories within the US, and we got very good results. Then we did an international study, with labs on the east coast and west coast of the US and a lab in Asia, and again we got very good results. People were talking about Skyline a couple of days ago; Skyline is actually a product that came from one of our laboratories when we were creating this program. It is a great little tool, and it shows how, from basic science, you can get a computational product that is now used broadly by the research community.

Then we started to ask: what if you could take your technology and potentially move it a little further toward regulatory approval? Ultimately that is the goal: you want to put it in the clinical laboratory and hopefully use the information to go back toward patient care. In the US that means the Food and Drug Administration. Typically, to get a device cleared, for example as an IVDMIA, you go to the FDA, and there are two stages behind it. The very first one is what we refer to as a 510(k) document. What happened in the past was that a manufacturer would submit it, it would get all marked up by the Food and Drug Administration and given back to the submitter, but the submitter never wants to release it. We were very interested in releasing all of that information to the public. So we held a workshop with the regulatory agency, and we made up mock data, but we did not make up the analytical workflows. The beauty of that was that it allowed us to submit all our data to the regulatory agency as an official filing; they marked it up as they would for any device manufacturer, but because we were the submitter and the data were mock data, we were able to take the document and publish it in the public domain. We actually got it published in Clinical Chemistry, because we partnered with the American Association for Clinical Chemistry in the United States.

Another thing we realized early on is that a lot of the commercial-grade reagents out there in the community were not up to the standards we felt they should be, so we have worked with the commercial sector to raise the quality of the products they release. The other issue is that a lot of people say, "I have developed a targeted assay." I will be honest: after a while, a lot of us did not know what that even meant, because people develop assays and you find out they mean they have developed a theoretical assay, or an assay run in buffer. The last I checked, if you draw blood or tissue from a patient, it is not theoretical, and there is no buffer flowing in that system. So we wanted to develop a clinical way of thinking about it, and that is a fit-for-purpose set of criteria. We did that, and in a very simplistic manner you can see it as follows: we developed tiers, Tier 1, Tier 2, and Tier 3. Tier 1 is basically a clinical-grade assay; we do not do that within our program. Tier 3 involves less analytical rigor in the criteria you have to submit. Tier 2 is a nice sweet spot, and everything within the CPTAC program adheres to it. What is quite nice is that this was ultimately picked up by Molecular & Cellular Proteomics as a journal and also by the international community.
So any time you now submit to this journal and say you have developed a targeted assay, you will have to describe your assay according to one of these analytical tiers.

With this, and we are now at a five-year window at this point, we went back to the board of NCI. We had demonstrated that we were able to get a very good analytical understanding of these technologies, predominantly mass spectrometry, and we got approval to move to the next stage. That next stage was interesting: we wanted to explore, as a pilot, going after biology, and the biology we wanted to go after was specifically The Cancer Genome Atlas, because they had started on biology five years before us; we were five years behind. The way we phrased it to the board was: we want the exact tumor that went from a patient to The Cancer Genome Atlas and was comprehensively characterized, and we will take it and put a comprehensive proteomic characterization right on top of it. Ultimately we are trying to find out whether you are able to identify additional biology that is either difficult to obtain or simply not feasible through genomics. Think about it: if what comes out of that finding is "I can confirm what my genomic colleagues just found," it is going to be very difficult to convince people that proteomics has a role, because proteomics costs more, is lower throughput, and requires a higher amount of sample input. So that was the goal: can you find additional biology, pure and simple?

Here is what we ended up doing. We went after three cancer types from TCGA: colorectal cancer, ovarian cancer, and breast cancer, on average about a hundred individuals for each of our studies. Suffice it to say, here are the overall highlights: in every one of these we found new biology. Here is a little example of what I mean by finding additional biology. If you look at the ovarian study, this is just one slide that comes out of that paper. In The Cancer Genome Atlas, just shy of 500 patients were cataloged to arrive at the observations for ovarian cancer, and a whole series of different analytical approaches were applied to the data sets. What our investigators were interested in was whether, if you look at the proteomic landscape, you can tease out two features associated with ovarian cancer. One is overall survival; they wanted to find out whether you could separate short- versus long-term survival, less than three years versus more than five years. At the same time, they were interested in homologous recombination deficiency, or "BRCAness" as it is commonly referred to. What they ended up doing was the following: out of the 500, we took just shy of 200 of those samples and distributed them to two laboratories, which were blinded to what the samples were, and a whole series of bioinformatics analyses were performed on the information. One of the things they did was consensus clustering, analogous to what is done at the RNA-seq level, and the question was: if you look at the information at the protein landscape, at protein abundance, is what you get going to be different, or does it simply confirm what was done at the transcriptomic level? Here is what they found: not only do you confirm it, but you are also able to infer additional biology. The four initial subtypes you get at the transcriptomic level roll up nicely to the protein level, but in addition they identified an additional subtype.
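As a rough illustration of what a consensus-clustering analysis looks like in practice, here is a minimal sketch on synthetic data; the resampling fraction, cluster count, and matrix sizes are arbitrary choices for illustration, not the parameters used in the actual ovarian study.

```python
import numpy as np
from sklearn.cluster import KMeans

# Minimal consensus-clustering sketch on a synthetic protein-abundance
# matrix (samples x proteins). The data here are random; in the real
# studies the input would be a quantitative proteome for the tumor cohort.
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 500))          # 60 samples, 500 proteins

n_samples, k, n_iter = X.shape[0], 5, 100
co_cluster = np.zeros((n_samples, n_samples))
counts = np.zeros((n_samples, n_samples))

for _ in range(n_iter):
    # Resample 80% of the tumors, cluster them, and record which pairs
    # land in the same cluster.
    idx = rng.choice(n_samples, size=int(0.8 * n_samples), replace=False)
    labels = KMeans(n_clusters=k, n_init=10).fit_predict(X[idx])
    for a_pos, a in enumerate(idx):
        for b_pos, b in enumerate(idx):
            counts[a, b] += 1
            if labels[a_pos] == labels[b_pos]:
                co_cluster[a, b] += 1

# Consensus matrix: fraction of runs in which two samples co-cluster.
# Stable subtypes show up as blocks of high consensus values.
consensus = np.divide(co_cluster, counts,
                      out=np.zeros_like(co_cluster), where=counts > 0)
print(consensus.shape, consensus.max())
```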
That additional subtype, identified here, they referred to simply as a stromal subtype, because a lot of these proteins tend to be associated with things like angiogenesis. But again, the key interest of this study, when they got these samples, was to identify whether the protein information at the abundance level could separate out either overall survival or HRD status, and it turned out the answer was no. Protein abundance by itself, in this type of analysis, could not separate overall survival or HRD status. That was actually not bad, and here is why: the same type of analysis was performed by TCGA, either in their flagship study or in an additional study that TCGA did down the road, and they also could not identify those two criteria.

Now here is why it gets interesting. These investigators had an advantage: they had genomics-based data from TCGA, they had our protein information, and they had modified proteins at the same time. Rather than asking these questions from a gene-based way of thinking, they wanted to roll the information up into biological pathways. So they took all the data and plugged it into the NCI Pathway Interaction Database, and they identified just over 200 signaling pathways. Focusing just on the feature of overall survival, they asked a simple question: can I use the information, now looking at cellular pathways, to separate short- versus long-term survival? Here is what you get. Looking at protein abundance, five pathways rise to the top out of those 200-plus. If you normalize against abundance and now look at phosphorylation, an additional five pathways become apparent, with nice crosstalk among them through one of these growth factor receptor pathways. And because we had TCGA data from the same tumors, we also analyzed them at the RNA-seq level, and a different pathway came up. Now you can begin to see what started happening to our program. In other words, if you were to perform an experiment looking only at protein abundance and stop there, or only at phosphorylation and nothing else, or only at genomics, you would most likely be looking at an incomplete picture of the underlying biology in this study. That became a real turning point for us. At the end of the day, what we learned was that, as these technologies mature, if you have the opportunity to perform comprehensive genomics together with proteomics, blending these worlds is most likely going to give you a better understanding not only of the underlying biology of the disease but, we hope, of biology that could potentially translate toward patient care.

In addition to developing very detailed pathway-based maps, as shown here, you can also begin to tease out the specific features people like to look at. For example, for the growth factor receptor pathway, a very commonly examined transcription factor turns out to be STAT5A, and by rolling both sets of information together, what we showed, if it is not obvious here, was that at the transcriptomic level, looking at overall survival, there is not much change; at the protein abundance level it resembles the transcriptomic level, maybe a slight bump, nothing really significant; but you really saw a huge increase at the level of phosphorylation.
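To illustrate the kind of pathway-level rollup described above, here is a small, hypothetical sketch: the pathway membership, the gene names other than STAT5A, and all values are invented, and the real analysis used the NCI Pathway Interaction Database rather than this toy dictionary.

```python
import numpy as np
from scipy import stats

# Toy illustration of rolling molecular data up to pathway scores and
# testing association with short versus long-term survival. Everything
# here (gene lists, pathway membership, values) is invented.
rng = np.random.default_rng(1)
genes = ["STAT5A", "GENE_B", "GENE_C", "GENE_D"]
protein = rng.normal(size=(40, len(genes)))               # log-scale protein abundance
phospho = protein + rng.normal(0.3, 1.0, protein.shape)   # log-scale phosphosite signal
survival_group = np.array([0] * 20 + [1] * 20)            # 0 = short term, 1 = long term

# Normalize phosphorylation against protein abundance so that a "phospho"
# change is not simply a change in total protein.
phospho_norm = phospho - protein

pathway = {"toy_growth_factor_pathway": ["STAT5A", "GENE_B"]}
cols = [genes.index(g) for g in pathway["toy_growth_factor_pathway"]]

def pathway_score(matrix):
    """Mean z-score of the pathway's member genes, one score per sample."""
    z = stats.zscore(matrix, axis=0)
    return z[:, cols].mean(axis=1)

for name, data in [("protein abundance", protein), ("phospho (normalized)", phospho_norm)]:
    score = pathway_score(data)
    t, p = stats.ttest_ind(score[survival_group == 0], score[survival_group == 1])
    print(f"{name}: p = {p:.3f} for short vs long-term survival")
```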
So again, three cancer types, and we saw similar observations in all of them. Here is what started to happen next, and I have seen this question being asked over the past couple of days as well. We had standardization in the first five years, the next five years, which we just wrapped up, focused on teasing out biology, and we had to go back to our board. What people kept asking is the same thing people have been asking for the past two days: so which one is better? Should I only do genomics, or should I only do proteomics? Which one is better between the two? The way I viewed it was: take yourself back to a biochemistry textbook. The first thing you learn is that everything relates to everything else, and if you get a good, comprehensive, systems-level view of that biology, hopefully it is going to be more representative of the disease state itself. So for us the answer became: no, I seriously doubt that, if you do not yet understand the biology of what you are going after, you should want to pursue only one of these omics now that the technologies have become quite mature.

Here is why, and this is the same argument I made to the board about four years ago. Look at The Cancer Genome Atlas. Again, I am a huge fan of this program simply for what it achieved in a 10-year window: they went after 34 cancer types and just over 14,000 individuals, and in the process they found a lot of interesting biology. You cannot put clinical context behind it, because the samples were never collected with a clinical question in mind, but it is nevertheless a very good resource that has been given to the public at large. Within that, they also identified a whole series of actionable mutations, and together with some of our small molecules, that is driving a lot of our precision oncology trials. That is the good news. Now look at the other side of the story, which is what we are learning four years down the road, running a lot of these precision oncology trials. A lot of the patients whose tumors had these actionable mutations, for which we built all these GMP facilities to develop small molecules, are really not responding long term to the therapy they are being administered. If they do respond, it is short term, and a lot of them develop toxicity; they have to be taken off one treatment arm and quickly moved to another. Why? We have no idea; that is the bottom line. For me, that says there is still a tremendous amount of missing biology when you focus strictly on a one-omic approach.

Now flip the coin and look at what is going on from the therapeutic perspective. There is a nice paper people can look up by a colleague named Tito Fojo, who used to be at NCI and has now moved to New York City. He did an analysis of solid tumors, and what Tito did was quite savvy: he went into the public domain and said, look, take the first main precision oncology drugs that came out, Gleevec along with Herceptin in the early 2000s, and what has transpired over the past 15 years; there are now over 70 of these drugs. If you look at the drugs across all the different cancer types in which they are being used, either as single agents or in combination, what do you find on average?
The two main criteria people look at are overall survival and progression-free survival. If you exclude the exceptional responders, obviously, what you find is that for all these therapies, on average, for both of these metrics, the benefit is typically no more than three months. That played a big role in the way CPTAC evolved into its current round. We still go after biology, as we did in the prior program, but now we are slowly trying to move into the translational space.

So this is CPTAC today. CPTAC is still responsible for deep, comprehensive genomic characterization along with proteomic characterization for five additional cancer types, and we put all the information in the public domain because we see it as pre-competitive. At the same time, for the very first time, the National Cancer Institute has partnered a proteomics laboratory with an ongoing precision oncology, typically genomically driven, NCI-sponsored clinical trial. What is interesting there is that the information is not going to go back to a tumor board to figure out exactly what treatment arm or what therapy to administer to a patient. Instead, the information is going to be used in a reverse-engineering manner. You will be able to get samples from these trials, which are so well controlled that the amount of clinical inference you are able to pull out of them is tremendous, and you will get pre-, during-, and post-treatment samples. What we hope to learn from that program is: if an individual did not respond the way we think they should have responded based on the genomic information, can we identify the biology at the root cause of that by looking at the protein landscape of those subjects? If that turns out to be very revealing, my goal is that in the next iteration we combine those two worlds together and go directly to our tumor boards.

Now, in terms of how old the current CPTAC program is, it is actually not that old. The comprehensive characterization is now two years old as a program, or two years young as I like to say, because I have reached my middle-aged crisis and do not like the word old anymore, and the translational arm is one year young. So what have we done over the past two years, because that is really the window the program has been around? These programs are very complex; the reality is you cannot just get something off the ground and expect it to work, you have to build your infrastructure. The first component we launched was the characterization component. The first thing we realized was that we had three main data production facilities, or sites, as we call them, and we tried to standardize, or harmonize, as best we could the analytical workflows by which they would produce those data sets. That became very important for us and took about a 12-month cycle. At the same time, we also released an additional three data sets to the public, a continuation of the last program, but these are freshly collected samples that have been optimized for both comprehensive genomic characterization and comprehensive proteomic characterization, for colorectal cancer, breast cancer, and ovarian cancer. In the second year of our program, we officially launched our translational arms, which are partnerships with our clinical trials,
and at the same time we continued the brute-force characterization arm and released an additional four data sets to the public; in fact, all of them came out in the fall of this calendar year. We released colorectal cancer, we released endometrial cancer, and we released two additional pilot studies, one focusing on 30-year-old samples, to understand the stability of these banked materials, and the other a cell line study, and we hope to release another one in the next several weeks.

Now, I talk a lot about how we give all this information to the public, and the other question I get all the time is: okay, you give all this stuff to the public, but is it being used? It is like building a business: if you build a business and nobody comes to your store and uses your products, your store typically will not stay in business very long. So I am always paranoid: are people going to use these materials? I would argue that giving away your data and everything you find, in a pre-competitive manner, is truly advantageous, not only for your own program but for the globe as a whole, for three basic reasons. First, if you give away the raw material, the data sets, reagents, and standard operating procedures, you stimulate outside individuals who do not have wet laboratories, computational scientists, to reanalyze your data sets and hopefully develop new hypotheses, pursuing science in a way you could not have figured out a couple of years earlier. Second, if you take your raw ingredients and work with industry to develop kits, you can further disseminate them to the public. Third, you hope that some of these kits or reagents can be put together in a way that can actually be used in a clinical setting. Let me give you an example of all three.

In terms of our data, do people use the CPTAC data sets? It turns out they do, and it is very simple to get analytic metrics on it. Our program has about 10 terabytes worth of raw and processed data files available to the public. As of today, we know that our data are being downloaded all over the world, in just over 130 countries, and from that small 10 terabytes of raw files, the downloads have now almost reached the equivalent of 300 terabytes of our data sets. In terms of the other components I talked about, we also give away the targeted assays we develop. We have a portal where we give away all the parameters behind those assays; we currently have just over 1,500 of these fit-for-purpose assays. Do people go to our website? Yes; it turns out that on a monthly basis over 8,000 people go to our website and grab whatever information they want, hopefully conducting studies in their own laboratories, and those downloads come from almost 180 countries. The vast majority of the assays do come from CPTAC, but as we develop these analytical criteria for the public, we are starting to allow outside investigators to deposit their own assays in our portal. Now, some of these assays require a higher level of sensitivity if you want to measure endogenous levels in individuals, so for that we develop reagents, specifically antibodies. We have almost 500 monoclonal antibodies that we have developed and fully characterized, and we give away all the characterization to the public,
and we distribute them through different arms: one distribution arm is at very low cost through an academic model, and the other is through industry, where we have been able to sell these units. We have now sold almost 4,000 units of our antibodies, which is really good for this small program out of the National Cancer Institute. And of course we do not just do proteomics in isolation; we do genomics and we do imaging, so all the imaging that comes from the histopathology lab or from the radiology lab, we give away into the public domain as well. It is not just proteomics: we do genomics, transcriptomics, proteomics, and imaging, and everything goes into the public domain.

Another great example is a recent study. In fact, I got this paper from the director of my institute a couple of weeks ago, and his comment was, "Have you seen this?" It was flattering that somebody else saw it before I did, and here is why it is a neat little study. Our program really was not looking at neoantigens; neoantigen is a hot term that people use now, basically referring to mutated components. This study came out in Cancer Cell in late summer, and they looked at neoantigens using publicly accessible data sets. Obviously, the one people typically think about is The Cancer Genome Atlas, but the part I liked was that, if you look at the image on the front cover, the data sets they pulled to conduct their analysis actually come from the US-based CPTAC program. Why? Only because we placed them in the public domain. So there is a great example of how, by giving away the information, even though we never explored neoantigens within our own data sets, another investigator group was able to do it for us, and again that further stimulates the science world.

Now, those are raw ingredients; what about working with industry to develop small kits? One of our colleagues is in Canada, MRM Proteomics. MRM Proteomics develops kits, these are targeted assays, and one of the things they wanted to know was how to differentiate themselves from other manufacturers. Our comment was: you might want to look at our analytical criteria, and if you are able to adhere to them, we will host your assays on our portal; that further drives traffic toward your company, and at the same time you put out a kit with a higher level of standardization than was typically out there in the research landscape. That is exactly what they did. So now, when they put out kits, whether they run them as an in-house service or you purchase the kit and run it within your core facility, these are all research use only, they will basically tell you that they adhere to the CPTAC guidelines for their analytical kits.

Now, that is still the research space. What about the translational space and developing these targeted assays there? Here is a great little study that came out a couple of weeks ago, a partnership between one of our laboratories on the west coast and AstraZeneca, and it is a great example of how proteomics helps the therapeutic side of the landscape. What they were looking at is two compounds, kinase inhibitors, tied to pathways, related to ataxia telangiectasia, that have a lot of affiliation with the DNA damage response. This investigator, Amanda Paulovich at the Fred Hutch, developed targeted assays
that look at the DNA damage response. That was a huge advantage for AstraZeneca, and in the partnership, what they ended up doing was identifying a marker, a pharmacodynamic (PD) marker. That PD marker helped AstraZeneca move these two compounds from a phase I study into a phase II, where it is being used to determine the dose that is going to be administered to these individuals.

Now, that is still the translational space; can you get into a clinical environment? We have actually played in that space, and here is one example. A lot of people try to find new biomarkers, but that is very complicated, because you are trying to figure out new biology, and believe me, new biology toward patient care takes many years; that is okay, because biology is complicated. What we decided to do instead was take the analytical techniques we have developed and ask clinical laboratories: are there existing tests that are problematic and might be alleviated if you brought this orthogonal measurement into your portfolio? In this case they went after thyroglobulin. The reason they went after it is that about 20 percent of individuals tested for thyroglobulin suffer from autoantibodies, and the issue is that the autoantibodies interfere with the secondary antibody of the analyzer, so you get hook effects and you end up with false positives. To circumvent this for the 20 percent of the population missing out on a very good test, our investigators developed a targeted mass spectrometry assay dedicated to thyroglobulin itself. That test today is being used by every major clinical reference laboratory in the United States.

Now, that is still being used as a laboratory-developed test. What if you wanted to take it to the FDA and get something approved? That is a whole regulatory path, and it is an interesting space; this is what our investigators are now doing in that environment. It turns out that when you go directly to the FDA, they will say: for mass spectrometry, we do not develop the standards or tell people what to do; we look to the community to come up with a consensus document, and once one exists, that becomes the recognized process. So we asked, what is one of the communities you look to? It turns out there is an organization referred to as CLSI, the Clinical and Laboratory Standards Institute. What we ended up doing was the following: in 2016 we worked with the FDA to put on a workshop dedicated to mass spectrometry, again not mass spectrometry for metabolites, but moving it toward measurements where, in this case, the analyte is going to be a peptide. That ultimately led, in early 2018, to an existing governing body within CLSI. They have historically had a document, referred to as C62-A, for using mass spectrometry in a clinical setting for the measurement of metabolites, but not for peptides. We are now working with them to develop one dedicated to the measurement of peptides, and the goal is that hopefully in the year 2020, it apparently takes a lot of time, they will release a document dedicated to the measurement of peptides, which is basically the targeted mass spectrometry a lot of people have been referring to.
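For readers unfamiliar with how a targeted peptide assay actually reports a number, here is a minimal sketch of stable-isotope-dilution quantification, fitting a calibration curve of light-to-heavy peak-area ratios; the concentrations and peak areas are invented and do not come from the thyroglobulin assay or any CPTAC assay.

```python
import numpy as np

# Sketch of stable-isotope-dilution quantification for a surrogate
# peptide in a targeted (MRM) assay. Peak areas and concentrations are
# invented, purely for illustration.
calib_conc = np.array([0.5, 1.0, 2.0, 5.0, 10.0])       # ng/mL of spiked analyte
calib_ratio = np.array([0.11, 0.21, 0.40, 1.02, 1.98])  # light/heavy peak-area ratio

# Fit a linear response curve: ratio = slope * concentration + intercept.
slope, intercept = np.polyfit(calib_conc, calib_ratio, 1)

def quantify(light_area, heavy_area):
    """Convert a measured light/heavy peak-area ratio back to a concentration."""
    ratio = light_area / heavy_area
    return (ratio - intercept) / slope

# A hypothetical patient sample: read the concentration off the curve.
print(f"estimated concentration: {quantify(54000, 60000):.2f} ng/mL")
```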
Now, the other question I get is: are these technologies specific to cancer? It turns out they are not; the technology is disease-agnostic, and that is the beauty of it. Here is a great example. At the National Institutes of Health, I belong to the National Cancer Institute; one of my sister institutes, the National Institute of Diabetes and Digestive and Kidney Diseases, put out a funding solicitation early this calendar year. The reason I loved it was the following. They said they are going to fund laboratories in the US to develop targeted assays against, I believe, type 1 diabetes. But the interesting part is not just taking proteomics into diabetes; the part we liked the most was that they said, when you develop your targeted assays, you have to adhere to the guidelines developed by the National Cancer Institute's CPTAC program, and more importantly, you have to deposit the analytical criteria of your assays in the public domain. So it sets a precedent for other people to replicate that process.

Now, in CPTAC I honestly do not do much in the program myself; I have the pleasure of being at the National Cancer Institute and overseeing this effort. This is really a team-based program involving multiple institutions within the United States and a series of incredibly talented scientists, and it has been one of the greatest privileges I have had over the past 12 years. This program has now spawned other initiatives that blend these two worlds together.

I hope that after listening to today's lecture you will no longer ask whether to choose genomics or proteomics, and which one is better, and that you will agree that both of these technologies are good, but that a good integration, proteogenomics, can provide much more meaningful information. Dr. Rodriguez provided a very good example: if you open a biology book, what you find is the correlation that defines the complexity, so both genomics and proteomics need to be understood thoroughly so that we can address important questions in disease biology. That means we all need to focus on the new area, which is proteogenomics. I hope you also noted the pathway networking correlations in the ovarian cancer project, which show how new aspects and new information can be obtained using proteogenomic and phosphoproteomic analysis. He also provided a brief overview of the CPTAC data portal, which holds a large amount of data that has been downloaded in over 130 countries worldwide. Finally, he highlighted some facts related to targeted proteomics and how CPTAC is coming forward with guidelines to standardize these assays. In the next lecture, Dr. Henry Rodriguez will talk about other programs and initiatives, beyond CPTAC, that are generating and managing multimodal data; he will also brief you about the data commons framework in cancer research. Thank you.