Welcome to the MOOC course Introduction to Proteogenomics. In the first module, you were introduced to the latest developments in the area of genomics. The eventual goal of this course is to introduce you to the concepts of proteogenomics, but you first need to develop the concepts of genomics, then proteomics, and then try to integrate the two. So, in this module, we are going to talk about proteomic technologies: some basics, the advances in mass spectrometry, and how one can analyze the data using the latest tools available. Let us start with some basic concepts of proteomics, specifically an overview of proteomic technologies from discovery to function.
Let us first look at the taxonomy of omics. I am sure you have been hearing a variety of terms linked to the omics field. The omics field aims to look at a given system in its totality. For example, all the genes or DNA of a given system are studied under genomics, all the RNA or transcripts under transcriptomics, all the proteins under proteomics, and all the metabolites under metabolomics. If the aim is to look at all the possible genes, transcripts or proteins, then we say we are doing a global genome or global proteome analysis. But if the study is restricted to a specific cell or organelle, then that is cellular or organelle proteomics. So I hope you are now clear about the layers of information and the different omics terms.
All this started from the success of the human genome project and of genomic technologies. In the 1990s various genome projects were in progress, and between 2001 and 2003 the draft human genome maps were published. Knowing all the possible human genes for the first time was one of the major accomplishments.
This was one of the major breakthroughs, a major milestone in biology that happened in front of many of us. But when people compared a variety of model organisms, such as the fruit fly (Drosophila melanogaster), the roundworm (C. elegans), thale cress (Arabidopsis thaliana) and human (Homo sapiens), they found that the number of genes is not very different across these systems. The numbers are of the same order, somewhere around 17,000 to 20,000. So, what makes different systems so different morphologically and physiologically? Eventually the focus shifted from the genome to investigation at the protein level, the proteome.
Let us look at some basic biology, starting from the genomic DNA. After transcription the RNAs are formed. The pre-mRNA contains exons and introns, and in the process of splicing the introns are removed and the functional, mature mRNA is formed from a combination of exons. During this process from pre-mRNA to mature mRNA, a variety of combinations can happen: different exons can be joined in different combinations to give rise to different mature mRNAs, which may eventually result in different protein forms. So, from the same gene, different transcripts and different proteins can be produced. This information becomes even more complicated because each protein may also be modified by the attachment of phosphate, methyl, sugar or hydroxyl groups. Many of these modifications, known as post-translational modifications, are very relevant for the function of the protein. With this in mind, let us look at this cartoon, where the inner sphere is the genome, the blue circle shows the transcriptome formed from it, and the outer layer is the proteome, the proteins that are formed and then modified.
So, the complete set of proteins of a system is known as the proteome, and the field which studies proteins and their properties to provide a broad, integrated view of cellular processes is known as proteomics. I would like to highlight that after the genome reference maps were published, the eventual goal of the entire community was to look at all the possible proteins and to obtain evidence that these proteins do exist in human, using various experimental approaches like mass spectrometry and tissue-based microarrays. All those efforts took a lot of time, but eventually, in 2014 and 2015, the first draft human proteome maps were published. They tried to provide evidence that proteins do exist for the genes which had already been decoded. Evidence for almost 17,500 proteins was provided: first in two seminal papers based on mass spectrometry in 2014, and then in the tissue-based map of the proteome in 2015. So, this is again a major milestone: after the draft human genome maps, we now have the draft human proteome maps available.
Broadly, if we want to study the field of proteomics, you can think of two major objectives. One is to look for new discoveries or new targets in the context of a given disease, a given stress, or any comparison which you want to make. The other is to understand the function of the proteins. The first can be termed discovery or abundance based proteomics, and the second function based proteomics. Discovery based proteomics aims to find protein molecules which show a change in abundance because of a disease, a stress condition or a drug treatment, and to obtain their identification and their quantification.
Then one could look at function based proteomics, where the aim is to identify the function of unknown proteins, or of the new proteins which you have found in the discovery set. Finding where the proteins are expressed and developing assays to understand their function are the major goals of functional proteomics. If you look at the technologies used for discovery proteomics, various gel based technologies like 2D gels and DIGE, along with mass spectrometry, were quite handy for the discovery workflows. Functional information came from protein microarrays and surface plasmon resonance (SPR) based technologies. So, more gel free and more high throughput approaches were used for the functional studies.
So, one could say that there are various facets of proteomics. The field started with gel based proteomics. Then there is gel free proteomics, which is predominantly built on mass spectrometry and various label free biosensors. There is functional proteomics, where you want to study the function of a given protein, and targeted proteomics, where you want to target a specific protein or peptide sequence and validate it in a high throughput manner. I will give a very brief overview of each of these domains. Of course, there are separate courses which I have offered in the past, with full modules on each of these technologies, where you can get much more detail. But let us very briefly capture the various technologies available for studying the proteome. Gel based proteomics comes first in this slide because all of you have studied gels at some level: the SDS-PAGE gel, two dimensional gel electrophoresis, or the advanced form known as DIGE, difference gel electrophoresis.
Separating proteins on a gel to study the proteome has been done for a very long time; the electrophoresis field has existed since the 1950s. The 2D gel came into the light in 1975, when Klose and O'Farrell first showed that one could separate proteins based on two of their properties, molecular weight and isoelectric point. This cartoon shows the concept of two dimensional electrophoresis. The first dimension separation is done on an IPG strip, an immobilized pH gradient strip, by isoelectric focusing based on the pI values of the proteins. When the pH equals the pI there is no net charge, and the protein stops migrating in the electric field. This is how you separate the proteins based on their pI values, and you can choose a pH 3 to 10 or a pH 4 to 7 IPG strip as per the relevance. Once you have done the first dimension separation, you put the same strip on the SDS-PAGE gel unit and separate the proteins in the second dimension based on their molecular weight. Eventually you generate an image like the one shown in this slide, where for each protein spot you obtain both the molecular weight and the isoelectric point.
Now let us say that your goal is to do differential or abundance based proteomics, to look for new targets by comparing condition A and condition B. You do the protein extraction and solubilization, and separate the proteins based on their isoelectric point in the first dimension. After reduction and alkylation, the proteins are separated on the SDS-PAGE gel in the second dimension. You can then stain the 2D gels for visualization using either Coomassie stain or silver stain. Two gel patterns will emerge, one from condition A and one from condition B.
After analysis you can see that some spots are differentially expressed, and if those are reproducible you can excise the spots and do the protein identification using mass spectrometry. This is all a very straightforward and nice technology. But because of reproducibility issues, running artifacts, low throughput and limited protein coverage, a new technology was readily accepted, especially for quantitative proteomics: DIGE, which is difference gel electrophoresis. In this technology, let us say you have two samples, from condition A and condition B, which you want to compare. You label them with the Cy3 and Cy5 dyes. These are fluorescent cyanine dyes. In addition, you make a mixture of both samples as a reference pool and label that reference pool with the Cy2 dye. You mix all three labeled samples together, run them on one gel, and after scanning you obtain three images, for Cy2, Cy3 and Cy5. So, in this slide you have the control and treatment images, Cy3 and Cy5, and the Cy2 image of the reference pool. If you run three biological replicates, the Cy2 reference is the same pool on every gel, while the control and treatment images change. This makes the approach much more reproducible and much more quantitative. To summarize, I have shown you data from the lab where, after comparison, you can see the 3D views of the spots from a DIGE gel. The bottom panel shows the control spots and the top panel the disease spots. The three dimensional view shows something very interesting, a higher expression of this protein in the disease context.
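To make the role of the Cy2 reference pool concrete, here is a minimal sketch in Python. It assumes that each spot volume is simply divided by the Cy2 volume of the same spot on the same gel, and that the log2 ratios are then averaged across replicate gels. The function names and the spot volumes are hypothetical, chosen only for illustration; commercial DIGE software applies additional normalization steps that are not shown here.

```python
import math
from statistics import mean


def standardized_abundance(spot_volume, reference_volume):
    """Express a Cy3 or Cy5 spot volume relative to the Cy2 pooled reference."""
    return spot_volume / reference_volume


def mean_log2_fold_change(gels):
    """Average log2(treatment / control) over replicate gels.

    Each gel is a dict with the Cy2 (reference pool), Cy3 (control)
    and Cy5 (treatment) volumes measured for the same spot.
    """
    log_ratios = []
    for gel in gels:
        control = standardized_abundance(gel["cy3"], gel["cy2"])
        treatment = standardized_abundance(gel["cy5"], gel["cy2"])
        log_ratios.append(math.log2(treatment / control))
    return mean(log_ratios)


# Hypothetical volumes for one spot on three biological replicate gels.
gels = [
    {"cy2": 1000.0, "cy3": 500.0, "cy5": 2000.0},
    {"cy2": 800.0, "cy3": 400.0, "cy5": 1600.0},
    {"cy2": 1200.0, "cy3": 600.0, "cy5": 2400.0},
]

print(mean_log2_fold_change(gels))
```

Because every gel carries the same pooled reference, the standardized abundances can be compared across gels even when the absolute intensities differ from gel to gel.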
So, to summarize what we learned from gel based proteomics: it is a very handy, simple, easy to do technology. But because of reproducibility issues and gel to gel variability, the focus shifted towards quantitative proteomics based on DIGE. DIGE is a multiplexing technology which offers a much more sensitive and much more reproducible way of doing gel based proteomics. However, the major challenge of both 2D gels and DIGE gels remains that, after finding the significant protein spots, you want to identify them using a mass spectrometer. When you excise a given spot from the gel, the amount of protein present in that spot may be too low for you to reproducibly capture that information and get a proper identification. Even this 3D view of a spot looks pretty nice and biologically very relevant, but we have no idea what this protein is. Only if we can identify the protein using the mass spectrometer will we have any biological information. If the mass spectrometer fails to identify it because there are not sufficient peptides available, then all our effort has failed. Therefore, gel based proteomics is great and can do many things, but the success rate of identifying the proteins of relevance may be only 50 or 60 percent, because you cannot identify all the protein spots. So, a lot of the focus in the proteomics field shifted from gels to gel free proteomics, especially based on mass spectrometry. Let us now move to the next part, mass spectrometry based proteomics.
I must say this technology has very positively influenced the entire field of proteomics, and even metabolomics, and as a result you will see a lot of developments happening in this area. My next lecture will focus just on MS based proteomics, but let us get a very brief overview here. Mass spectrometers provide information based on the mass to charge ratio (m/z) of your peptides, to give you identification and quantification, and to achieve that the ions are separated in electric and magnetic fields. The definition is mentioned here: mass spectrometry is a technique for protein identification and analysis by the production of charged molecular species and their separation by magnetic and electric fields based on their mass to charge ratio.
The image shown here is for shotgun proteomics. You do not use gels. You directly solubilize the proteins, do the reduction and alkylation, and then treat with an enzyme like trypsin for digestion. Now all the proteins are converted to peptides. These peptides are cleaned up to remove salts and other debris, and the clean peptides are directly injected into the mass spectrometer for MS or MS/MS analysis. Here you also use chromatography, especially strong cation exchange or reverse phase, for peptide separation, and then an ionization source like electrospray ionization or MALDI to convert these peptides into gaseous ions. These are then separated based on their m/z in the mass analyzers.
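As a rough sketch of two steps just mentioned, trypsin digestion and separation by m/z, the Python example below cuts a protein sequence after lysine (K) or arginine (R), except when the next residue is proline, and computes the monoisotopic m/z of each peptide as (M + z × proton mass) / z. The cleavage rule and the residue masses are standard values; the function names and the example sequence are made up for illustration, and missed cleavages and modifications are ignored.

```python
# Monoisotopic residue masses in daltons (standard values).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056   # added once per peptide (free N- and C-termini)
PROTON = 1.00728   # mass of each charge-carrying proton


def tryptic_digest(protein):
    """Cleave after K or R unless the next residue is P (no missed cleavages)."""
    peptides, start = [], 0
    for i, residue in enumerate(protein):
        next_residue = protein[i + 1] if i + 1 < len(protein) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(protein[start:i + 1])
            start = i + 1
    if start < len(protein):
        peptides.append(protein[start:])
    return peptides


def peptide_mz(peptide, charge):
    """Monoisotopic m/z of a peptide carrying `charge` protons."""
    neutral_mass = sum(RESIDUE_MASS[aa] for aa in peptide) + WATER
    return (neutral_mass + charge * PROTON) / charge


# Hypothetical sequence, used only to show the cleavage rule.
for pep in tryptic_digest("MKAGRPEPTIDEKLS"):
    print(pep, round(peptide_mz(pep, 1), 4), round(peptide_mz(pep, 2), 4))
```

The same peptide appears at a lower m/z when it carries two protons than when it carries one, which is why the charge state has to be known before a mass can be read off a spectrum.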
The different types of mass analyzers available include the quadrupole, time of flight, ion trap and Orbitrap, and many configurations are available. One can then use MS/MS to generate the peptide spectra, search them against databases, and eventually get the hits for protein identification. That is it in a nutshell; of course, we will touch upon some of these concepts in much more detail in the next lecture. Here I am just trying to give you an overview of the field: we covered gel based proteomics, and now mass spectrometry based proteomics, especially shotgun proteomics.
If you want to do quantitative proteomics using a mass spectrometer, the way we used DIGE for the gel based approach, one of the successful examples is iTRAQ based quantitative proteomics. iTRAQ stands for isobaric tags for relative and absolute quantitation. Here you can take 4 samples or 8 samples, and similar technologies like TMT, the tandem mass tag, have come forward where you can do even 10-plex studies. These are isobaric tags, which means the overall mass added is the same in each of the 4 or 8 conditions. The iTRAQ labels have reporter groups of mass 114, 115, 116 and 117, the corresponding balancer regions are 31, 30, 29 and 28, and so the overall mass added is 145, the same in all 4 conditions. You label the peptides from the 4 different conditions, mix them all together and do the MS/MS analysis. This cartoon shows that you have 4 conditions: you do protein extraction followed by digestion, label the samples with the iTRAQ labels, combine them, do some chromatographic purification if required, and then follow with LC-MS analysis.
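The isobaric arithmetic quoted above can be checked in a few lines of Python. This is only a sketch using the nominal reporter and balancer masses from the lecture; the reporter ion intensities are hypothetical numbers, and the choice of the 114 channel as the reference is an assumption made for the example.

```python
REPORTER_MASSES = [114, 115, 116, 117]   # nominal reporter ion masses
BALANCER_MASSES = [31, 30, 29, 28]       # nominal balancer masses

# Every tag adds the same nominal mass, so the labeled peptides from the
# four conditions are indistinguishable until they are fragmented.
tag_masses = [r + b for r, b in zip(REPORTER_MASSES, BALANCER_MASSES)]
print(tag_masses)


def relative_abundance(reporter_intensities, reference_channel=114):
    """Ratio of each reporter ion intensity to the reference channel."""
    reference = reporter_intensities[reference_channel]
    return {channel: intensity / reference
            for channel, intensity in reporter_intensities.items()}


# Hypothetical reporter ion intensities from one MS/MS spectrum.
spectrum = {114: 2000.0, 115: 4000.0, 116: 1000.0, 117: 2000.0}
print(relative_abundance(spectrum))
```

After fragmentation the balancer is lost and the four reporter ions appear at different masses, so their intensity ratios give the relative amount of the peptide in each condition.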
At the MS/MS level you can see the reporter ions at 114, 115, 116 and 117, and this information is used for the quantification of the peptides. So, this is, in a nutshell, how the mass spectrometer is used directly for quantification; you are not using any gels in this approach.
The next technology coming forward is targeted proteomics, which also builds on the mass spectrometry revolution. The aim is to measure peptides directly using triple quadrupole mass spectrometers and to validate them. One very promising approach is known as SRM, selected reaction monitoring, or MRM, multiple reaction monitoring. These assays use triple quadrupole mass spectrometers. Q1, for example, is a mass filter, where you select a given peptide. You have already identified the peptide sequences of interest from the discovery workflow, and now you want to validate a specific peptide. So, you select those specific peptides, shown here in red, blue and black. For the red one, you do the fragmentation in the collision cell and then analyze its product ions. The pair of precursor ion and product ion is referred to as a transition. You monitor at least three transitions for a given peptide, as shown in these cartoons, and you look at at least three peptides for a given protein, to be very confident about your quantification. With at least three transitions per peptide and at least three peptides per protein, you can validate peptides directly using the mass spectrometer.
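The rule of thumb above, at least three transitions per peptide and at least three peptides per protein, can be written as a small check over a transition list. The Python sketch below is only an illustration: the function name, the tuple layout and the m/z values are hypothetical placeholders, not the format of Skyline or of any instrument software.

```python
from collections import defaultdict


def proteins_meeting_rule(transitions, min_transitions=3, min_peptides=3):
    """Report which proteins have enough peptides with enough transitions.

    `transitions` is a list of (protein, peptide, precursor_mz, product_mz)
    tuples; each distinct (precursor_mz, product_mz) pair is one transition.
    """
    pairs_per_peptide = defaultdict(set)
    for protein, peptide, precursor_mz, product_mz in transitions:
        pairs_per_peptide[(protein, peptide)].add((precursor_mz, product_mz))

    qualified_peptides = defaultdict(int)
    for (protein, _peptide), pairs in pairs_per_peptide.items():
        # Count the peptide only if it has enough distinct transitions.
        qualified_peptides[protein] += int(len(pairs) >= min_transitions)

    return {protein: count >= min_peptides
            for protein, count in qualified_peptides.items()}


# Placeholder assay: PROT_A has three peptides, PROT_B only two,
# and every peptide has three transitions with made-up m/z values.
transitions = []
for protein, n_peptides in [("PROT_A", 3), ("PROT_B", 2)]:
    for p in range(n_peptides):
        for t in range(3):
            transitions.append(
                (protein, f"{protein}_pep{p + 1}", 500.0 + p, 600.0 + 50 * t)
            )

print(proteins_meeting_rule(transitions))
```

In this placeholder assay only PROT_A passes, because PROT_B is covered by two peptides instead of three.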
Of course, we will have much more detail in the later lectures, as well as in some of the demo sessions, where you will be introduced to promising software like Skyline, how to analyze targeted data, and how to conduct these experiments on a triple quadrupole mass spectrometer in the lab setting. One can use this information for actual clinical work. For example, we identified some specific targets in one of our brain tumor studies, and now we are looking only at that target across a large number of patients. These are patients suffering from various grades of glioma: grade 2, grade 3 and grade 4. For this particular protein, L-lactate dehydrogenase, we measure only that peptide across all of these patients, and we see a much higher abundance of this peptide in grade 4. We saw something similar in the discovery workflow when we used the iTRAQ based approach. So, you can see that there are different technologies of a complementary nature which can be used to complement each other's information. Rather than generating antibodies separately for validation, you can use the mass spectrometer even for the validation. Once the validation looks promising on a large number of samples, you can make synthetic peptides labeled with heavy isotopes and do a more accurate quantification. Suppose you have passed that roadblock too, and you see a detectable amount and a very reproducible, accurate quantification of a given peptide from the clinical samples.
Then you can take it to the next level. You may want to develop a clinical assay directly on the mass spectrometer, or you can raise antibodies or develop other biochemical assays, much simpler assays like ELISA or Western blot, for clinical utility. So, this can be a path for taking your project of interest from discovery to validation and then on to clinical translation.
I know I am going fast; I am trying to cover a variety of fields in a nutshell. Each of these technologies requires a lot more detail to study properly. But let us at least capture the breadth of the entire field of proteomics. We started from gel based proteomics and moved to mass spectrometry, and now let us shift towards interactomics and functional proteomics technologies. A variety of technologies are being developed in the area of interactomics, where the aim is to look at how a protein interacts with another protein, another biomolecule or a drug molecule, and to identify the function, or at least get a glimpse of the possible function, of an unknown protein. In this area, traditional approaches like yeast two-hybrid, affinity chromatography and immunoprecipitation were heavily used. More recently we have started using protein microarrays and label-free technologies.
This is a complex slide which shows a variety of platforms available for protein microarray based work. Let us understand it in small pieces. If you have printed antibodies on the glass slide, as in these 3 cartoons, that is known as abundance based proteomics.
So, you are trying to measure the abundance of a given protein for which you already have an antibody, and the assay can be developed with direct labeling, as a sandwich assay, or as a reverse phase protein array. In reverse phase protein arrays you take the tissue lysate or cell lysate, print it, and probe it with a given antibody to measure the abundance of its target. This is abundance based proteomics, where the aim is to measure the abundance of biomolecules with analyte specific reagents.
You can also look at the function based protein microarray approaches, where the aim is to study biochemical properties or protein interactions. For these, either you have purified a protein and printed it on the chip, or you know specific peptide sequences and print those on the chip, or you take the cDNA directly. In that last case you have not purified a protein; you print the DNA or cDNA on the chip, add an in vitro transcription and translation mix on top of it, and synthesize the protein on the chip. This field is known as cell-free expression microarrays or cell-free based proteomics. Technologies like the nucleic acid programmable protein array (NAPPA) or the multiple spotting technique are promising approaches for cell-free expression based microarrays. All four of the platforms I have shown are possible ways to look at the function of proteins. So, there are a variety of platforms available for doing proteomics and interactomics using microarray based workflows.
Another technology comes in if you want more quantitative information on biomolecular interactions and do not want to label the proteins with any fluorophore.
So, here your aim is to look at the biomolecules in their native state, in a label free manner. Surface plasmon resonance, or SPR, is one of the promising technologies for this. It measures the change in percentage reflectivity. When two biomolecules bind, the refractive index of the medium changes, and you see a change in the percentage reflectivity. From that you can tell whether binding is happening, or whether only very weak binding is happening, and this is measured in the sensorgram, which I will show in the next slide. This cartoon shows that you have a gold slide and a light source, and you have immobilized, for example, some antibodies of interest. You pass your proteins of interest over the surface to measure whether or not they bind. If there is binding, you see the change in percentage reflectivity, and you monitor it in the sensorgram.
If you look at what you obtain from this kind of experiment, the sensorgram is a plot of response units versus time. Initially you have a straight baseline. Once binding starts happening on the chip, you see the association phase, which gives the on rate. Then you keep flowing buffer so that the bound molecules can dissociate, and that is the dissociation phase, which gives the off rate. If you want to reuse the same chip for further experiments, you can do that after regeneration, where you use mild acids to strip off the bound molecules. So, the change in the SPR signal versus time is known as the SPR sensorgram. It shows you the binding activity and provides the on rate, the off rate and the KD value of a given interaction.
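To see how the on rate, off rate and KD relate to the shape of a sensorgram, here is a minimal Python sketch. It assumes the simple 1:1 binding model, which the lecture does not name: KD = k_off / k_on, the association phase rises as R_eq × (1 − exp(−(k_on·C + k_off)·t)) with R_eq = R_max·C / (C + KD), and the dissociation phase decays as R_0 × exp(−k_off·t). All the rate constants and concentrations are hypothetical numbers.

```python
import math


def dissociation_constant(k_on, k_off):
    """KD in molar, from k_on (1/(M*s)) and k_off (1/s)."""
    return k_off / k_on


def association_response(t, conc, k_on, k_off, r_max):
    """Response at time t (s) while analyte at `conc` (M) flows over the chip."""
    kd = dissociation_constant(k_on, k_off)
    r_eq = r_max * conc / (conc + kd)          # plateau at this concentration
    return r_eq * (1.0 - math.exp(-(k_on * conc + k_off) * t))


def dissociation_response(t, r_start, k_off):
    """Response t seconds after switching to buffer, starting from r_start."""
    return r_start * math.exp(-k_off * t)


# Hypothetical interaction: k_on = 1e5 1/(M*s), k_off = 1e-3 1/s.
k_on, k_off, r_max = 1e5, 1e-3, 100.0
kd = dissociation_constant(k_on, k_off)
conc = 1e-8                                    # analyte concentration in M

end_of_injection = association_response(300.0, conc, k_on, k_off, r_max)
print(kd, end_of_injection,
      dissociation_response(600.0, end_of_injection, k_off))
```

In practice the rate constants are obtained the other way round, by fitting these curves to measured sensorgrams at several analyte concentrations.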
A new advancement in this area is SPR imaging, where the aim is to combine the microarray concept with the SPR concept. You want to study molecular interactions in a high throughput manner, like a microarray, but you also want their KD values and quantitative information, and for that SPR imaging is one of the promising approaches.
Additionally, one could look at the identification of new interactors. Let us say you have immobilized an antibody on the gold slide, you are passing cell lysate or tissue lysate over it, and you expect some unknown proteins to bind to the antibody. You now want to identify the potential interactors which are binding to this antibody of interest. That is where you need to bring in SPR-MS, a new technology approach. Can you strip off the bound analyte from the gold chip and generate a sufficient amount of peptides to run on the mass spectrometer? We have tried to optimize some of these technologies, and after multiple run cycles you can generate a detectable amount of peptide for the mass spectrometer and identify these interactors or proteins of interest.
So, finally, what we are doing is building layers of protein information using a variety of proteomic technologies: where proteins are localized, how they interact, and which substrates they act on. In a variety of ways we are trying to understand protein function, which is very complex. With this, I would like to conclude that many promising proteomic technologies are in development.
You need to be very cautious and clear about which technology platform you are going to use to address your biological question. All of these proteomic technologies have revolutionized our understanding of various biological systems. But depending on your need, you can choose a gel based proteomic technology, or go directly to mass spectrometry, or use microarrays, SPR or any other available technology to address the right biological question. So, depending on what your question is and what answer you want to obtain, you can choose the right type of complementary proteomic approach. I hope this gives you a good foundation and at least an overview of how various proteomic technologies can be used for studying any biological system. As we go along and cover this part in the workshop, we will talk more about mass spectrometry based workflows and the different data analysis tools available, which will help you get the depth of this field, especially how to use it in your own research. Thank you.