Welcome to the MOOC course on Introduction to Proteogenomics. So far we have gone through protein quantification, running the SDS-PAGE gel, sample clean-up, and peptide quantification. Now you are ready to inject the sample into the LC, liquid chromatography, followed by the MS analysis. So, now comes the liquid chromatography. Here we are using a reverse-phase column with C18 material; the peptides bind to the column and you then elute them based on their hydrophobic or hydrophilic properties. So, you will use a gradient of acetonitrile from 5% to 80%, or you can go even up to 90%, with 0.1% formic acid. I briefed you on this concept in the previous lecture, so I am not repeating it, but you need to pay attention to the parameters that give the best gradient for the liquid chromatography. What is shown on the screen is again a refresher: you will use different parameters depending on the kind of sample you have, and you would like to achieve a good Gaussian distribution of the peptides eluting from the column. You would like to see peptides start eluting from the column fairly soon, maybe after 5 to 10 minutes; then, as you increase the acetonitrile gradient, more and more peptides come off the column; and eventually, once you have reached the saturation level, all the peptides are off the column and you wash and re-equilibrate it so that it is ready for the next injection. That ideal setup should give you a Gaussian distribution with good intensity for all the peptides. One needs to play with these parameters and work on each of them, and we will talk more about them in the lab session.
Hello, so here I am going to explain about mass spectrometry.
So basically there are two main components: one is liquid chromatography and the other is mass spectrometry. I am going to explain liquid chromatography first. Here you can see there are two main solvents: solvent A, which has 0.1 percent formic acid, and solvent B, which is 80 percent ACN (acetonitrile) with 0.1 percent formic acid. Now, if you look at this screen, you can see pump A and pump B, which regulate the flow of solvent A and solvent B. There is another pump, pump S, which controls the sample uptake, that is, how much of the sample is injected. Here I am ejecting the tray where you keep your samples; you can see this plate, and this is the vial in which you keep your sample. Generally we load around 1 microgram, and accordingly you can calculate the volume you have to inject into the mass spectrometer.
Now you are familiar with liquid chromatography: how the nanoLC can be used, how the different parameters and pumps are important, and why keeping a watch on the pressure is very crucial. Now you are ready for injection with electrospray ionization. This is a very crucial part, because now all your peptides are converted to gaseous, ionized forms and they have to move inside the mass analyzer for further analysis. So the various settings, and again the voltage, are important criteria, and you are going to learn more about them in the lab session. But briefly refreshing you here again: as all these ions are coming, your major effort is to move most of them inside the mass analyzer. That is where parameters such as the pressure and the charge are very crucial, and you need to make sure that most of the ions actually go inside the mass analyzer, otherwise the proteome coverage will be affected.
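The loading calculation mentioned in the LC demonstration (about 1 microgram on column) can be sketched in a few lines. This is only an illustration: the function names and the example peptide concentration are hypothetical, and the acetonitrile helper assumes that solvent A contains no acetonitrile, which the lecture does not state explicitly.

```python
def injection_volume_ul(target_load_ug: float, conc_ug_per_ul: float) -> float:
    """Volume (in microlitres) that delivers the target peptide load on column."""
    if conc_ug_per_ul <= 0:
        raise ValueError("peptide concentration must be positive")
    return target_load_ug / conc_ug_per_ul


def acetonitrile_percent(percent_b: float, acn_in_b: float = 80.0) -> float:
    """Acetonitrile content of the mixed flow.

    Assumption: solvent A contributes no acetonitrile, so only the
    fraction of solvent B (80% ACN in the lecture) matters.
    """
    if not 0.0 <= percent_b <= 100.0:
        raise ValueError("percent B must be between 0 and 100")
    return percent_b * acn_in_b / 100.0


# Hypothetical sample: peptides quantified at 0.5 ug/uL, 1 ug load wanted.
volume = injection_volume_ul(1.0, 0.5)
print(f"inject {volume:.1f} uL")
print(f"50% B corresponds to {acetonitrile_percent(50.0):.0f}% ACN")
```

The same arithmetic is usually done by hand at the bench; the point is only that the injected volume follows directly from the peptide quantification step.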
So next, let us assume that you have done a good electrospray ionization; then you are ready to separate these ions inside the mass analyzer. There can be different types of mass analyzers and different types of mass spectrometer configurations. For example, we can have ESI-Q-TOF, a popular configuration, or an Orbitrap. In this case we are going to talk about Orbitrap Fusion technology, which is a tribrid mass spectrometer, so let us have a laboratory session on using the Orbitrap Fusion and the different parameters for MS and MS/MS analysis.
Once you have kept your sample in the nanoLC, you will monitor the run through the software. Your sample will go through this tube, and this is the first column, which is the pre-column, where your sample gets cleaned up, basically desalted; your peptides will bind to this column and the waste will go through this tube into a waste beaker. Once your sample has been cleaned in the pre-column, it passes through the second column, which is the analytical column, where the peptides start to fractionate: the hydrophilic peptides elute first, and then the hydrophobic peptides follow. Once your sample starts to elute from the tip of the column, which you can see here, it gets ionized. Here we are applying a voltage of around 2.2 kilovolts, so the eluting peptides become highly charged. Now I am going to explain the parameters for MS. So, this is MS1. These are the parameters for MS, where I have kept the Orbitrap resolution at around 60,000 and the scan range at 375 to 1700 m/z, which is generally an optimized range for peptides. For the RF lens I have taken 16.
The AGC target is automatic gain control, that is, how many ions you want to accumulate, which you have to define; here I am taking around 4.0 × 10^5. The maximum injection time is 50 milliseconds, and the polarity is positive because our peptides are ionized and positively charged. Under intensity you have to define the intensity threshold, which is 5.0 × 10^3. Now, peptides generally carry charge, so I have to define the range of charge states. Here you can see I am taking 2 to 6: it will not take singly charged ions or ions with a charge above 6, it will consider only peptides with charge 2 to 6. Then I am defining the dynamic exclusion, which is 40 seconds here, and the mass tolerance should be 10 ppm. And these are the parameters for MS/MS. Here I am keeping the isolation window at 1.2. The MS1 scan is done on the intact peptides; now those precursors have to be fragmented, and for that I am using the HCD cell, higher-energy collisional dissociation. The collision energy mode is set to fixed, and this is the energy I am applying to fragment our peptides. The detector here is the Orbitrap, the resolution for MS/MS is 15,000, and the first mass to be detected is 100. These parameters are already optimized for label-free quantitation; if you are using a different technique, for example a label-based one such as iTRAQ or TMT, the parameters will change accordingly, and you can adjust them. So these were the LC and MS parameters.
So, now you are familiar with how to use different parameters and settings for the MS and MS/MS analysis to get a good run from your sample.
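To make the precursor selection rules from the lab session concrete, here is a minimal sketch of how an intensity threshold, a charge-state range, and dynamic exclusion combine to decide whether a precursor is sent for MS/MS. It is not the instrument's actual logic: the class and function names are hypothetical, and the dynamic exclusion values (40 seconds, 10 ppm) are taken from the lecture as best they can be made out, so treat them as assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PrecursorFilter:
    """Selection settings quoted in the lab session (values are assumptions)."""
    min_intensity: float = 5.0e3   # intensity threshold
    min_charge: int = 2            # lowest charge state considered
    max_charge: int = 6            # highest charge state considered
    exclusion_s: float = 40.0      # dynamic exclusion duration in seconds
    tolerance_ppm: float = 10.0    # mass tolerance for dynamic exclusion


def passes_static_filters(f: PrecursorFilter, intensity: float, charge: int) -> bool:
    """Intensity and charge-state checks applied to every candidate precursor."""
    return intensity >= f.min_intensity and f.min_charge <= charge <= f.max_charge


class DynamicExclusion:
    """Remembers recently fragmented m/z values and skips them for a while."""

    def __init__(self, f: PrecursorFilter) -> None:
        self._f = f
        self._fragmented = []  # list of (mz, time_s) already sent to MS/MS

    def should_fragment(self, mz: float, time_s: float) -> bool:
        for old_mz, old_time in self._fragmented:
            recent = time_s - old_time <= self._f.exclusion_s
            same_mass = abs(mz - old_mz) / old_mz * 1e6 <= self._f.tolerance_ppm
            if recent and same_mass:
                return False  # still on the exclusion list
        self._fragmented.append((mz, time_s))
        return True


settings = PrecursorFilter()
exclusion = DynamicExclusion(settings)
if passes_static_filters(settings, intensity=2.0e4, charge=2):
    print("fragment:", exclusion.should_fragment(mz=650.32, time_s=120.0))
```

Dynamic exclusion is what lets the instrument spend its MS/MS time on new precursors instead of re-fragmenting the same abundant peptide across its whole elution peak.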
Now you will see the chromatograms shown here on the screen, which plot time against the intensity of the peptides. As I mentioned, you would like to see a good Gaussian distribution of the peptides coming from your sample. Now, how do you make sense of this information and find out what these proteins are? If you remember, we talked briefly about looking at b and y ions. If you walk through the entire chromatogram, you will see the pattern of the spectra, which gives you a good idea of the b ions and y ions that are generated, and you can use this information for the database search. So let us have another lab session to discuss these chromatograms and look at the data.
Till now we have seen the different liquid chromatography and mass spectrometry parameters required for a successful LC-MS experiment. As you now know, the charged species enter the mass spectrometer and get fragmented. These fragments are then detected, and with suitable software the identity of the peptide is revealed. However, just by looking at the chromatogram one can easily deduce whether the sample run was good enough to be taken forward for further analysis. Let us now look at an example of a very good chromatogram. On the screen we see a good Gaussian distribution of the peptides. This is the MS1 chromatogram, which means that all the peptides that entered the mass spectrometer as charged species were detected at the MS1 level. Then, based on the abundance of each of these peptides, they are fragmented and detected at the MS2 level. Note that most of the less abundant peptides are likely to be ignored, and only the highly abundant peptides get fragmented. The bottom panel shows the MS2 fragmentation pattern for the selected peak.
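Since the b and y ions are what the database search ultimately matches, a small sketch of how their theoretical m/z values are computed may help. This assumes singly charged fragments, unmodified residues, and standard monoisotopic residue masses; the function name and the example peptide are hypothetical and not taken from the lecture data.

```python
# Monoisotopic residue masses in daltons (standard values, unmodified residues).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
PROTON = 1.007276   # mass of a proton
WATER = 18.010565   # mass of H2O


def fragment_ions(peptide: str):
    """Return singly charged b-ion and y-ion m/z lists for a peptide.

    b ions contain the N-terminal residues; y ions contain the C-terminal
    residues plus one water. Index 0 holds b1 (or y1), index 1 holds b2, etc.
    """
    masses = [RESIDUE_MASS[aa] for aa in peptide]
    n = len(masses)
    b_ions = [sum(masses[:i]) + PROTON for i in range(1, n)]
    y_ions = [sum(masses[n - i:]) + WATER + PROTON for i in range(1, n)]
    return b_ions, y_ions


b_ions, y_ions = fragment_ions("PEPTIDE")  # hypothetical example sequence
for i, (b, y) in enumerate(zip(b_ions, y_ions), start=1):
    print(f"b{i} = {b:.4f}   y{i} = {y:.4f}")
```

A search engine compares lists like these, generated for every candidate peptide in the database, against the observed MS2 peaks.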
So, as you can see here, this is the MS2 pattern for the selected peak. You see the different fragments that have been generated, and the signal is also relatively free of noise. This helps the software interpret the data better. We now move to another chromatogram, which is not that great. If you compare this chromatogram with the one shown previously, you can very clearly see that the distribution of the peptides is not Gaussian. You also see gaps in the chromatogram, which indicate issues with the sample and with the electrospray ionization. So there are certain segments of the chromatogram where probably nothing entered the mass spectrometer, or the peptides did not ionize properly enough to be detected. If you now look at the MS2 pattern, it has significantly fewer fragments, which is a feature of a very bad chromatogram resulting from a very bad sample. This issue could be due to improper handling of the sample or to issues with the column, but this is the basic information one can deduce merely by looking at the chromatogram at the MS1 and MS2 levels. The raw data generated is then analyzed further using specific software, which can deduce the information in the chromatogram and reveal the identity of the peptides, subsequently leading to the identification of the proteins.
I hope you got a very good glimpse of the mass spectrometry based proteomics workflow. But how could one use this same workflow for a case study, for any biological problem? In this session I have invited one of my PhD students, Shulina Mukherjee, to talk about how she has used this mass spectrometry based proteomics workflow in her own research.
Very briefly, she will walk you through the steps and strategies for data analysis and, in a nutshell, how it can give you biologically meaningful insight into a clinical problem.
So now that you have an idea about the chromatogram that comes out of your mass spectrometry experiment, we will talk about how to interpret it and how to do the protein identification. You can interpret the raw data using several freely available as well as commercial software packages. One of them is MaxQuant, and another, which I will give you a little glimpse of, is Proteome Discoverer. What the software does is take the raw data, do the spectral matching and counting, and use a background database. So if your sample is of human origin you give it a Homo sapiens database, and then it annotates the peaks. While annotating samples from the raw data you can also set up a grouped analysis. For example, in this case, if you are looking at cancer samples and comparing the normal, or non-tumor, samples with the tumor samples, you can specify the different grades of the tumor: this is grade 1, this is grade 2, this is grade 3, and these are the normals. When you annotate the raw files so that you specify which group each one belongs to, the software takes this into account and gives you the protein abundance across these groups, so that you can find out whether there are dysregulated proteins that can subsequently be used for the identification of biomarkers.
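The grouped comparison described above can be illustrated with a short sketch that compares the mean abundance of one protein in each tumor grade against the normal group. This is not what Proteome Discoverer or MaxQuant does internally; the function name, the group labels, and the abundance values are all hypothetical and only show the idea of a per-group fold change.

```python
import math
from statistics import mean


def log2_fold_changes(abundance_by_group: dict, reference: str = "normal") -> dict:
    """Log2 ratio of each group's mean abundance to the reference group's mean."""
    ref_mean = mean(abundance_by_group[reference])
    return {
        group: math.log2(mean(values) / ref_mean)
        for group, values in abundance_by_group.items()
        if group != reference
    }


# Hypothetical abundances of a single protein across annotated raw files.
protein_abundance = {
    "normal": [100.0, 110.0, 90.0],
    "grade1": [150.0, 160.0, 140.0],
    "grade2": [210.0, 190.0, 200.0],
    "grade3": [410.0, 390.0, 400.0],
}

for group, lfc in log2_fold_changes(protein_abundance).items():
    print(f"{group}: log2 fold change vs normal = {lfc:+.2f}")
```

A protein whose log2 fold change rises steadily from grade 1 to grade 3 would be one candidate for the kind of dysregulated marker mentioned here; a real analysis would also need a statistical test and multiple-testing correction.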
For setting up the workflow, first of all we do the database search; as you can see here, we have used SEQUEST HT. These are the steps that are followed, and you can set the parameters for each: for example, you put in a mass tolerance of 10 ppm, then the charge states, and then the retention time. Here you can also see that the spectrum files are taken in, then they go to the spectrum selector, then to the database search, and then there is another node called Percolator. Other than these, you can use other search engines, for example Mascot in parallel with SEQUEST HT, and if there are unmatched spectra they can go on to the next search engine; in this case that is done through the spectrum confidence filter. Furthermore, you can use other software to look into the modifications present in the peptides: for example, if you want to know whether your peptide is glycosylated or phosphorylated, you can use this kind of filtering to annotate those changes in your peptide. So how does the data look? After going through a lot of filtering criteria, the data that comes out is of high confidence, because you have put on stringent filters so that the hits you get are true identifications and not false positives. Here you can see that there are different tabs; for example, this is where the protein FDR confidence level is annotated. You can see that all of the proteins identified here have high FDR confidence. The false discovery rate is a statistical value that estimates the proportion of false positive identifications among the peptide identifications, and it is also a measure of certainty for the identification, that is, how confident you are that the protein you have identified is a true match.
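Two of the quantities in this workflow, the 10 ppm mass tolerance and the false discovery rate, reduce to simple formulas. The sketch below shows them under stated assumptions: the lecture does not say how the software estimates FDR, so the target-decoy ratio used here is a commonly used approach swapped in for illustration, and all function names and numbers are hypothetical.

```python
def ppm_error(observed_mz: float, theoretical_mz: float) -> float:
    """Mass error in parts per million relative to the theoretical m/z."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def within_tolerance(observed_mz: float, theoretical_mz: float,
                     tolerance_ppm: float = 10.0) -> bool:
    """True when the observed mass falls inside the search tolerance."""
    return abs(ppm_error(observed_mz, theoretical_mz)) <= tolerance_ppm


def estimate_fdr(n_target_hits: int, n_decoy_hits: int) -> float:
    """Target-decoy FDR estimate: decoy hits divided by target hits.

    Assumption: spectra were searched against real (target) and reversed or
    shuffled (decoy) sequences, so decoy hits approximate the false positives.
    """
    if n_target_hits == 0:
        return 0.0
    return n_decoy_hits / n_target_hits


# Hypothetical numbers for illustration only.
print(within_tolerance(500.004, 500.000))   # about 8 ppm away
print(within_tolerance(500.006, 500.000))   # about 12 ppm away
print(f"estimated FDR: {estimate_fdr(1000, 10):.1%}")
```

Filtering at "high confidence" then amounts to keeping only the identifications whose score threshold gives an estimated FDR below a chosen cut-off.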
Then we also have a contaminant database, which we can plug into the workflow. The contaminant database will indicate the presence of proteins such as keratin or serum albumin, which are highly abundant and are often responsible for giving false positives or masking your actual protein of interest. As you can see here, in our data we have also got serum albumin, and the software has marked it as true. So while all the other proteins are marked false, serum albumin is marked true, and you can remove this kind of protein from your subsequent analysis. Other than that, you will also have a plethora of information, such as the unique peptides. The unique peptides again give you an idea of how confident you can be in the identification of the particular protein they have been ascribed to. For example, the first hit, neuroblast differentiation-associated protein, has 370 unique peptides, which means the software has matched 370 distinct peptides that map only to this protein and has therefore annotated it with very high confidence. Then you can also see the score, which is given by the search engine, and many other details. So you will have all this information available after you have run your raw file through the software, and the software has different tabs that you can customize and where you can set the parameters you are looking for. So basically we started from here: we took cancer samples and normal samples from a patient cohort, extracted the protein, and ran the samples in the mass spectrometer. After doing all this you have raw values and different peaks, and now you need to know what they are.
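The contaminant flag and the unique peptide count described above lend themselves to a simple post-processing step on an exported protein table. The sketch below is only an illustration: the record layout, field names, and all values except the 370 unique peptides quoted in the lecture are hypothetical, and a real export would have different column names.

```python
def remove_contaminants(proteins: list) -> list:
    """Drop every protein the contaminant database has flagged as True."""
    return [p for p in proteins if not p["contaminant"]]


def rank_by_unique_peptides(proteins: list) -> list:
    """Sort proteins so those supported by the most unique peptides come first."""
    return sorted(proteins, key=lambda p: p["unique_peptides"], reverse=True)


# Hypothetical rows from an exported protein table.
protein_table = [
    {"name": "Serum albumin", "unique_peptides": 45, "contaminant": True},
    {"name": "Neuroblast differentiation-associated protein",
     "unique_peptides": 370, "contaminant": False},
    {"name": "Example protein X", "unique_peptides": 12, "contaminant": False},
]

cleaned = rank_by_unique_peptides(remove_contaminants(protein_table))
for protein in cleaned:
    print(protein["name"], protein["unique_peptides"])
```

Removing flagged contaminants before any group comparison keeps abundant carry-over proteins from dominating the list of apparent hits.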
So you use software, as I told you, either freely available software like MaxQuant or licensed software, whichever resource is available to you, and then you use the software and its various parameters to do the data mining. After the data mining, what you are actually looking for is something like this, where you can see a clear difference between condition A and condition B. As you can see, there is a signature list of proteins that are highly abundant in condition A and a signature list of proteins that are highly abundant in condition B. These are indicative of the actual biological changes happening in the patient samples. Furthermore, you can do data curation and network analysis using several bioinformatics tools such as STRING DB, MetaboAnalyst, etc. After the whole exercise of using the mass spectrometer and the software to annotate the data, we can map the proteins onto various networks like this, and we can also see which ones classify the different grades of tumor or differ between the control and the cancer samples. For example, these are a set of markers, as you can see. This one is very high in the C sample, which is the control sample, and relatively low in the grade 2 samples. Similarly, there is a reverse trend for this protein, Q04637: it shows a sequential increase, that is, it is low in the control samples and goes higher as the tumor progresses. So using these kinds of tools you can answer a lot of biological questions.
Alright, so we started with a workflow of mass spectrometry based proteomics. We talked about how to do protein quantification and protein digestion. We also talked about peptide quantification. Then you are ready for the LC-MS/MS based analysis.
We covered the liquid chromatography and MS parameters, how the chromatograms are generated, how to interpret them, and how to review the whole data set to draw more meaningful insight from the data for clinical case studies. I hope this gives you a basic idea; of course it is not so detailed, but it is a good glimpse of the workflow involved in MS-based proteomics. A lot can be done using mass spectrometry based proteomics. We have just talked about protein identification and label-free quantification workflows. You can also think about quantitative proteomics using iTRAQ or TMT, which I talked about in the theory classes earlier, and those workflows can be very useful as well. As long as you have done the sample preparation and separation very well, you are ready for the quantitative proteomics workflows as well. Thank you.