Welcome to the MOOC course on Introduction to Proteogenomics. In the last hands-on session, you were introduced to the basics of mass spectrometry data interpretation. This knowledge is essential when writing code for software that can analyze large-scale mass spectrometry-based data. In today's session, Dr. Karl Clauser will show you more complex spectra and make an effort to interpret the data manually. So let us welcome Dr. Clauser for today's session.

All right, let us take a look at the next one that we have here. For this one I am going to give you some advance warning: this is not a tryptic peptide. It comes from an immunopeptidomics experiment, in which MHC peptides, or what we call HLA (human leukocyte antigen) peptides, have been isolated. And this comes from the allele B57, okay. This particular allele has a motif where the C-terminus is usually tryptophan or phenylalanine. The amino acid residue mass for tryptophan is 186; for phenylalanine it is 147. If we add 19, we get the y1 ion mass. So y1, if it was a tryptophan, would be 205, and if it was a phenylalanine it would be 166. This is consistent here with y1 if this is tryptophan. 159 is also listed on the sheet that I gave you as an immonium ion for tryptophan. This one is 17 daltons less, so that would be y minus ammonia.

All right. We don't have a whole lot of symmetry here, so we don't readily have b/y pairs, do we? Maybe 292; I guess we do have one. Then the mass difference between these guys is 28. Mass difference between these guys is 28. So what ion type is the 904 peak, b or y? 28: b minus 28 is a. And what is the mass gap between those? 87. What amino acid is 87? Serine, right. All right, I am going to add an extra peak here. This is 1108, and that mass difference there is 204 minus 18, 186. So tryptophan is what we say is at the C-terminus.
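The arithmetic used so far (residue mass plus 19 for y1, residue mass minus 27 for the immonium ion, and a mass gap looked up against the residue table) can be written down as a minimal sketch. The function names are made up for illustration, and the table holds nominal integer masses only, which is the precision used in this session rather than what a real search engine would use.

```python
# Nominal (integer) amino acid residue masses in daltons.
RESIDUE_MASS = {
    "G": 57, "A": 71, "S": 87, "P": 97, "V": 99, "T": 101, "C": 103,
    "L": 113, "I": 113, "N": 114, "D": 115, "Q": 128, "K": 128,
    "E": 129, "M": 131, "H": 137, "F": 147, "R": 156, "Y": 163, "W": 186,
}


def y1_mass(residue):
    """Singly charged y1 ion: residue + water (18) + proton (1)."""
    return RESIDUE_MASS[residue] + 19


def immonium_mass(residue):
    """Immonium ion: residue mass minus 27 (loss of CO, plus a proton)."""
    return RESIDUE_MASS[residue] - 27


def residues_for_gap(gap, tolerance=0):
    """Single residues whose nominal mass matches a gap between two peaks."""
    return sorted(r for r, m in RESIDUE_MASS.items() if abs(m - gap) <= tolerance)


print(y1_mass("W"), y1_mass("F"))   # 205 166
print(immonium_mass("W"))           # 159
print(residues_for_gap(87))         # ['S']
print(residues_for_gap(113))        # ['I', 'L']
print(residues_for_gap(186))        # ['W']
```

At nominal mass a gap can also be filled by two residues (for example, 57 + 129 is also 186), so a single-residue lookup like this is only a first guess.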
This mass gap is 113, so that would be leucine. What can we do next? This is also 28, right, so we are still b/a. Then if we go all the way here, that is a long distance, but that is 163. What is 163? Tyrosine. All right, what else can we do? Let's see if we can keep going. This was a b/y pair, so this should probably also be 87. So that much is what I can easily do. Let's take a look and see what the best we could expect to do was. I said YLW; this is saying YS, so that is correct, YSW. And then there are long enough distances between these peaks that we do not have individual sequence. This b/a pair in here is in between the R and the P. Some of these other ions here in the spectrum come from internal ion type fragmentation. This one right here is saying PD is at 213, and then you can go up to 376, which is adding tyrosine. Then there is also an ion that is fragmenting to give DYIS. So that is the extent of sequence that we can easily determine. I am going to go on to the next one.

So this one is a good example of why I like to start at high mass. This one is also not a tryptic peptide. I could tell by the file name, where it says HLA in it, so this is again immunopeptidomics, and the allele is B57, so it is again something that should end in tryptophan or phenylalanine, right here. Okay, wait a minute, so that is 71, 99; let us put up this 1133. What can we do? This is 101 and 38, 71, okay; 547, 53 goes to 600; this is 129. All right, so we learned last time that 205 is y1, so this is y1. So this would be b2. And is that mass difference there 18 or 28? It is 18, okay, b2. So that would be the 71 and the 101, and then we should be able to go 71 more, or not, okay.
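The internal ion masses quoted for the first of these two spectra (PD at 213, then 376 after adding tyrosine) follow from summing nominal residue masses and adding 1 for the proton. A minimal sketch of that arithmetic, with a hypothetical function name and nominal masses only:

```python
# Nominal residue masses (daltons) for the residues mentioned in the lecture.
RESIDUE_MASS = {"P": 97, "D": 115, "Y": 163, "I": 113, "S": 87}


def internal_ion_mass(fragment):
    """Nominal m/z of a singly charged b-type internal fragment.

    An internal fragment has lost both termini, so its mass is just the
    sum of its residue masses plus one proton.
    """
    return sum(RESIDUE_MASS[r] for r in fragment) + 1


print(internal_ion_mass("PD"))    # 213
print(internal_ion_mass("PDY"))   # 376
```

The same function can be applied to a longer stretch such as DYIS to predict where that internal ion would be expected.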
So you want to call that 186, right, so that would be another tryptophan. That would be 156; that would work, okay, arginine. Tryptophan, valine, D; not enough information to get anything else.

All right, so this one is another HLA peptide, but the allele is C*07:01. I forget what the C-terminus is for that, so we are going to figure it out. So where do we start here? So 1064. 120 is an immonium ion for phenylalanine; 101 is for Q or K. Phenylalanine's y1 ion should be 147 plus 19, which is 166, okay, so y1 for phenylalanine. That is a 113 gap; that's 128, yep. 65, 211, 147. Right here, right there, yep, that's 147. That's 161... no, it's 166, okay, which is plus 18, right. So again that would be the 147, F. This would be Q, L, F. And that would be y, and that would be b/a: this is 28, right, so it's b/a. All right, what else can we do here? It's again 28. All right, I'm going to stop there, and let's see what the answer says: FLQF. Okay, that's what I had, and not enough information to get anything else. The 276, y2, okay, that's here; this is the 128. These are doubly charged ions here. So this one is not fragmenting very completely.

All right, and then I'm pretty sure this is the last one. What's your first impression of that spectrum? It looks a lot more complicated than the other ones, right? Part of the reason is that it is a little bit longer, so the spacings are closer together, but there also seems to be a bunch of peaks that are close together. And if you look closely you can see that these peaks are separated by 18. So this is 18, 18, 18; that was 28. All right, that is 99; 37, 53, 129. Do you want 71? This is a tryptic peptide, so it's arginine or lysine, but we don't have it down to low mass. It's going to be lysine. So I think this is lysine, L, L. Yeah... well, I didn't intend to do that. Let's go back here. Let's keep going: F, 66, 5, 66, 7, 20. Oh, it wants to go down to that one. All right.
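The observation in this last spectrum, that groups of peaks sit 18 apart, is the kind of pattern that can be checked mechanically. The sketch below is one simple way to do it, not the method used by the software in the session; the function name, the tolerance, and the peak list are all made up for illustration.

```python
def neutral_loss_ladders(peaks, delta=18.0, tol=0.5, min_len=3):
    """Find chains of peaks m, m - delta, m - 2*delta, ... in a peak list.

    peaks   : iterable of m/z values (hypothetical input)
    delta   : spacing to look for (18 for water, 17 for ammonia)
    tol     : allowed deviation from delta, in m/z units
    min_len : minimum number of peaks for a chain to be reported
    """
    ordered = sorted(peaks, reverse=True)
    used = set()
    ladders = []
    for start in ordered:
        if start in used:
            continue
        chain = [start]
        current = start
        while True:
            # Look for an unused peak that sits one delta below the current one.
            nxt = next(
                (p for p in ordered
                 if p not in used and abs((current - p) - delta) <= tol),
                None,
            )
            if nxt is None:
                break
            chain.append(nxt)
            current = nxt
        if len(chain) >= min_len:
            ladders.append(chain)
            used.update(chain)
    return ladders


# Made-up peak list: one run of water losses plus unrelated peaks.
example_peaks = [1000.5, 982.5, 964.5, 946.5, 700.3, 672.3, 450.2]
print(neutral_loss_ladders(example_peaks))
# [[1000.5, 982.5, 964.5, 946.5]]
```

Calling the same function with `delta=17.0` would look for ammonia losses instead.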
Well, pretty close, okay. All right, that's the end of that. So hopefully what I've tried to give you is a feel for how incomplete the information is in some of these spectra. Any other questions before we adjourn? Yes? How did we know what... yes. Okay, right.

So, after I've told you this is all the information there is, and then I tell you that, magically, this is the answer, your good question would be: how do I know that, magically, it's the answer? Well, the spectrum all by itself really has not much more information than what we were able to determine. There is uncertainty in what the sequence is, but if the protein is derived from something in the human proteome, the other possibilities that the uncertainty in our interpretation would allow are not present in the proteome, and so that is why I am describing it as the answer. But let's suppose that after you had an answer, none of the possibilities are in the human proteome. Then how do you know what's right? Well, you don't. If you have a lot of spectra that you could do good de novo on and they are not in the sequence database you're looking in, maybe you're looking in the wrong place, or maybe your database is not complete. There could be mutations, there could be parts of the genome that are not a part of your database, or maybe your sample is contaminated with something else.

Now there's another aspect to this that I know but you don't: when you get this loss of water happening like this, that often happens when you have a glutamic acid at the N-terminus of the peptide. You can get losses of ammonia that look like this when you have a glutamine at the N-terminus of a peptide. So certain amino acids have chemistry associated with them that, after you see enough examples, starts to become more rule than exception. So there is chemistry behind these
things. The slashes? The slashes are just nomenclature I use to indicate what kind of fragmentation is happening between the residues. I use a red slash for a y ion and a blue slash for a b ion; a pink vertical bar means there are both. That's just me; it's not a universal thing within the field. But when you do a lot of this, it allows you to just look at the thing and know what's going on.

I hope today's hands-on session helped you to appreciate the complexity of mass spectrometry data, resulting from improper fragmentation, from the presence of amino acids like proline that create problems, or from the release of molecules such as water or ammonia due to the presence of amino acids such as glutamic acid or glutamine at the N-terminus. Building your knowledge of data interpretation by practicing more examples, and remembering the inputs provided in this session, will help you greatly. Mass spectrometry-based proteomics has really taken off. For any kind of biological problem where there is a need to look at complex proteome analysis, if you are able to interpret your mass spectra you can greatly benefit. I would just like to emphasize that sample preparation and data interpretation are the two most crucial steps. You, as a participant and a learner, can prepare your own sample and send it to a facility that has advanced mass spectrometers, such as ours at IIT Bombay, and then after running those samples you can obtain the raw data. As long as you have the ability to interpret your mass spec data, you are pretty much in a commanding position to get the best from your data set. Please do go through today's lecture, as well as the previous lecture and the hands-on sessions, in more detail. Thank you.
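The point made in the question-and-answer part of the session, that a partial de novo read becomes "the answer" only once the alternatives it leaves open are checked against the proteome, can be illustrated with a small sketch. Everything here is hypothetical: the protein names and sequences are toy data, and the only ambiguities modeled are the nominal-mass ties I/L (113) and K/Q (128).

```python
import re

# Residues that cannot be told apart at nominal mass.
AMBIGUOUS = {"L": "[IL]", "I": "[IL]", "Q": "[KQ]", "K": "[KQ]"}


def tag_to_regex(tag):
    """Turn a de novo sequence tag into a regex that allows I/L and K/Q swaps."""
    return re.compile("".join(AMBIGUOUS.get(residue, residue) for residue in tag))


def proteins_matching(tag, proteins):
    """Names of proteins whose sequence contains the tag (with ambiguities)."""
    pattern = tag_to_regex(tag)
    return [name for name, seq in proteins.items() if pattern.search(seq)]


# Toy "database"; a real search would read a FASTA file of the proteome.
toy_proteome = {
    "protA": "MSTAVFIKFGH",
    "protB": "MKKLLQFAAA",
    "protC": "GGGFLQFPPP",
}

print(proteins_matching("FLQF", toy_proteome))   # ['protA', 'protC']
```

If the list comes back empty for many well-read spectra, that is the situation described in the session: the database may be incomplete, or the sample may contain something that is not in it.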