Welcome to the MOOC course on Introduction to Proteogenomics. Today we have a guest speaker, Dr. Jochen M. Schwenk from KTH Royal Institute of Technology. Dr. Schwenk will talk to us about affinity proteomics, a field of proteome analysis based on the use of antibodies and other binding reagents as protein-specific detection probes. He will also talk about studying the human plasma proteome with affinity-based methods, which could enhance biomarker discovery, validation and translation from basic research towards clinical use. He will then talk about resources like Biobank Sweden and Atlas Antibodies. Dr. Schwenk will also talk about mass spectrometry and how it can be used to study post-translational modifications (PTMs), that is, PTM peptides in a digested sample. He will then talk about PTMScan technology, which allows identification and quantification of hundreds to thousands of even the lowest-abundance peptides and provides a more focused approach to peptide enrichment than other available strategies. So, let us welcome Dr. Schwenk for his lecture.
What I would like to do today is give you a bit of a different perspective on what we do and what we understand by doing plasma proteomics. Maybe there are some aspects of it that could be helpful for you, that provide some ideas either for collaboration or just for getting a new perspective on your projects and how to move forward. So, this is my team. We are currently about 10 people, and it is actually Kimi, the person to the left, she is half Japanese, who made this painting. She is very skilled in arts, but I think it is also a nice way of illustrating us as one group of people with the same phenotype, right? Even though we have Philippa, she is from the UK. We have Mun-Gwan, he is from Korea. We have Ragna, she is a postdoc from Germany. We are a very international group, and now we actually have a new person from Denmark.
So, it is really the mix of cultures and the mix of backgrounds that I think is important. I guess Fredrik talked to you last week about the different aspects of the Human Protein Atlas: the tissue-based atlas, the subcellular atlas as well as the pathology atlas. So, I am going to leave you with that, and I hope you still remember some elements of it. I will talk a bit more about what is actually outside of the cells: a bit about the plasma proteome as we see it, and then how to use affinity-based methods for studying the plasma proteome. I will give you some examples of how we use mass spectrometry, but the predominant part will be on looking at plasma proteins with affinity reagents. As Sanjeeva alluded to, I am the current chair of the Human Plasma Proteome Project. Whatever that means is still to be defined, but I think it is meant to be an organization that tries to give a global understanding of the initiatives that people are working on in different areas of the world, with the common feature of studying the plasma proteome. I am doing this together with Eric Deutsch, who is a famous bioinformatician from Seattle, as well as Erich Natowicz from Australia, so we really try to have this as a global initiative too. About two years ago, we published this paper in the annual special issue of JPR, and we are actually in the process of preparing a new review for the coming issue. There we basically concluded that there are about 5000 proteins that we can detect in plasma using proteomics methods, which is probably 25% of what the genome actually tells us there is. Of course, this is predominantly driven by the fact that these are the things we can measure. It doesn't mean that these are the things that are actually useful, okay?
So, given that you have new technologies such as SomaLogic, who claim that they can measure 5000 proteins, this is within the ballpark of what we see at the moment that mass spectrometry in combination with other affinity-based assays can measure. There is of course one intrinsic challenge to using, for instance, mass spectrometry, and that is that you need to have a good detection system and protocols to increase the coverage. From around 1000 proteins, which we could identify some 10 years ago, I think we have made a big step forward in detecting more. This is also shown here by the chart, this Venn diagram showing the progress. There are also interesting numbers here, highlighted in red, which are those proteins that got introduced over the recent years, and in particular, I think, you have these two gaps between 2010 and 2013. But there are also about 700 proteins which disappeared from these lists, meaning that these are proteins that have probably been misannotated, probably glycosylation forms that led to false identifications. So I guess we know that we are not yet at the end, at having the perfect system together, but I think we have a much better understanding of what the system looks like. Another challenge in mass spectrometry is of course the coverage, meaning how many proteins you can actually measure in a single experiment. This is shown here: again we have time on the x-axis, and the number of proteins identified on the y-axis. You see there is of course progress being made over the years, so that nowadays you can measure, let's say routinely, about 500 proteins in every experiment. But you can also see that there is quite a substantial span: in some studies people claim to have identified 2000, whereas most recently some studies have only measured about 100. So it's really a matter of defining what you call a protein to be.
And whether it should be identified in only one or in every sample of your measurement. That brings me to the next point, which is the concentration distribution. Here the proteins are ranked by abundance: this is the rank of abundance, and this is the number of times we actually observe the protein in the 150 studies we looked at. There is a clear correlation: proteins that are highly abundant are seen more frequently than proteins that are low abundant. Which also comes back to the point that, yes, now we can measure about 500 proteins in every study, but really the question is how many times we actually see them in every sample. This is one thing that has to do with the concentration, but it may also have to do with the variants or the isoforms of particular proteins. It was quite interesting for me to see that even for the four most frequent proteins, albumin (I guess that's to be expected), complement factor 4, alpha-2-macroglobulin and haptoglobin, you did not have all the peptides seen in all studies. This means that some peptides for albumin are so unique to some studies that they are not common and concordant with other studies. So again, here we come to the point that there is much more information in the peptides than we are, I think, currently using. The next point we need to make is about quality. I talked to some of you about the challenges of the information that you actually observe, and Sanjeeva and I discussed how important it is to have the connection with the clinician obtaining the sample, so that you have control over, or at least a better understanding of, what happened to the sample when it was taken. Because if you just think about blood, basically you can dissect it into three elements.
You have of course the cells, you have some microparticles or lipid vesicles, and then you have something which we call the cell-free component, which is serum or plasma. There are lipids that are of course also important to consider, but if you just look at the proteins, the reason why you have proteins in plasma could be that they are actively secreted into blood because of a process that is related to it. Or they can be of cellular origin, meaning either they have been leaking out when the sample was being prepared, or they have been shed from cell surfaces because a certain protease took care of it. But there is also an important element, which is that proteins could be introduced into blood because of the preparation. You even have intracellular proteins: by changing the temperature from 37 to, let's say, 23, or 28 in India, you may just induce some of the inflammatory cells to secrete cytokines because of the change of environment, and that may have nothing to do with how the system was before. So what we really advocate, in particular when you do these large-scale projects, is that of course you want to know about the patient. We call it the donor, which is more general. Of course you want to know the age, the gender and what the diagnosis at sample collection was. But you also need to think about where the sample comes from: when it was collected, the location, how long the sample has been frozen, whether it has been freeze-thawed and so forth. So it's really about the standardization of the procedures and obtaining information about the samples that you're analyzing. That, I think, is really key, and there are a lot of initiatives in Europe that are trying to understand this and trying to develop a pipeline for sample processing and sample handling.
Because if you just do plain statistics, without even thinking about proteomics, and you want to claim a significant finding: this here is a simulation where we took, I think, one of the major risk factors of cardiovascular disease, with a power of 80 percent and an alpha of 0.05. In order to measure just that one analyte, you need about 80 samples per group, so 160 samples in total. If you think about a proteomics experiment with, let's say, 1000 or 10,000 analytes, you need up to 250 or 300 samples per group. So your study, your measurement, only becomes relevant, if I may use that word, when you have 600 or more samples. This is not always possible. Such samples are not that frequent, so it's going to be extremely challenging to get up to that number. But of course, to really get an understanding of the diseases, that is one way of moving forward. There is a whole science behind sample preparation variables, or pre-analytical variables as we usually call them, where people really try to understand what the quality of a sample is. Matthias Mann's group has a recent paper on bioRxiv where they separated plasma, did different centrifugation steps, removed cells and re-spiked them, to see what the contribution of cellular contamination in blood is. I think that's an important aspect, because cellular contamination is a factor that can hardly be controlled. What we mostly work with, and this is what we're going to talk about in the second part of my talk, though I have some more slides in between, is how we use different affinity-based methods to study the plasma proteome. I think this is predominantly driven by companies nowadays selling kits. It's a bit different to what the mass spectrometry field is doing, where companies sell instruments and then it is the academic environment that has to take care of them.
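The sample-size figures quoted above (an alpha of 0.05, a power of 80 percent, about 80 samples per group for one analyte, and 250 to 300 per group for 1000 to 10,000 analytes) can be approximated with a minimal sketch. It is only an illustration, not the simulation shown in the talk: the normal approximation, the Bonferroni correction and the standardized effect size of 0.443 are all assumptions, the last one chosen simply because it gives about 80 samples per group for a single analyte.

```python
from math import ceil
from statistics import NormalDist


def samples_per_group(effect_size, alpha=0.05, power=0.80, n_tests=1):
    """Approximate samples per group for a two-sided, two-group comparison.

    Uses the normal approximation n = 2 * ((z_alpha + z_beta) / d)^2 and a
    Bonferroni-corrected alpha (alpha / n_tests). Both are assumptions made
    for illustration, not the method used in the talk.
    """
    std_normal = NormalDist()
    alpha_adjusted = alpha / n_tests
    z_alpha = std_normal.inv_cdf(1 - alpha_adjusted / 2)
    z_beta = std_normal.inv_cdf(power)
    return ceil(2 * ((z_alpha + z_beta) / effect_size) ** 2)


# Hypothetical standardized effect size (difference in means / SD).
ASSUMED_EFFECT_SIZE = 0.443

for n_analytes in (1, 1000, 10000):
    n = samples_per_group(ASSUMED_EFFECT_SIZE, n_tests=n_analytes)
    print(f"{n_analytes:>6} analytes: about {n} samples per group")
```

Under these assumptions the required group size grows from roughly 80 for one analyte to the 250 to 300 range once the test is corrected for thousands of analytes, which is the point being made about study size.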
So of course you have, I think, Biognosys or MRM Proteomics or some other companies that sell you kits, but I think there is no company that sells a kit for doing shotgun proteomics. So I guess it's a bit of a different ball game, because you have a dependency on these companies. But the interesting thing about using affinity reagents in comparison to mass spec, and here is a comparison of the proteins you can identify in mass spec as well as in immunoassays, is that a lot of the low-abundant, or annotated as low-abundant, proteins are actually measurable in immunoassays, whereas in mass spec you measure many structural elements, which may predominantly originate from cells that are in your plasma. Of course, in mass spec, and this comparison is based purely on shotgun data, you take all the information you get, whereas in affinity-based assays you preselect what you want to look for. So these are really conceptually different approaches. Another aspect I mentioned before is the use of genetic data in combination with protein data. How much information about your proteins is already given in your genome? Of course we know that this is where the basic information lies. But how much of that information is actually connected to what the proteins do in the end? We have a whole machinery between the genetic and the proteomic information. But surprisingly, and this is a study that, for instance, SomaLogic has done, there are a lot of indications that your genes, or the variants of your genes, tell a lot about the proteins that you measure in your blood sample. So if you know somebody's genotype, and if you know that genotype is linked to a higher or lower risk of a certain disease.
And you know that genotype is also linked to a so-called pQTL, a protein quantitative trait locus. Then you can say: well, that person always had a high risk of that particular disease, and the protein level was always low, and then a slight increase from a low protein level may actually mean much more than in the inverse case, where a person has a low risk but intrinsically a high level of a certain protein. So it's really important to include much more data. Proteogenomics is one of the approaches, but others are following the same way. And this is actually one of the resources built by collecting all this information, so it will be, I think, a growing part of many proteomics studies, because if you don't know why a person has a higher level, that might be one of the reasons for it. I've been involved in a couple of mass spectrometry related projects. This is one that is led by Janne Lehtiö's group. They have used their HiRIEF system, so they basically use isoelectric focusing as a pre-fractionation concept. What we show in this study is that we can detect 1000 proteins across all the 30 donors we used. Interestingly, what we have also done is compare the newborn baby to the mother, and we can see that many proteins are alike between the two, but there are some which are only seen in the baby and some which are only seen in the mother. We also see that there are some proteins that basically traverse the placenta, so they are being transferred from one location to another. This could only be done using a proteogenomics approach, where we know the variants that are expressed by the baby versus the variants that are only expressed by the mother.
So this, I think, will be a very important example, and hopefully it will be coming out in a couple of weeks' time; we are resubmitting this as we speak. What we have also done, again using the Protein Atlas as a resource: the Protein Atlas has of course produced these atlases. It has produced antibodies, but it has also produced a lot of antigens. I guess Peter Nilsson, whom some of you may have met, talks a lot about using the antigens for quality assurance of antibodies, or using these antigens for autoimmunity profiling. What we are nowadays doing is using these antigens as heavy standards for targeted mass spectrometry. The reason is that we have these constructs: here you have the endogenous protein, and then we select these unique regions we call PrESTs. All these PrESTs by default carry a tag, which we initially used for protein purification. But nowadays, and this is basically a representation of it, this is a fantastic tag for quantifying that specific standard, because it is a common tag for all the standards we use in our system. You can of course also use it for all your mass spec retention time adjustments and so forth. We have now done this for about 25,000 of these QPrESTs, as we call them. The paper is also on bioRxiv and will hopefully be coming out in a couple of weeks' time. Using this as a pipeline, we can use that information to specifically build off-the-shelf targeted proteomics assays for the proteins we are interested in. We have shown this for a couple of examples, now also in plasma, in this study that Fredrik has been heading. So the main part of my talk is about what we do as our core business, if that would be a pitch for investors. The core is that we do affinity-based plasma proteomics.
And that means that we use antibodies, or different types of affinity reagents if they are available, for doing protein profiling. We care a lot about the study design. I think this is something I touched upon before. We care a lot about antibody validation. This has been a hot topic for us, because antibodies have been criticized massively. As always in these situations, there has been some truth to it. But we believe that there are opportunities to change the perception, and in part that has to do with redefining, or explaining better, what an antibody is actually capable of. An antibody is not an off-the-shelf universal tool that will solve all your problems. An antibody is something where you need to know where to use it and what to use it for. An antibody that is good for Western blot may not work in immunohistochemistry or ELISA. That is an understanding that, unfortunately, not many people have. My lab currently runs three different technologies. We use Luminex as our go-to platform because it's open access for us. We have all the equipment. We have 10 years of experience of using it in various ways, for protein as well as autoantibody profiling. It's really easy for us to use, but in the way we use it, it has some limitations in terms of quantification and sensitivity. For about two years now we have also had Olink as a technology that we run, and we do this for different types of service projects or our own research projects. We also have an interesting technology offered by ProteinSimple, which uses microfluidics. The nice thing with this system is that it's basically almost fully automated, so you don't have any user interference, and that gives you really excellent batch-to-batch precision. It is actually a system that we think could be useful for clinicians, because they don't want to think about how to run the experiment. They just want to get the data.
We're also going to include Quanterix as a new technology, probably sometime during this year. Quanterix is basically also a bead-based ELISA, a bead-based assay system just like Luminex, but they have a different way of reading out, in that they use an enzyme to generate a fluorescent signal. They also have a different mode of counting their detection events: instead of measuring the sum of all signals that are obtained, they do what they call digital counting. So they count the number of particles which actually emit light above a certain level. That's what they usually call their digital range, which then gives them a 100-fold improved sensitivity, at the cost of using more sample and more antibodies, which is not as well communicated. So of course there is a whole portfolio of options for which technology to use. This is usually what we discuss when we have meetings with users of the facility that I'm directing. What do you want to do? What is the specificity, the cost and the number of targets? Do you want quantification? How many samples do you have available? What are the volumes, and so forth? So there is really a range of different features that you need to consider when you choose a certain method for your application. Which again is a bit of a different concept to mass spec, given that there you can probably choose many systems for many applications. Of course you need to tweak them, and some may be less suitable than others, but in theory, I guess, from any mass spec you would get some data out. I mentioned antibody validation. This is really, I think, a key challenge that we are facing. What I've been preaching for some time now, and this is my tagline in this context, is that the performance and selectivity of antibodies are application and context dependent. So it's a matter of the application, meaning Western blot or ELISA.
It's the way that you prepare your sample and the way that you prepare your assay, so that you actually get the best out of it. We've been working on using mass spec as a readout for immunocapture data. I guess this is most often done using cellular systems, where you have an antibody towards a tag, and the tag is fishing out a protein which has been introduced. People call it the CRAPome: all the proteins you identify even though they don't have anything to say. We've been putting all of this into plasma, and this is a study led by Claudia. Here we've done more than 400 IPs and built a library of data to judge whether an antibody specifically enriched a protein in plasma or not. These are the enrichment plots. To the left you have this CRAPome, the part that is commonly found in every enrichment, which may be due to proteins sticking to the beads. To the right-hand side, where we chose a Z-score of 3 as a cut-off, you see some on-target detection, and you see some co-targets, meaning proteins that are co-enriched either because they have a similar sequence or because they actually do interact, which we find very interesting. We also see off-target interactions with proteins that are more abundant than the protein that we presume the antibody would bind to. And we also have cases where we have no target, meaning there is no specific enrichment. I think this is interesting in a way, because it could mean either that the target was enriched but has not been detected in mass spec, or that the protein is simply too low abundant to reach a Z-score that is of relevance. As I mentioned, we find protein interactions interesting. Here we know that the members of this insulin-like growth factor binding protein family interact with one another, and as shown here using the STRING database, you have IGFBP2 interacting with IGF1 and IGF2.
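The way an IP is judged against the library with a Z-score cut-off of 3 can be sketched as follows. This is a minimal illustration under stated assumptions, not the pipeline used in the study: the data layout (one dictionary of log-intensities per IP), the scoring of each protein against all other IPs in the library, and all intensity values are hypothetical.

```python
from statistics import mean, stdev

Z_CUTOFF = 3.0  # cut-off mentioned in the talk


def enrichment_z_scores(library, ip_name):
    """Score each protein in one IP against the same protein in all other IPs.

    `library` maps IP name -> {protein: log-intensity}; this layout is an
    assumption made for the example.
    """
    scores = {}
    for protein, value in library[ip_name].items():
        background = [
            ip[protein]
            for name, ip in library.items()
            if name != ip_name and protein in ip
        ]
        if len(background) < 2:
            continue  # not enough background data to estimate a spread
        spread = stdev(background)
        if spread == 0:
            continue
        scores[protein] = (value - mean(background)) / spread
    return scores


def classify(scores, intended_target):
    """Report whether the intended target is enriched and what else is."""
    enriched = {p for p, z in scores.items() if z >= Z_CUTOFF}
    return {
        "on_target": intended_target in enriched,
        "other_enriched": sorted(enriched - {intended_target}),
    }


# Hypothetical library: ALB behaves like background in every IP, while the
# anti-IGFBP2 capture enriches IGFBP2 and its interactor IGF2.
library = {
    "anti_IGFBP2": {"ALB": 20.1, "IGFBP2": 14.0, "IGF2": 11.0},
    "anti_A": {"ALB": 20.0, "IGFBP2": 10.0, "IGF2": 8.0},
    "anti_B": {"ALB": 20.1, "IGFBP2": 10.2, "IGF2": 8.1},
    "anti_C": {"ALB": 19.9, "IGFBP2": 9.8, "IGF2": 7.9},
    "anti_D": {"ALB": 20.2, "IGFBP2": 10.1, "IGF2": 8.2},
}

result = classify(enrichment_z_scores(library, "anti_IGFBP2"), "IGFBP2")
print(result)
```

Proteins listed under `other_enriched` would still need to be sorted into co-targets and off-targets by looking at sequence similarity, known interactions and abundance, as described in the talk; the sketch does not attempt that step.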
And as you can see here, using three different antibodies, we can see with IGF that they actually interact. We could also claim new interactors, with BCHE as well as DERA as proteins that are relevant for these complexes. What we also do, going back to our most accessible technology, is use Luminex, and here Ragna has screened more than 200 proteins using more than 600 antibodies to find which are actually suitable sandwich pairs, so using them both for capture and detection. This is also now a paper which you can find on bioRxiv, and hopefully the reviewers will like it. We've done a long-term procedure with two screening rounds, and this is a substantial amount of work with a couple of people involved. But at the end of the day it led us to this triangular chart here, where we looked at longitudinal samples and the precision the assay provides in this context. It's a bit difficult to read, but basically we looked at what the variance of the assay is in terms of technical precision, what the variation within individuals is that we observe over time, and what the difference between the individuals themselves is. As you can see, we have a couple of nice proteins here, the ones that we highlight in green, which are those proteins that we can measure precisely and that are stable over time but vary a lot between individuals, meaning that there is probably a genetic component or some personalized component to it that makes these proteins more interesting than others. Again, we put a lot of effort into validation, and here is a correlation chart where we compare the different sandwich assay data, in this case for a protein called CCL16.
We have three different assays we developed in-house, and this is the assay offered by Olink, so we have pretty good agreement using completely different technologies: Olink is a solution-phase proximity extension assay, whereas we have a classical ELISA where you capture on a bead, you wash, and then you add your detection antibody. Our main workhorse has for many years been these antibody bead arrays, so we really use the high multiplexing capacity of the Luminex system. We have a couple of liquid handling devices to do upfront sample prep. Here the idea is that instead of using two antibodies, you basically label your sample, like you do in different types of TMT assays, and then you have a bead array which has 384 different antibodies. You fish out the proteins that you can find in the solution, and then you use the biotin label to detect whether the antibody has enriched the protein of interest. We've been working a lot on getting the data analysis and processing right, and it's done mainly by Mun-Gwan, who is a researcher in my group, so we have a pretty good idea about the data, the precision and the accuracy. These are two t-SNE plots showing basically the same data: here these are the replicates, whereas all the other data points are samples taken from the same individuals every third month over a period of one year, and you can maybe also see that the samples from each individual actually cluster together. So from our data we know that the phenotype, the plasma proteome phenotype that we measure, is constant over time. We've been involved in a couple of larger, and I think increasingly larger, projects in which we try to have multiple study sets, meaning samples coming from more than one location, to use more antibodies, to actually build our own sandwich immunoassays for those candidates we identify, and to use immunocapture mass spectrometry as a way to validate.
We still have Western blot sometimes as a go-to option, but it's actually less relevant nowadays for our approaches. We do validation of antibodies using peptide or protein arrays. Sometimes this is helpful to confirm the selectivity between different off-target candidates. And more interesting for us in the future, I think, will be to do this pQTL, the GWAS type of analysis, to understand what the genetic component is behind these studies that we're performing. So these are two of these initiatives. One is the wellness project, which is headed by Mathias Uhlén and Göran Bergström, where we've taken a hundred subjects, did all the different omics and clinical measurements, and looked at them every third month over the course of one year. I'm also part of a very large EU project with different pharma companies and clinicians from all over Europe, to do basically the same or a similar type of molecular, clinical and environmental phenotyping in the context of pre-diabetes and diabetes progression. And again, an important aspect for us are these four elements. It's the study design: how do we proceed in terms of randomization? How do we get the number of samples right? How do we do the discovery? We have to choose which are the interesting candidates, because we make that pre-selection. How do we do antibody validation, or actually build new assays for target validation? And then we go back and study new samples again to prove that our hypothesis is actually valid. So, one of the studies we've done here is on bariatric surgery. Bariatric surgery is a major type of intervention for people who are very obese and who usually are at high risk of diabetes. The idea is that this surgery induces weight loss, and along with the weight loss the patient is no longer defined as being diabetic. We want to understand whether proteins in plasma can give us an indication of whether a patient will benefit from that surgery.
There we're looking at remission, which is done using a multi-omics approach by one of the postdocs in my group, but also at how proteins change over time, pre and post surgery, because we have looked at the patients at baseline as well as following surgery. We could actually see that there are a couple of proteins that are, knowing that there is individual variance, consistently increasing post surgery. We looked at three months as a time window, because between zero and three months there are a lot of processes that overrule the phenotypes we are interested in, in particular those that are related to wound healing. So if you measure a patient the day after the surgery, a lot of what you measure is actually the patient responding to the surgery, responding on a metabolic level. It was interesting to see that some proteins also change in the opposite direction, which means that they are actually decreased in abundance. Another type of multi-omics approach we've done is in the context of unstable atherosclerosis. Here you have basically the coronary plaques that some people develop, and of course some are stable and some are unstable, which means that you might actually have a higher risk of stroke and heart attack. A group of clinicians, who had done microarray and qPCR, identified a couple of candidates which they could validate using mass spec in tissue. We then took on this type of target and could validate the same observation using the suspension bead array, the screening approach, and we also built a sandwich assay to measure that same difference in plasma samples. So we really brought this from early DNA and RNA detection down to applications in plasma. We also did a large-scale study on mammographic density.
So this is a study where, again, we're switching disease areas a bit: it is related to cancer, and in particular it is a risk factor for women in the Western world. If you lose density in your breast post-menopause, it's actually a very good protective indication, but if the stiffness of the breast stays after menopause, there is a higher risk of developing breast cancer. But nobody understands what this density is, and we tried to find, using an association study on a cohort of about 1200 women, whether we could identify features that are consistent. We found a couple of interesting proteins related to the extracellular matrix as well as to proliferation levels that could indicate that there is actually a loss or increase in density visible in the plasma proteome. Again, one aspect that we have been working on frequently is this longitudinal profiling, and here I want to bring up again something I mentioned earlier, which is how consistently you can actually measure a protein. Here we correlated the data we generated for this protein across these four visits, and we see that the measurement is pretty stable over time. And then if you compare the replicated data within the visits, you can see that here we have extremely high precision as well, meaning that this protein can be accurately measured and it's very stable over time.
For the second protein here, again we replicated this a couple of times, and you can see the precision of the measurement is very good. But if you look at the correlation of the biological variation, as we call it, where you compare the data measured at visit two versus visit one and so forth, there's basically zero correlation, which means each blood collection introduces a factor which cannot be replicated. And of course that is a problem if you have a biomarker which looks like this on a technical scale but is impossible to replicate. This protein, we know, is part of the smooth muscle system, so we know it's actually coming from puncturing your skin and the vein. But again, it's a protein you would not have considered, and yet you can actually measure it.

We also looked at seasonal fluctuations. This is of course interesting in the sense of what the differences are if you measure your protein during winter compared to during summer: might the season have an effect on your protein levels? Let's just assume this would be a cut-off level for this protein. Here you would actually be above the cut-off, and the doctor may say, well, we may need to check up on you a bit more, whereas during summer you actually have a much lower level. So of course these parameters, which also relate back to the time point and the age of the sample, are important things you need to consider when you do your measurements.
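The comparison between technical precision and between-visit correlation described above can be sketched in a few lines. This is only an illustration on simulated numbers, not the data or the pipeline from the study: the function names, the number of subjects, the noise levels and the two simulated proteins are all assumptions made for the example. One simulated protein has a stable level per person, the other gets a fresh random value at every blood collection, and both are measured twice with small technical noise.

```python
import random
from itertools import combinations
from statistics import mean


def pearson(x, y):
    """Pearson correlation between two equally long lists of numbers."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def technical_and_visit_correlation(rep1, rep2):
    """rep1, rep2: one row per subject, one column per visit (hypothetical layout).

    Returns (replicate-vs-replicate correlation over all samples,
             mean correlation between every pair of visits).
    """
    flat1 = [v for row in rep1 for v in row]
    flat2 = [v for row in rep2 for v in row]
    technical = pearson(flat1, flat2)

    # Average the two replicates, then compare visits with each other.
    signal = [[(a + b) / 2 for a, b in zip(r1, r2)] for r1, r2 in zip(rep1, rep2)]
    n_visits = len(signal[0])
    columns = [[row[j] for row in signal] for j in range(n_visits)]
    between_visits = mean(
        pearson(columns[i], columns[j]) for i, j in combinations(range(n_visits), 2)
    )
    return technical, between_visits


# --- Simulated data (assumption: 200 subjects, 4 visits, arbitrary units) ---
rng = random.Random(0)
N_SUBJECTS, N_VISITS = 200, 4


def measure_twice(true_levels, noise_sd=0.05):
    """Simulate two technical replicates with small measurement noise."""
    def one_run():
        return [[v + rng.gauss(0, noise_sd) for v in row] for row in true_levels]
    return one_run(), one_run()


# Protein 1: each person has a stable level with small visit-to-visit drift.
stable_true = []
for _ in range(N_SUBJECTS):
    person_level = rng.gauss(0, 1)
    stable_true.append([person_level + rng.gauss(0, 0.1) for _ in range(N_VISITS)])

# Protein 2: the level is redrawn at every collection (sampling artifact).
artifact_true = [[rng.gauss(0, 1) for _ in range(N_VISITS)] for _ in range(N_SUBJECTS)]

stable_tech, stable_visit = technical_and_visit_correlation(*measure_twice(stable_true))
artifact_tech, artifact_visit = technical_and_visit_correlation(*measure_twice(artifact_true))

print(f"stable protein:   technical r = {stable_tech:.2f}, between-visit r = {stable_visit:.2f}")
print(f"artifact protein: technical r = {artifact_tech:.2f}, between-visit r = {artifact_visit:.2f}")
```

In this toy setup both proteins show high technical correlation, but only the first one keeps a high correlation between visits, which is the pattern to check before treating a protein as a candidate biomarker.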
I'm going to finish with this project. This is also something which hopefully will be coming out in a couple of weeks' time; we've received some very good comments from the first round of review that we will certainly be able to handle. This is a bit of a hobby study, actually, because in most of the projects we have, we know the age of the person who donates the sample. So we just started to collect a couple of studies, and it ended up being 4000 in total, where we looked at the same protein over and over again. As you can see here, the slopes may be different, but in all these studies we could see a consistently increasing trend over time. So basically, using this as an additional passenger in the different studies, we found, more or less as a byproduct, a protein which is associated with age. We've been validating this, and this is the meta-analysis that we did, so the p-value is, I think, far better than in most studies you've seen, because it's really consistent across time.

What we also looked at: it's one thing that this protein, HRG, tells you about your age, but it also tells you much more about the risk of dying. It seems that elevated levels of this protein increase your risk of dying compared to low levels, specifically up to eight and a half years prior to death. And this is for all-cause mortality, so it's not linked to any particular cause of death like cancer or cardiovascular disease.

Of course we did a lot of validation of the antibodies. We also did GWAS, and this is the protein array that we've run with Peter's group. These are 20,000 spots, and you can only see a single peak for this antibody, which was quite surprising, but we know we can measure HRG. Then what we also did, and this is fairly new for us, is that we took this genetic information we had about these individuals, and there was
another antibody against the same target. We correlated the slope, meaning, what you do in these pQTLs is you have a box plot with three different groups, so it's the AA, AT or TT genotype, and then you just superimpose a trend. What we could see is that these two antibodies have the exact same list of pQTLs, but they have an opposite trend in their association. That's also seen here in these distribution plots: the red genotype is lower for this antibody, whereas the red genotype is higher for this antibody. So it's completely new data; nobody has done something like this before.

What it says we have not really fully understood, but what we believe right now is that every person has a particular variant of that protein, and that the antibodies have a particular affinity to that protein variant. So we think it's likely that many proteins we study nowadays don't in reality differ in concentration; they just differ in the variant they are, and the different methodologies, be it mass spec or affinity-based assays, just report different signals because it's a different variant.

I guess it's a particular challenge for both assay types. In mass spec, when you look at the libraries that you use to match your data, this is done on canonical sequences. Of course you can do proteogenomics approaches, and that's not always possible, but it will be in the future, because you need to have that understanding to know what to look for. And it's the same thing for affinity assays with a small variant. Let's say you exchange a hydrophilic amino acid: you have a non-synonymous mutation, meaning that a serine suddenly changes to a proline. You will change the behavior of that protein, either in the way it's recognized in your test or in how it actually interacts with other proteins, and thereby it may be more accessible for, let's say, different types of measurements.

I'd
like to thank you for your attention, and of course all these people.

In summary, today you have studied the human plasma proteome using affinity-based methods, which could enhance biomarker discovery, validation and integration from basic research towards clinical usage. You also saw how Atlas Antibodies from the HPA project can help us in getting a detailed understanding and background of the affinity-based methods. Dr. Joakim also provided a brief understanding of GWAS and how patient information is important to understand data set variations. In the next lecture we will listen to a clinician, Dr. Sachin Jadho, who will talk to us about clinical considerations for omics studies. Thank you.