 Today Professor Josh LaBaer will discuss biomarkers in more detail, along with the different phases of biomarker detection. He will then continue talking about the importance of statistics for a biomarker discovery program. Biomarker discovery programs are usually very challenging because, to find a real biomarker that works globally, one needs to analyze a large number of samples and analyze the data in many ways to ensure that a given protein or candidate biomolecule really meets the needs of detection or has therapeutic significance in the clinic. So biomarker discovery programs usually depend on a big team of clinicians, technologists, statisticians and many others who work together to produce meaningful, reproducible data and make sense of these experiments. I hope today's lecture will give you more insight and intricate detail about how to do biomarker discovery research. So let us welcome Dr. Josh LaBaer for today's lecture.

So now we are going to talk a little bit more specifically about biomarkers. I won't go through this part a lot because we kind of did this. We talked about why you would do it. You want to monitor disease. You might want to monitor whether therapeutics are working properly. You might be able to use the markers to predict toxicity of drugs or efficacy of drugs. We talked about using a marker to screen for disease, for early detection or even acute diagnosis. A patient shows up in the hospital with crushing substernal pressure in their chest and you want to know: did this patient just eat some bad food, or do they actually have an ongoing heart attack? A blood test would be very useful in that setting, and there are a couple of blood tests, but they're still not fast enough. You might need a test to look for infectious disease. Yesterday, if you went to the symposium, you heard about the need for a blood test for tuberculosis. 
This is an illness that infects a third of the population on our planet and it's one of the top ten killers of all people and yet it's very difficult to diagnose. And then to personalize treatment of therapy, again biomarkers may be helpful for that. So all of these are reasons why you'd want biomarkers. So this is sort of a different way of saying what I've told you earlier. This is based on a publication from the Early Detection Research Network at the National Cancer Institute in the U.S. This basically outlines if you're going to develop an early detection marker the phases that you should go through. First you should do exploratory studies. This is the kind of observed difference study I told you about earlier. Then you need to do a clinical assay and validation so you need to establish that the assay can detect the disease. Then they would say do a retrospective longitudinal study so you look, these may be old samples but you're looking at samples collected over a period of time to ask, you know, does the marker change when the patient goes from no disease to disease? So that's phase three. Phase four would be to do a prospective study, we talked about that earlier, collect samples going forward starting today and asking does the marker actually identify those people who are ill? And then cancer control would be to implement the use of that marker in a large scale screening population. So I'm going to walk through about six or seven rules for biomarkers and then let's see if we can understand them all. So the first goal, and I told you about this earlier, is to define your goal clearly. So what is it that you want to do? Why are you making a marker? What do you hope that it will help you accomplish? So let's diverge now and talk a little bit about the statistics of biomarkers. Okay so this is obvious, right? You have a population of people, some of the people have the disease and some people don't, right? That's true of any population anywhere. 
You got some people in that population that have it and some people that don't, right? For the moment now let's assume that this is absolute truth. This is truth with Roman characters. This is the absolute answer. And then now we also have a test. This is our biomarker right here and our test is designed to predict these two features. The test can either have a positive result or it can have a negative result. Basically we want the positive result to tell us when the disease is present and the negative result to tell us when the disease is absent. But as you know nothing is ever perfect. So let's look at the possible cases. The first mathematical thing we know is that A plus B plus C plus D are all the people in the study population. So this box here is everybody in our study. So first we've got this group over here. So we call those the true positives. True positives means the test was positive and they actually had the disease. So the test got it right. That's as it should be, right? Okay the second group is this group down here and those are what we would call the true negatives. In this case the test was negative and these people also did not have the disease. So once again the test was correct. So this box here and that box there that's when the test is working well. It does what it's supposed to do, right? So that group of people is A, this group of people is D. Okay so what about this? That's a false positive, right? What's a false positive? Okay I've got a lot of answers over here. Okay yeah but the test is positive, right? The test says they have it but they don't really have it. So why do we care? Why do we care? Is it bad to be false positive? So they might get inappropriate treatment? What else? Right, right so you're going to particularly if it's a disease like cancer there's a lot of emotional anguish to thinking that you're a cancer patient when you don't really have cancer, right? 
Right and then in some cases it's also you put them through needless testing to see if they have a disease and that can be either or both expensive and tiring for patients, right? So the consequences of false positives are as you all point out emotional angst, expensive testing and it reduces the success of a treatment regimen. This has to do with when you're actually testing your drugs. If the marker said that they had the disease but they didn't have the disease and your drug won't cure those people and so you'll get inappropriate results. Okay and then this group down here we call those false negatives. The test was negative but in fact they really have the disease. So what's the consequence of a false negative? Right, you miss the disease. The patient is ill, you told them you know what you're perfectly healthy. Go about your life, don't worry about it and then six months later they have the disease. Right, so this is the misdiagnosis. It's a missed opportunity for intervention. It is by far the most common cause for malpractice lawsuits in the U.S. The misdiagnosis of cancer is the biggest cause of huge lawsuits in the U.S. And so you don't want to be wrong about this. The consequences of a false negative are big. So rule number two of biomarkers is understand the consequences of being wrong. You need to know why it's important to have a good biomarker. Okay, so now how do we calculate the probability of disease? Well you take, right, so this is the disease and these are the people that have the disease. So what's the probability of disease mathematically here? Right, right, so A and C divided by everybody, right? So that is the probability of disease. So in your population this will tell you how often the disease occurs. Okay, now the next thing we want to talk about is sensitivity. Okay, sensitivity we define as a positive test in the presence of disease. That's sensitivity. And in this case mathematically it's A over A plus C. 
So you're saying the denominator is everybody with disease and A is just the people who the test were positive for. The closer that A is to A plus C, right? That means the smaller the negative, the false negatives, the better the test, right? So that's called sensitivity. Finding disease when it is present. I make all my students memorize this because people often forget this stuff. So this is a good measurement of how good the test is at finding it when it's there. Okay, specificity is something different. Specificity is ruling out the disease when it's not present, okay? So, that's sensitivity. Oh yeah, okay, yeah, so this is specificity. We're looking at the false, we're taking the people who are truly negative, divided by all the people who are negative. So how well can you count on the test to be negative when in fact there is no disease, right? In other words, how low are the false positives, right? And so we measure it by D over B plus D and that's the equation here. So it's ruling out disease when it's absent, okay? Let's do a little quiz question. If you're gonna design a test to be screen for cancer, which is more important. Sensitivity or specificity, I'm hearing vaguely sensitivity, right? That, and why is that? Well, I just told you that the biggest cause of malpractice lawsuits is the misdiagnosis of cancer. You don't wanna be wrong if you tell someone that they're cancer-free and they're not cancer-free. So in the case of cancer detection, sensitivity is probably the most important thing. You're willing to tolerate some false positives, if you have to, to make sure that you don't miss anybody. Okay, now let's talk about a different circumstance. Imagine someone going to a doctor, they're coughing up blood, they have weight loss, they have night sweats, right? And the doctor appropriately suspects that they might have tuberculosis, right? Those would be common symptoms. So which is more important here? Sensitivity or specificity? Hey, why? 
Raise your hand so I know who to, okay, right, for the TV. Yeah, I mean, the point is that sensitivity isn't an issue here because the patient's right there in front of you. You already know this person's sick. That's not the question anymore. The sickness is already a given. What you wanna know is, is it TB or not, right? You already suspect it's TB, and here what you're relying on is the test to be very specific to say yes, it really is TB and not some other, you know, some other illness. Okay, so now I'm gonna show you a little bit about, it turns out sensitivity and specificity in many cases work against each other. Because typically what happens is you have a test for a particular molecule or a typical biomarker. You set a threshold value, and you say if it's above this value, I'm gonna say it's positive. If it's below this value, I'm gonna say it's negative, right? And the challenge is that as you elevate or decrease that number, you will alter both the sensitivity and the specificity, and oftentimes in opposing ways. So I will tell you right now that these are data for a test for diabetes, and the idea behind this test was that they were gonna measure blood sugar after a meal. It turns out this is a bad test for diabetes, and no one uses it, okay? You'll see why in a minute. But it is a useful test to look at this because it does illustrate the concept a little bit, okay? So these are the blood sugars after eating a meal ranging from 70 milligrams per deciliter up to 200 milligrams per deciliter. And here, if you use this value as the cutoff, in other words, if you say that if you're above 100, you have diabetes, then this will be your sensitivity, and that will be your specificity, okay? So let's look at this example here. So at 80, if you use 80 as your cutoff, you're gonna be 97% sensitive, right? But you're gonna be only 25% specific. So one goes up, the other goes down. So what that means is that you're gonna identify 97% of the actual diabetics. 
The test will be positive in the presence of disease 97% of the time. But three quarters of the people who are disease free will also test positive, so you won't be very specific. Three quarters of the people who have no disease will test as if they had it, which is a huge number of false positives, right? Okay, by comparison, let's say, well, okay, that was too lenient. That allowed too many people in. Let's set a more strict number. Let's say it's 160, all right? So now the sensitivity is 47%, but the specificity is 99%. Okay, so what that means is that people who don't have diabetes will almost always test negative. 99% of the time, the test will correctly call them negative. But you're gonna miss half the diabetics, so you're gonna have a lot of false negatives. And so that's just to show you that sensitivity and specificity often work against each other. Of course, sensitivity and specificity are both values that specifically refer to the test itself. When you go to the doctor, that's not what you care about. You don't care how good the test is. What do you care about? What's happening to me? Tell me about me, I don't wanna know about your test. I wanna know how I am doing, right? And so there are two statistical terms we use to describe what's happening to me, all right? The first one is the positive predictive value. Okay, so what do I mean by the positive predictive value? The positive predictive value is: if the test is positive, what's the chance that I have the disease? So the test says I have it, do I really have it, right? And so to mathematically calculate that, that's shown here. It's basically taking the people who test positive and actually have the disease, divided by all the people who test positive. And that is the predictive value of the positive test, right? And that matters a lot to patients. Sometimes this other value matters even more. 
This is what we call the negative predictive value. So you had a test, we did a test for cancer or we did a test for birth defects in your child. How sure are we that you don't have cancer, or that your child doesn't have birth defects, right? So how confident is a negative value in telling you that you are disease free? And that is defined as taking the people who test negative and truly are disease free, divided by all the people who test negative. So positive predictive value and negative predictive value, this is what doctors care about, this is what patients care about. What's happening to me? How am I doing? Okay, so we're ready to do a little quiz now. Okay, so this is a quiz for, in this case, a test called Reward Amnesia. The disease occurs in one in a thousand people. It's a pretty common disease. The sensitivity of our test is 99% and the specificity is 95%, okay? We test a random individual for the disease and the test comes back positive. What's the chance that he actually has the disease? Okay, got it. Sensitivity is 99, specificity is 95. So how many people think that there's an 80 to 90% chance that he has the disease? Okay, got one of those. How many people think it's a 60 to 80% chance that he actually has the disease? How about 40 to 60? I'm going to assume you got it all wrong if you didn't answer. 20 to 40%? I got one for 20 to 40. So far, okay, how about 10 to 20? How about 0 to 10? Got a few of those? The rest of you all think that it's 90 to 100. How many think it's 90 to 100? Okay, got a few 90 to 100s. All right, all right, it's about 2%. Yeah, it's about 2%, right? Because remember, what affects you here is the incidence of the disease. It's very low, and it turns out that this is an important thing to remember about these statistics. And let me go back a second and point that out. Remember that sensitivity and specificity were down in these columns here, right? Those terms do not depend on the population. 
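The quantities defined so far can all be read off the 2×2 table. The sketch below is only an illustration and not part of the lecture: the function name `two_by_two_metrics` and the hypothetical population of 100,000 people are assumptions, and the counts follow the lecture's labels (A true positives, B false positives, C false negatives, D true negatives). Applied to the quiz numbers, it gives a positive predictive value of roughly 2%.

```python
def two_by_two_metrics(a, b, c, d):
    """Summary statistics for a 2x2 table.

    a: test positive, disease present (true positives)
    b: test positive, disease absent  (false positives)
    c: test negative, disease present (false negatives)
    d: test negative, disease absent  (true negatives)
    """
    total = a + b + c + d
    return {
        "prevalence": (a + c) / total,   # probability of disease
        "sensitivity": a / (a + c),      # positive test when disease is present
        "specificity": d / (b + d),      # negative test when disease is absent
        "ppv": a / (a + b),              # disease present given a positive test
        "npv": d / (c + d),              # disease absent given a negative test
    }


# Quiz numbers: 1 in 1,000 prevalence, 99% sensitivity, 95% specificity.
# The population size is an assumption chosen to make the counts easy to read.
population = 100_000
diseased = population // 1000        # 100 people have the disease
healthy = population - diseased      # 99,900 people do not

a = 0.99 * diseased                  # true positives
c = diseased - a                     # false negatives
d = 0.95 * healthy                   # true negatives
b = healthy - d                      # false positives

quiz = two_by_two_metrics(a, b, c, d)
for name, value in quiz.items():
    print(f"{name}: {value:.4f}")
```

The false positives from the 99,900 healthy people vastly outnumber the true positives from the 100 sick people, which is why the predictive value is so low even with a good test.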
It doesn't matter how often the disease occurs for them. They strictly measure the value of the test on whatever specific population they're being tested on. But positive predictive value and negative predictive value, they depend on how often the disease occurs. And I'm gonna walk you through that in a minute, but it's really important to remember that. When you hear somebody boast about the positive predictive value of a test, the first thing you need to ask was, what population did you test? How prevalent was the disease in that population? Okay, so let's walk through that. So this has to do with what's called Bayesian calculations, which includes looking not only at the probability, but also at what's called the prior probability. Which is when you begin your test, what was the likelihood to start with? And we're gonna use as an example the prostate specific antigen test. It's a very common test used to detect prostate cancer. It has a sensitivity of around 70% and a specificity of 90%. That's one of the best values you'll see anywhere. That's a pretty typical marker. When people, when I told you before, but people publish 99% and 99%, you don't believe it. Numbers like this, that's kind of what you'd expect from a pretty good marker. So now we're gonna ask the question, how does incidence or prevalence affect the positive predictive value of a test? We're gonna consider three different populations. We're gonna consider all men, in which case the incidence of prostate cancer is 35 cases in 100,000. We're gonna consider men who are over 75. In which case the prevalence of the disease goes up to 500 per 100,000. And then we're gonna consider men who already have a clinically suspicious nodule. A doctor did an exam and found a mass. So that in that case, there's about a 50% chance that they have cancer. Okay, so three different populations, these are the incidents. 
Remember I told you the probability of disease, A plus C over A plus B plus C plus D, that's what these numbers are, right here. Okay, so let's look at the first case. In this case, we're looking at the clinical nodule, a 50% likelihood to start that this person has cancer, right? So notice that this number here and that number there add up to 50,000, right? Remember I said that out of 100,000 men, 50,000 had it. So 50,000 have it and 50,000 don't. So that's appropriate, right? Remember I said that it has a 70% sensitivity. So that means of this number here, 70% or 35,000 are positive. And remember I said that it had a 90% specificity. So of this number, 45,000 test negative, right? So these numbers all add up to these numbers here. So now do the math. If you do the math, the positive predictive value is 88%. So even though you already have a suspected mass, and even though this test has a 70% sensitivity and a 90% specificity, the predictive value is still not 100%, it's still about 88%. Now let's look at a very different population. We'll go to the other end of the spectrum. Let's look at all men, where 35 men in 100,000 have the disease. So now let's do the math again. The population that has the disease is 35. The population that doesn't is everybody else, right? Out of 100,000, we still have a 70% sensitivity here. And here we still have a 90% specificity. Look at how good that test is now: 0.2%. So the take-home message here is that, depending on the population, the positive predictive value changes dramatically. We didn't change these numbers at all. Those numbers stayed the same throughout the whole discussion. The only thing that we changed was how often the disease occurs. And if the disease is rare, then the predictive value of the test drops quite a bit. This is one of the reasons why, at least in the US, we don't recommend that young men do treadmill tests for heart disease. 
Because the treadmill test was designed for older men, where it has good predictive value. But when the incidence of the disease drops like it does here, then the predictive value drops precipitously. And then the risk of a false positive becomes much higher. And then this is just to show you the more general circumstance of 500 and 100,000. So this is not far from the 1 in 1,000 we looked at in that quiz question. And again, here the test is around 3.4%. So it all has to do with the population you're dealing with. Okay, so I sat through a lecture in my institute where someone was boasting about his test that he had developed. And this is the clinical study he did, 450 cases and 150 controls. So the prevalence in this population is what? So the prevalence is very high, right? Because your three quarters of the people in your study have the disease. Three quarters of them have it, right? So he had this positive test and he said that his predictive value was 75%, right? And I looked at the numbers here and it turns out that if he had zero, if the tests were equally split between positive and negative, right? He would have, like half the time it's positive and half the time it's negative, he would have still had a predictive value of 75%. So he had to do nothing. The test had zero predictive value in a sense, and it would still have given him a positive predictive value of 75%. So it's pretty lame presentation. Okay, so rule number three, choose your population carefully, right? All right, so if you're gonna do an early biomarker study, then make sure you pick people who have early stage disease. Because that's when you want to get the disease. Will it apply, if the test will apply to people with different stages of disease, if it could be confounded by people with different diseases, maybe they have other things that could alter their CEA levels or have nonmalignant GI disease. 
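Returning to the PSA numbers and the 450-case study discussed above, the dependence of positive predictive value on prevalence can be checked with a few lines of arithmetic. This is a minimal sketch, not something shown in the lecture: the function name `positive_predictive_value` is an assumption, and the formula is just Bayes' rule with prevalence as the prior probability. The last call models the case-control example as a test that is positive half the time regardless of disease (50% sensitivity, 50% specificity).

```python
def positive_predictive_value(prevalence, sensitivity, specificity):
    """Probability of disease given a positive test (Bayes' rule)."""
    true_positive_rate = sensitivity * prevalence
    false_positive_rate = (1 - specificity) * (1 - prevalence)
    return true_positive_rate / (true_positive_rate + false_positive_rate)


# PSA test as quoted in the lecture: 70% sensitivity, 90% specificity.
psa_populations = {
    "all men (35 per 100,000)": 35 / 100_000,
    "men over 75 (500 per 100,000)": 500 / 100_000,
    "clinically suspicious nodule (50%)": 0.50,
}
psa_ppv = {
    label: positive_predictive_value(prevalence, 0.70, 0.90)
    for label, prevalence in psa_populations.items()
}
for label, value in psa_ppv.items():
    print(f"{label}: PPV = {value:.1%}")

# Case-control study with 450 cases and 150 controls: prevalence is 75%.
# A test with no information at all still shows a PPV of 75%.
coin_flip_ppv = positive_predictive_value(450 / 600, 0.50, 0.50)
print(f"uninformative test in the 450/150 study: PPV = {coin_flip_ppv:.0%}")
```

The sensitivity and specificity arguments never change across the three PSA populations; only the prevalence does, and the predictive value moves from well under 1% to nearly 90%.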
And just remember that sometimes it's more important to separate disease A from B than disease A from normal. So imagine if you're in a clinic and someone walks into your clinic and they have abdominal pain. They tell you that they've had abdominal pain for months and they've been losing weight, right? In that case, you're not necessarily interested in distinguishing colon cancer from healthy people. You might be more interested in distinguishing colon cancer from inflammatory bowel disease. You know the patient's ill. They've been suffering from GI symptoms for months. So you know there's something wrong. You're not separating normal from cancer. You're separating cancer from other GI diseases. And so always remember that if you're gonna do a study to find a biomarker for that setting, your study population should maybe compare people with non-cancer GI diseases against people with cancer. You need to make sure you don't extrapolate inappropriately. If you develop a test that's good in one population, it might not work in another population. For example, kidneys don't work as well in older people. If it's something that's excreted by the kidneys, the test may work in a 20-year-old, it may not work in a 60-year-old. Diseases on stomach cancer, for example, don't extrapolate to the USA. The risk factors for stomach cancer are much higher there. That population is different. And of course, patients in the hospital are different from healthy people. Okay, and then this is something that we talked about a little bit earlier already. This is what I call the fallacy of statistical significance. We kind of covered that already: just because there's a good p-value between A and B doesn't mean that you have a good biomarker. You should be using sensitivity and specificity, not p-values. And that's really shown on this thing here, which we've already covered, so I'm going to skip that. All right, so focus on sensitivity and specificity and not on statistical significance. 
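The fallacy of statistical significance can be made concrete with simulated data. The sketch below is purely illustrative and uses invented numbers, not data from the lecture: it assumes a hypothetical marker whose level in cases is shifted by only 0.2 standard deviations relative to controls, with 5,000 people per group, and it uses a large-sample two-sample z-test as a stand-in for whatever test a real study would use. The group difference is highly significant, yet the marker barely separates individuals.

```python
import random
from statistics import NormalDist, mean, variance

rng = random.Random(1)   # fixed seed so the run is repeatable
n = 5000                 # hypothetical number of people per group

# Invented marker levels: cases are shifted by 0.2 standard deviations.
controls = [rng.gauss(0.0, 1.0) for _ in range(n)]
cases = [rng.gauss(0.2, 1.0) for _ in range(n)]

# Large-sample two-sample z-test for a difference in means.
standard_error = (variance(cases) / n + variance(controls) / n) ** 0.5
z = (mean(cases) - mean(controls)) / standard_error
p_value = 2 * (1 - NormalDist().cdf(abs(z)))

# Use the midpoint between the group means as the diagnostic cutoff.
cutoff = (mean(cases) + mean(controls)) / 2
sensitivity = sum(x > cutoff for x in cases) / n
specificity = sum(x <= cutoff for x in controls) / n

print(f"p-value:     {p_value:.2e}")
print(f"sensitivity: {sensitivity:.2f}")
print(f"specificity: {specificity:.2f}")
```

With enough samples almost any small difference becomes significant, so the p-value mostly reflects sample size here, while sensitivity and specificity stay close to what a coin flip would give.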
All right, that's fair. So now I want to mention a little bit the, and we're coming to the end here, the omics trap. Because all of you are, many of you are going to be doing omics studies. That's what we all do these days. And you often hear this statement from people in the omics studies. I'm not going to look for a biomarker, I'm going to look for a pattern. I'm going to look for a signature. They might be doing it on DNA microarrays or protein arrays. But you have to remember that a pattern is really multiple parallel tests. They're doing a bunch of different molecular and statistical studies. And by doing multiple tests, they increase your sensitivity because each test has a chance of being positive. But they reduce your specificity because you have a higher now rate of false positives, right? So if you're going to do multiple parallel tests or look for patterns, my biggest advice is to get a statistician. Because you're going to need more careful statistics. And this class is not prepared, we're not going to do those statistics here. You just need to be aware that when you get to that stage, it's time to engage somebody. So we have two tests. Imagine that this test, they have two tests for the same illness. And they're testing for a positive. And they both have a positive predictive value of 95%. So imagine test A is positive and it has a probability that it's going to be positive is 5%. Test B might be positive, so it's chance to, chance alone is 5%. If you do both A and B, now, if you require them both to be positive, now your test is getting more stringent because the chance of a false positive is much lower now. But if you accept either one, now the chance is much higher because you now have to add the two effects together. Okay, so this is even, imagine if you do this with multiple tests. So now you have a whole series of tests. I'm going to just go. So now each of these is going to have a different positive predictive value. 
They're going to have all kinds of different results due to random chance. And if you add them all up, the numbers get to be outrageous. So again, the take-home message is get a biostatistician. So here's the example that I like to remind people of when they're doing multiple testing. And this is a lot like what you would see in an omics study. So if I asked every one of you to take out a coin and flip it and mark down whether you got a heads or a tails, what do you think the likelihood is that the result on the coin would predict the gender of the individual who flipped the coin? Right? Nothing, right? Okay, now let me change that. Let's say that I gave you each 10,000 coins to flip. And you went one by one and flipped every coin and you marked down heads or tails. What's the chance that among those 10,000 flips one of them, maybe the 5,635th of them, would correlate with the sex of the individual? There's a chance, right? Among those 10,000 tries, maybe one of them by chance alone would align, maybe not perfectly, but it would align with the gender of the individual. And you would say, aha, I found a biomarker: if you flip a coin 5,635 times, that one will predict the sex of the individual. But you'd be wrong. So how would you prove that you'd be wrong? You repeat the study. You do it a second time, 10,000 flips, right? And now the 5,635th doesn't work anymore. Now it's the 123rd, right? It's just random chance, some of them will happen to work. And that is what we do with omics studies. We test 10,000 things. We get one that works and we say, aha, I found a biomarker. But you tried 10,000 times. So you have to adjust for that by doing some kind of false discovery adjustment. So that's kind of what I did here. So imagine if people did all these studies, right? You have to keep the number of variables small relative to the population. This is especially a problem when the size of your population is small relative to the number of variables you're trying. 
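The coin-flip thought experiment above is easy to simulate. The following is a minimal sketch under stated assumptions, not the lecturer's own demonstration: the class size of 30 people, the random 0/1 labels standing in for gender, and the helper names `flip_study`, `accuracy` and `best_coin` are all made up for illustration. It picks the single coin that best "predicts" the labels in a first study and then checks the same coin in a repeat study.

```python
import random


def flip_study(n_people, n_coins, rng):
    """One row of coin flips (0 = tails, 1 = heads) per person."""
    return [[rng.randint(0, 1) for _ in range(n_coins)] for _ in range(n_people)]


def accuracy(flips, labels, coin):
    """Fraction of people whose flip of `coin` matches their label."""
    hits = sum(1 for row, label in zip(flips, labels) if row[coin] == label)
    return hits / len(labels)


def best_coin(flips, labels):
    """Index of the coin that agrees with the labels most often."""
    n_coins = len(flips[0])
    return max(range(n_coins), key=lambda coin: accuracy(flips, labels, coin))


rng = random.Random(0)            # fixed seed so the run is repeatable
n_people, n_coins = 30, 10_000    # hypothetical class size, 10,000 coins each
labels = [rng.randint(0, 1) for _ in range(n_people)]  # stand-in for gender

# Discovery study: search all 10,000 coins for the best "biomarker".
discovery = flip_study(n_people, n_coins, rng)
winner = best_coin(discovery, labels)
discovery_accuracy = accuracy(discovery, labels, winner)

# Repeat study: same people, fresh flips, same winning coin.
replication = flip_study(n_people, n_coins, rng)
replication_accuracy = accuracy(replication, labels, winner)

print(f"best coin in the first study: #{winner + 1}")
print(f"accuracy in the first study:  {discovery_accuracy:.0%}")
print(f"accuracy in the repeat study: {replication_accuracy:.0%}")
```

The winning coin looks impressive only because it was selected from 10,000 candidates; in the repeat study it is just another fair coin, which is the reason for false discovery adjustment and independent validation.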
If you have a study of 100 individuals, 50 cases and 50 controls, but you're testing 10,000 variables, you have this risk of what's called overfitting. And then that's why if you repeat the study doing a completely different population. All right, so I kind of went through this. All right. So if your biomarker is a proteomic or expression pattern, the bottom line is get a good statistician. OK, the last couple of things I'm going to mention is where do you get your samples? Make sure that the sample that you use is relevant for the use of the test. So imagine if you're going to do a biomarker on early disease detection, right? We said that you're really going to be testing a healthy population. Healthy people are not going to be interested in giving you biopsies, nor would it be appropriate to put them through that risk, right? If you're going to take a test for healthy people, it should be a simple test. You're in maybe blood. It's got to be something that you can measure easily, maybe saliva. You can't rely on doing biopsies. On the other hand, if they already have cancer, then of course they might be willing to do that. You have to look at whether the sample will be stable. If it's a biomarker in blood, will it be stable in blood? Remember I told you about Paul Temst in the study where he could tell the difference between the tubes? Well, what it turns out is that one of the tube types was inhibiting a protease, and the other one was not. And what was causing the difference was proteolysis in the sample. So in that case, the material was not stable. So you need to know that what you're measuring is stable in your samples. You need to know if it changes in body states. If that molecule goes up and down after a meal, if it goes up and down with a sleep cycle, again, that's something that you have to consider. And then of course, if you're measuring samples from a tumor, you need to look at where you're taking your biopsy from. 
So rule number six is that the willingness of an individual to part with some of his liver is directly proportional to the gravity of his diagnosis. People do not give up parts of themselves easily. They only do so when they're really sick. So it's good to remember that. OK, part of biomarkers is knowing how to prepare your samples. How are you going to preserve them? That could dramatically affect the outcome. I already gave you the example with Paul Temst. Is the instrument robust and reliable? Is it going to give you the same answer every time you measure it? Is the chemistry robust? If you ship this sample to a hospital far away, will they get the same answer that your hospital gets here? And then of course, what controls are you going to use? So these are just some of the general things to think about. Will sample preparation affect the reading? Are you handling the samples properly? Are you going to freeze them? And then of course, you need to know if there are natural variations of the biomarker you're testing from person to person. Because if there's a lot of natural variation even among normals, that's going to make it more difficult to use that as a biomarker. And then there's this question of abundance of the biomarker. Is there enough of it in the sample that you can measure it? Will you be able to detect it when you want to detect it? So in the case of early detection biomarkers, is there going to be enough there in an early specimen from people with early disease that you can actually detect it? The marker may be very good at picking up cancer, but it may be too weak to pick it up in early disease. That was the case with the CA125 that I mentioned earlier. It was a good marker for distinguishing ovarian cancer. It's just not abundant enough in early disease to pick it up. Developing a robust, reliable test is half the game. 
Just because you found a molecule that looks good doesn't mean that you've got a biomarker. What you need now is to develop it into an actual diagnostic test. And then the last thing I'm going to mention is this one, which is that your markers are likely to be more believable if they relate to the biology of the disease. And I think a couple of you have already mentioned that. But just keep in mind that if you want the marker to make sense, look for markers that fit with what you think is going on in the disease. If it's sort of a random molecule, it'll be a lot harder to validate it. So I will stop there.

All right, so by now you are quite familiar with the importance of studying biomarkers. And you've also seen the challenges of performing the experiments to discover new biomarkers. And I must say that, as you noticed, it is a very, very challenging journey. And that's why actual clinical translation of biomarkers is not easy and not very successful either. A lot of candidate proteins have been discovered that have potential as biomarkers, especially for early detection of disease, or with prognostic or even therapeutic value. However, many of these biomarkers are not easily translatable to the clinic. The reason is that you need to do a lot of validation to ensure that the biomarkers identified in the discovery work really fit the purpose of the clinical assays and are able to serve large patient populations. Therefore, even if the biomarker discovery program is performed on a small number of samples, you then need to scale up to a really large number of samples to validate that these proteins actually show the kind of expression pattern which you discovered in the initial workflow. 
You've also learned the need to have a good team: clinicians who can give you the right samples to test your hypothesis, the right type of technology platforms where you can execute these experiments, and scientists who are good at big data analysis and can make reproducible sense of your data without compromising on data quality. So these are the considerations which are very crucial. And I must say that, despite all the odds and all the challenges, there are many biomarkers which are now getting translated to the clinic. They're getting approved by the US FDA, and there are some such stories, especially OVA 1 and OVA 4 and other proteins which are now coming to the market, giving you the motivation that if you do this kind of discovery workflow properly, it may eventually be translatable to the clinic. Thank you very much.