In this lecture I want to talk to you about sensitivity and specificity, and about positive and negative predictive values. They are incredibly important, particularly from the clinical point of view, whether you [inaudible] or whether you have the test result in your hand and want to know what the accuracy of that is. Now, there is a lot in this IPython notebook and I'm going to take you through the four topics we mentioned, but at the end you'll see that we also have correcting the positive predictive value, incidence, cumulative risk and prevalence, the differences between those, and rates, risks and odds ratios. I think those are beyond the scope of an introductory course, but I'm going to leave them there in the notebook, so that you can read through the very powerful information that you get from those calculations.

Let's look at the interesting world of sensitivity, specificity and then positive and negative predictive values. As per usual, we are just going to import our style sheet. We are not going to use much coding in this lecture: we are just importing pandas as pd and then my normal filter warnings. I'm not even going to plot anything, so let's go.

So, sensitivity and specificity. You've got to think about sensitivity and specificity in terms of how accurate special investigations, clinical findings or findings in the lab are. I'm going to use an example here, which we're going to calculate before we get to our mock data. Imagine I take a thousand volunteers and divide them into two groups. A thousand human volunteers, animal volunteers, a thousand lab specimens, it doesn't matter. I've got two groups. I'm going to use clinical data here, so let's say that some of them have a disease and in some that disease is absent. Think about it this way though.
I need a test that I can be absolutely 100% confident in, a test that really shows that something is there or is not there. As a clinical example, although it's not 100% accurate to use this example, imagine I could remove some tissue and send it for analysis, and the tissue comes back showing the presence or the absence of a disease. So I know this for sure. As a thought experiment, imagine there's an absolutely magical way of knowing whether a patient is in one group or the other.

Now that I know behind the scenes what the true situation is, I introduce a new test, one that I want to do some research on. When I do this test on these 1000 people, it comes back either positive or negative for that disease. If it's positive, the disease is present according to my new test, and if it's negative, the disease is absent according to my new test.

Now let's look at the four groups we can form. Start with the group where, behind the scenes, we know that the disease is there. If my new test comes back positive, it's a positive result and the patient really has the disease. I know that secretly behind the scenes, so I can call that a true positive. If, in that same group that I absolutely know to have the disease, my new test comes back negative, I know that's a false negative: the patient actually has the disease, but the new test that I'm doing my research on shows a negative result.

Then look at the group without the disease, where we absolutely know by some gold standard that the disease is absent. If my new test shows a positive result, that is a false positive, because the patient really doesn't have the disease. And if it comes back negative, of course that's going to be a true negative result.
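The four groups described above can be written down as a small lookup. This is only an illustrative sketch: the function name `classify` and its two boolean arguments are assumptions made for this example and do not come from the lecture notebook.

```python
def classify(has_disease: bool, test_positive: bool) -> str:
    """Name one new-test result, judged against the gold standard."""
    if has_disease:
        # Gold standard says the disease is present.
        return "true positive" if test_positive else "false negative"
    # Gold standard says the disease is absent.
    return "false positive" if test_positive else "true negative"


# Walk through all four combinations of gold standard and new test.
for has_disease in (True, False):
    for test_positive in (True, False):
        print(has_disease, test_positive, "->", classify(has_disease, test_positive))
```

The word "true" or "false" describes whether the new test agrees with the gold standard, and the word "positive" or "negative" is simply the result the new test gave.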
So those are the four groups we can identify. Now how does that help us with what sensitivity is and what specificity is? Well, sensitivity is just a very simple ratio: the true positives of my new test divided by everyone who actually has the disease. And specificity is just the true negatives of my new test divided by everyone without the disease. That helps us in deciding what test to use. How good would a test be at indicating that someone actually has the disease, versus how good would it be at excluding those without the disease? That is where we use sensitivity and specificity.

Here you see the little equations that I've written for this example. Remember, a thousand patients. Let's suppose only 10 of them have the disease and an overwhelming 990 do not. I know this by some gold standard test, so I can be absolutely sure that 10 individuals have it and 990 do not. Out of those 10 with the disease, imagine that 9 test positive, which are true positives, and one tests negative, so that patient is a false negative. From the 990 without the disease, let's imagine 90 test positive, which are false positives, and 900 test negative, which are true negatives.

Now I said sensitivity is the true positives divided by the sum of the true positives and false negatives, which is everyone with the disease, and specificity is the true negatives divided by the sum of the true negatives and false positives. So the denominator for the specificity is all 990 without the disease, and the denominator for the sensitivity is the 10 who absolutely have the disease. Let's do those calculations with some simple Python code. My true positives were 9, my false negatives were 1, my false positives were 90 and my true negatives were 900, so that makes up my 1000 patients.
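A minimal sketch of that calculation, using the counts from the worked example. The variable names are assumptions chosen for readability and may differ from those in the lecture notebook.

```python
# Counts from the worked example of 1000 patients.
true_positive = 9     # disease present, new test positive
false_negative = 1    # disease present, new test negative
false_positive = 90   # disease absent, new test positive
true_negative = 900   # disease absent, new test negative

total = true_positive + false_negative + false_positive + true_negative

# The brackets make sure each sum is evaluated before the division.
sensitivity = true_positive / (true_positive + false_negative)
specificity = true_negative / (true_negative + false_positive)

# Both values are fractions, so multiply by 100 to print percentages.
print("Patients in total:", total)
print("The sensitivity is", round(sensitivity * 100, 1), "%")
print("The specificity is", round(specificity * 100, 1), "%")
```

Without the brackets, Python would divide first and add afterwards, which gives a different and meaningless number.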
So what did I say sensitivity was? It's the true positives divided by the sum of the true positives and false negatives. There's my divide sign, but I've got to respect arithmetic order: I put the sum in brackets so that it gets executed first, before the division. So it's going to be 9 divided by 10 in essence, and I'm going to put that into this computer variable that I call sensitivity. I could call it whatever I want. Specificity is the true negatives divided by the sum total of these two, which is the sum total of people without the disease.

Let's use the print command here, for Python 3.x. It's print, then brackets, and I separate everything that I want printed on a line by commas. First there's a string that I want to print, and a string has to go in quotation marks: the sensitivity is. Remember, when I use commas it actually puts a space there, so I needn't have put the space there. Then I want to print out this variable sensitivity, but remember it's just a fraction, so I multiply it by 100, then a comma, and then a string with the percent sign. If we run this print command, the sensitivity is 90 percent. Again, that means the test will pick up 90 percent of people who actually do have the disease. Let's do the same for specificity. I've calculated specificity there, so let's run that code, and we see that the specificity is 90.9 percent. So it'll pick up 90.9 percent of patients without the disease by virtue of a negative result. That is sensitivity and specificity.

Let's look at our mock data. We're going to use the read underscore csv method, which takes this file and reads it into a pandas data frame, and I'm going to call that this computer variable data. As per usual, I'm just checking that things are imported correctly. There we go. Now let's make a decision. Let's decide that my histological evaluation, where I've sent the appendix away for analysis by the lab under the microscope, can absolutely be
my gold standard, and it will divide my patient set into those with and those without appendicitis. From those I'm going to make two new data frames. I'm going to call the one appen underscore pos, and that equals, remember how to do this, the data data frame, the whole thing, indexed in square brackets with the column histo equals equals yes. That will be a data frame that only contains the histo positive patients, and I'm doing the same for the histo negative. Pay close attention to the syntax here.

Now I'm going to have a true positive all, as I'm going to call it, from the positive group. What do I want to test? Imagine white cell count is my test under investigation, and I'm going to say a white cell count of 12 or more indicates a positive test for me. This is just a silly example. So 12 or more is a positive test and less than 12 is a negative result, because you've got to decide on which side 12 itself goes.

So here we go. We have this computer variable, and I take this data frame of only the positive patients and look in the white cell count column, in square brackets and quotation marks, for values more than or equal to 12. That is the order in which we write it: more than, then equal to. I'm going to have a false negative all, where it's the positive data frame and the white cell count column where the values are less than 12. I do the same thing for the negatives: the true negatives are the appendix negative ones with a white cell count less than 12, and the false positives are the appendix negative patients whose white cell count was 12 or more.

Excellent. Now we've got these four new data frames: tp all, fn all, tn all and fp all. We just want to look at their white cell count column and count how many there are, after dropping the NAs. The NAs would have been dropped anyway, but just for safety's sake. So I see I have 87 true positives in the white cell count.
That is the 87 that remain. In the false negatives there are 31, in the true negatives 18 and in the false positives 11. So I've made true positive, false negative, true negative and false positive, and I've just read them off the counts above, as simple as that. And I'm saying sensitivity equals true positive divided by true positive plus false negative, and specificity is true negative divided by true negative plus false positive. Let's run that code, and now I'm just going to use the print statement again and write whatever I want: the sensitivity of a raised white cell count in appendicitis is, then sensitivity times 100. And there we go, the sensitivity of a raised white cell count in appendicitis is 73 percent. So a white cell count of 12 or more is going to pick up 73 percent of the patients who actually have appendicitis, so it's going to miss a few. Let's look at the specificity. If I receive a value less than 12, that is only going to pick up 62 percent of the ones who actually do not have appendicitis. That's the way to see it.

Let's move on to predictive values. They are kind of more interesting. Now I'm looking at it the other way around. I've got the test result back, and my patient's test result is either positive or negative. If it's positive, how sure am I that the patient really has the disease? And if the test comes back negative, how sure am I that the patient actually doesn't have the disease? Is this test predictive of having the disease if it is positive, or of not having the disease if it's negative?

Now I want to warn you about reading positive and negative predictive values in the literature. They are very sensitive to the prevalence of the disease in the particular study. In a study like this, where we just looked at the appendices that were cut out and sent to the lab for histological examination, the patients were super selected. They do not reflect the incidence of appendicitis in the greater population out there, so the
prevalence of appendicitis is much higher in this study group, because they're all in for surgery. I've selected for these patients. They don't come from a proper population distribution, if I can use a term like that.

Now the equations are very easy. For the positive predictive value, the test comes back positive, and what prediction is there that the patient really has the disease? It is taken over all the positives: true positive divided by true positive plus false positive. You can see why it would be so dependent on our prevalence. Then the negative predictive value: the test comes back negative, and how sure are we that the patient actually doesn't have the disease? It is taken over all the negatives: true negative divided by true negative plus false negative.

So there we go. Remember our example from before, where we had 10 with and 990 without the disease. Let's just run that code and print it to the screen. Now look at this: the positive predictive value is only 9 percent. Remember, the sensitivity was 90 percent and the specificity was 90.9 percent, so we were quite sure that this was a good test. But now, if the test comes back positive, only nine percent of people will actually have the disease.

That's a very famous example. People use this example quite a lot, say for instance in mammography, where we could say that one out of 10 female patients might develop breast cancer in their life. It's a sad statistic. If you think about 10 with and 990 without, that's 10 out of 1000, which is 1 percent. I'm not quite sure what percentage I just quoted, but anyway, that's a truer reflection of a population out there. If you now look at a positive result that comes back from this mammogram, the predictive value of the patient actually having the disease is very low. That is set purely by the low prevalence of the disease in the larger population.
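The predictive values for the 10 with and 990 without example can be checked with a short sketch. The helper `ppv_from_prevalence` is an assumption added here for illustration: it rewrites the same ratio in terms of sensitivity, specificity and prevalence, which makes the dependence on prevalence explicit. The prevalences of 10 and 50 percent in the loop are hypothetical values, not figures from the lecture.

```python
# Counts from the worked example of 1000 patients.
true_positive, false_negative = 9, 1
false_positive, true_negative = 90, 900

# Predictive values read across the test result, not across the disease status.
ppv = true_positive / (true_positive + false_positive)
npv = true_negative / (true_negative + false_negative)

print("Positive predictive value:", round(ppv * 100, 1), "%")
print("Negative predictive value:", round(npv * 100, 1), "%")


def ppv_from_prevalence(sensitivity, specificity, prevalence):
    """Positive predictive value at a given prevalence (all arguments are fractions)."""
    true_pos_fraction = sensitivity * prevalence
    false_pos_fraction = (1 - specificity) * (1 - prevalence)
    return true_pos_fraction / (true_pos_fraction + false_pos_fraction)


sensitivity = true_positive / (true_positive + false_negative)
specificity = true_negative / (true_negative + false_positive)

# 0.01 is the prevalence in the example; 0.10 and 0.50 are hypothetical.
for prevalence in (0.01, 0.10, 0.50):
    value = ppv_from_prevalence(sensitivity, specificity, prevalence)
    print("Prevalence", prevalence, "-> PPV", round(value * 100, 1), "%")
```

At a prevalence of 0.01 the helper returns the same value as the direct count-based calculation, and the same test gives a higher positive predictive value as the prevalence rises.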
If the test comes back negative, it really is a negative test: it's got a 99.9 percent negative predictive value. The numbers I used are not an absolutely true reflection. Obviously they are rounded so that one can get some nice results, but it shows you how sensitive positive and negative predictive values, and really positive predictive values, are to the prevalence.

If we use the same computer variables that we used for the appendix data, look at this: a negative predictive value of only 36 percent and a positive predictive value of 88 percent. These look a bit weird, but there is a big problem with our prevalence.

Good, so I hope you understand sensitivity and specificity and positive and negative predictive values. As I mentioned, in the rest of this notebook you can read about how to correct the positive predictive value from your study: how to use data from the population, the actual prevalence of the disease in the population, to convert the positive predictive value of your study to another population, say the true population prevalence out there. I show you how to do that there, and then the rest of the notebook gets into the very exciting world of incidence, cumulative risk and prevalence and the differences between those, and then rates, risks and odds ratios. Excellent.
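The appendix analysis described earlier can be sketched in pandas as follows. This is an illustration under stated assumptions, not the notebook's code: the small data frame is hypothetical stand-in data, and the names `wcc`, `appen_pos`, `appen_neg` and `diagnostic_metrics` are guesses made for this example. Only the column value `yes`, the cut-off of 12 and the four reported counts (87, 31, 18 and 11) come from the lecture.

```python
import pandas as pd

# Hypothetical stand-in for the mock data; column names and values are assumptions.
data = pd.DataFrame({
    "histo": ["yes", "yes", "yes", "no", "no", "yes", "no", "yes"],
    "wcc": [14.2, 11.0, 16.5, 12.3, 9.8, 12.0, 7.5, None],
})

# Split on the gold standard (histology).
appen_pos = data[data["histo"] == "yes"]
appen_neg = data[data["histo"] == "no"]

# A white cell count of 12 or more counts as a positive test.
# Comparisons with a missing value are False, and count() skips missing values too.
tp = int(appen_pos[appen_pos["wcc"] >= 12]["wcc"].dropna().count())
fn = int(appen_pos[appen_pos["wcc"] < 12]["wcc"].dropna().count())
tn = int(appen_neg[appen_neg["wcc"] < 12]["wcc"].dropna().count())
fp = int(appen_neg[appen_neg["wcc"] >= 12]["wcc"].dropna().count())


def diagnostic_metrics(tp, fn, tn, fp):
    """Return sensitivity, specificity and both predictive values as fractions."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


# Metrics for the hypothetical frame, then for the counts reported in the lecture.
print(diagnostic_metrics(tp, fn, tn, fp))
reported = diagnostic_metrics(tp=87, fn=31, tn=18, fp=11)
for name, value in reported.items():
    print(name, round(value * 100, 1), "%")
```

Keeping the four counts separate from the metric calculation means the same function works whether the counts come from a data frame or are typed in by hand.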