All right, I think we're going to get started, everybody. Good morning. Today we have three medical students, two of whom are visiting us from out of town and one of whom is from the University of Utah. First up, we have Eric Ostler. He's going to talk to us about statistical methods in ophthalmology. He's a University of Utah medical student, just rotating with us right now. So, Eric. Thank you very much. As you said, my name is Eric Ostler, and my talk today is entitled Statistical Methods in Ophthalmology: Unlocking the Black Box. I think that oftentimes in ophthalmology, when we're doing our data analysis, it's kind of like a black box, because we don't always know what's happening inside of it. We sometimes just present our raw data to the statistician, and he returns the analyzed outcomes. If we don't ask questions, we may just assume that everything is correct, and this can cause a systematic problem: the statistician has spent his time learning statistical theory, we've spent our time learning about the eye, and there's very little overlap in knowledge and interest. This means that if the statistician overlooks some of the subtleties in analyzing ophthalmologic data and a mistake is made, there's no way to catch it. I recently experienced this while working with a statistician. He recommended five different techniques to analyze the data before we arrived at an acceptable approach; each of his previous recommendations was either invalid, inappropriate, or inefficient. This is when I realized that there is a systematic problem, and it felt more like guess-and-check than analysis. And I didn't realize the extent of the problem until I began a literature review. But before I go into that, I'd like to cover some background information.
I think one of the challenges in analyzing ophthalmologic data is that the number of eyes does not always equal the number of patients. When this is the case, we're collecting two eyes' worth of data from some of the individuals in the study, which means the data are correlated: two eyes from one person will always be more similar than two eyes from separate individuals. Thus, the data lack the independence required by some of the standard analytic techniques. Techniques that assume data independence include the t-test, linear regression, and analysis of variance, so when using these methods, we can't have any group that contains two eyes from one person. Additionally, when the number of eyes does not equal the number of patients, it's important to assess the level of correlation between the eyes. There are a number of different methods that can be used for this, but one of the most common is the intraclass correlation coefficient, or ICC. This ranges from zero to one, zero meaning that there's no correlation within the data and one meaning that the data are perfectly correlated. If this is not measured or not well understood, there's a high likelihood of using an inappropriate or inefficient technique. Now I'd like to discuss my literature review. I looked at two very similar studies: Murdoch et al from 1998 and Karakosta et al from 2012. Both systematically reviewed publications from selected journals and categorized the results by the analytic approach used. Despite being published 14 years apart, the two studies had grossly similar results. They found that studies do not regularly assess the correlation of the data. They also found that many studies do not use all of the data available to them, meaning that they're inefficient.
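To make the ICC less of a black box, here is a minimal sketch of a one-way ICC for the two-eyes-per-patient case; the function name and the intraocular-pressure-style numbers are purely illustrative, not taken from the talk:

```python
# Sketch of a one-way intraclass correlation, ICC(1,1), for paired eyes.
# Hypothetical data: one measurement per eye, two eyes per patient.

def icc_one_way(pairs):
    """ICC(1,1) for k=2 eyes per patient: (MSB - MSW) / (MSB + (k-1)*MSW)."""
    n = len(pairs)                       # number of patients
    k = 2                                # eyes per patient
    grand = sum(a + b for a, b in pairs) / (n * k)
    # Between-patient mean square: variation of patient means around the grand mean
    ssb = sum(k * ((a + b) / k - grand) ** 2 for a, b in pairs)
    msb = ssb / (n - 1)
    # Within-patient mean square: variation between a patient's two eyes
    ssw = sum((a - (a + b) / 2) ** 2 + (b - (a + b) / 2) ** 2 for a, b in pairs)
    msw = ssw / (n * (k - 1))
    return (msb - msw) / (msb + (k - 1) * msw)

# Eyes identical within every patient -> perfect correlation
print(icc_one_way([(10, 10), (14, 14), (18, 18)]))  # 1.0
```

The two extremes behave as described in the talk: identical fellow eyes give an ICC of one, while partially similar fellow eyes give a value between zero and one.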
And finally, they found that a large portion of studies use invalid techniques. The chart here, from Murdoch et al, classifies the 79 studies they reviewed. The first thing I'd like to emphasize is that despite the challenges in analyzing ophthalmologic data, there's only a limited variety of scenarios, as we can see here. Next, I'd like to break down this chart. The first category is analysis at the level of the individual. This is seen with rare diseases that present in only one eye, or in cases where it requires two eyes to make a diagnosis, like strabismus. In this case, the number of eyes equals the number of patients. The next category is studies that analyzed only one eye from each individual, and there's a variety of methods you can use to select that eye: random selection, including only right or only left eyes, or selection by clinical criteria. These methods are valid, but they're often inefficient, and they can introduce bias. The third category is summarizing data: you can either pool the data or take an average of the two eyes for each patient. This again is valid, but often inefficient. The fourth category is analysis at the ocular level, meaning there are two eyes for every patient. The first group here did not correct for the correlation between eyes; this was approximately 20% of studies. That's an invalid technique, and it inflates the type I error by 5 to 20%. The second group did correct for the correlation of the data; this was only 3% of studies, and it was valid and efficient. The final category is paired eye comparisons, using the fellow eye as a control. This is a valid and powerful method. After evaluating each of these types of studies, I was able to construct an algorithm that simplifies the process of determining the appropriate method of analysis.
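The "summarizing data" category above takes only a couple of lines in practice: average each patient's two eyes so that every patient contributes a single independent value. The patient IDs and pressure values below are made up for illustration:

```python
# Hypothetical intraocular pressures (mmHg), two eyes per patient.
eyes = {"pt1": (21.0, 23.0), "pt2": (18.0, 18.5), "pt3": (25.0, 24.0)}

# One independent value per patient: the mean of the two eyes.
# These means can then go into a standard t-test or regression.
per_patient = {pid: sum(vals) / len(vals) for pid, vals in eyes.items()}
print(per_patient["pt1"])  # 22.0
```

This is valid because the per-patient means are independent of one another, but it is often inefficient: within-patient information is averaged away.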
I'd like to walk you through it quickly. We begin with continuous data, and the first thing we need to do is determine whether the number of eyes equals the number of patients. If it does, then we can use standard techniques, such as a t-test or linear regression. If the number of eyes does not equal the number of patients, then we have eye-specific findings, and it's important to calculate the correlation in the data. Next, we want to determine whether the data are paired or unpaired. If the data are paired, then we have a fellow eye study, as seen here, and we determine whether the number of eyes is exactly twice the number of patients, meaning there are exactly two eyes for each patient. If that's the case, we can use a paired t-test. If not, we can use a method called mixed-effects linear regression, which accounts for the mismatch between patients and eyes. Finally, there's the case of unpaired data: there are more eyes than patients, but not necessarily two eyes for each patient. This is where a lot of the research we do falls, and this is where it gets complicated. I'd like to discuss three scenarios. The first is when the ICC equals zero. This means the data have no correlation; if that were the case, the data would be completely independent, we could use all of the available data, and, because of that independence, we could again use standard techniques. The problem is that this is never the case in ophthalmology. There's always correlation between one eye and the other, so we should not be using this method. The next scenario is when the ICC equals one. This means the data are perfectly correlated, and the left eye is equivalent to the right eye.
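The decision flow just described can be sketched as a small function. This is a simplified rendering of the algorithm for continuous data only; the branch labels are mine, and real studies would still need the qualifying checks the talk alludes to:

```python
# Sketch of the method-selection algorithm for continuous ophthalmic data.
# Branch labels are illustrative summaries, not formal recommendations.

def choose_method(n_eyes, n_patients, paired, icc=None):
    if n_eyes == n_patients:
        # One eye per patient: data are independent across patients.
        return "standard techniques (t-test, linear regression)"
    if paired:
        # Fellow-eye study: one eye treated, one eye as control.
        if n_eyes == 2 * n_patients:
            return "paired t-test"
        return "mixed-effects linear regression"
    # Unpaired, eye-specific data: the ICC guides the choice.
    if icc == 0:
        return "standard techniques (rare: fellow eyes are never uncorrelated)"
    if icc == 1:
        return "discard one eye per patient, then t-test / regression"
    return "correct for correlation, then analyze (e.g., rank-based methods)"

print(choose_method(80, 40, paired=True))   # paired t-test
```

Calling it with different counts walks the same branches as the flowchart: equal eyes and patients gives the standard-technique branch, a paired study with an eye count short of twice the patients gives mixed-effects regression, and so on.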
If this is the case, then it's efficient to discard one eye per patient; this restores independence, and we can use a t-test or linear regression. Again, it's very rare that we have perfect correlation. There are times when we would choose this method despite having an ICC of less than one, perhaps using clinical criteria to select the eye, but that causes a loss of efficiency in our study. The final scenario is when the ICC is between zero and one, and this is most of the time. When this is the case, we can correct for the correlation and then use methods like the Wilcoxon signed-rank test or the Mann-Whitney U test. This algorithm is somewhat generalized, leaving out some of the qualifying information, but it does illustrate a correct flow of events. I believe that using this type of tool would simplify the selection process and improve our understanding of statistical methods. It enhances our ability to design powerful studies by eliminating potentially invalid analyses, optimizing inefficient ones, and allowing us to use more of our collected data. It would also improve our ability to discuss our results and interpret the results of other studies, helping us better understand strengths and weaknesses and prevent potential bias. And most importantly, we don't have to learn anything about statistical theory. Just as a side note, so that you're all aware, the Study Design and Biostatistics Center has a multi-user license for Stata, a data analysis and statistical software package. I was able to get a copy from them, and it's now available in the library on the fifth floor. These are my references. Thank you for your time and for the opportunity to speak today. Does anyone have any questions? Yes, Dr. Olson.
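As a concrete instance of the rank-based tests just mentioned, the Mann-Whitney U statistic can be computed by direct pair counting. The data below are hypothetical, and note the hedge: this statistic alone does not correct for inter-eye correlation, so the correlation adjustment described above would still be needed before applying it to two-eyes-per-patient data:

```python
def mann_whitney_u(x, y):
    """Mann-Whitney U for sample x versus sample y, by pair counting.

    Counts pairs (a, b) with a > b; ties count one half. A real analysis
    would use a statistics package to get a p-value from U.
    """
    u = 0.0
    for a in x:
        for b in y:
            if a > b:
                u += 1.0
            elif a == b:
                u += 0.5
    return u

# Hypothetical per-eye measurements from two treatment groups.
print(mann_whitney_u([12, 15, 11], [9, 10, 14]))  # 7.0
```

When every value in the first group exceeds every value in the second, U equals the number of pairs; when the groups overlap, U falls between zero and that maximum.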
It's interesting, because a lot of us are a little uncomfortable that these mistakes happen over and over again. Another one is when we say something isn't different because... you know, a chance of 5% is kind of the gold standard, so the odds are overwhelming that they weren't, but you had such a small set... And you'll have a study which will... Yeah, I agree, and I think that's why it's important to be able to use this correlation: if it allows us to use more of our data, then it will increase the power, which is something we're not doing right now. When we have to discard data, we lose that power and may fail to show a real difference. Yes? No, it's actually... well, it's shown in a few different papers how to find that. I didn't put the equation up, but it's very simple. Yeah. And that's what we're looking at. Maybe I didn't make this clear: the intraclass correlation looks at the correlation between the two eyes of one individual. Some factors are more closely related than others, but in general, we can find how closely the data are related. Yes, very interesting. I'll have to look into that. Thank you. Any more questions? Thank you very much.