So, let us get going with a session discussing issues with published research. For now, I would like to list the major problems one sees when reading published research. This is motivated by a journal article that came out a few years back, which made the very controversial claim that most published research is actually false. What I am going to do is examine some of the reasons why a large possibility exists that what you read in a journal article may end up being false. Of course, I should point out that when you pick up one specific article, you can make no immediate comment as to whether it is true research or false research. The point of this lecture is that you have got to look for specific problems with the way the experiments are designed and the way the data is presented, and therefore be open to the possibility that what you are reading is false as presented. So, why is this a concern? The fact is that even if an experiment is perfectly thought through and carried out, it could still give you an incorrect result. A simple example is the coin toss experiment that I have been talking about all along. If you take a fair coin and toss it 100 times, you expect 50 heads to turn up; carry out an actual experiment with 100 tosses and you typically do not get exactly 50 heads. At first sight, the experimental result does not match your expectations. We have spent a lot of time talking about random variables and random experiments, and how you might end up with an extreme set of measurements by sheer chance. Consequently, on observing an extreme result, rather than trust the original hypothesis behind the experiment, you may come to believe that some other event or cause is behind what you have seen. For example, if you got 20 heads out of 100, you might decide to change your belief from the coin being fair to the coin being biased. Because you do not normally repeat your experiments, you might unfortunately have ended up with extreme measurements, in which case you will come to the wrong conclusion in your analysis and end up publishing a final conclusion about your hypothesis which is false. So, in the best of circumstances, even if you have done the best experiment you can, correctly defined a hypothesis and correctly carried out the experiment, statistics tells us that randomness in sampling could still give us an incorrect result. Of course, in statistics you spend a lot of effort trying to minimize the chances of getting an incorrect result, but you cannot rule out that the experimental result you saw is incorrect. On top of that, if you have not correctly designed the experiment, things will definitely be worse: if you have been sloppy in the way you defined the hypothesis, or sloppy in the way you controlled your experiment, things will end up being worse. And finally, if you are personally biased towards some outcome, that will make things far worse.
So, if you enter your research with a preference for one outcome over another, then chances are that the entire experimental procedure you follow is already biased towards giving you the preferred outcome, in which case what you finally publish is probably false. These are three practical concerns to keep in mind all the time, and you have got to systematically ask whether these three issues matter. For example, this issue of personal bias is something which journals now require you to declare. If you think you have a personal bias, say the results of your experiments are something from which you will personally profit, maybe you have a start-up company based around your research, and it is therefore in your interest to claim that your results turn out a certain way, then obviously you are biased towards a particular outcome, and these days journals require that you declare that you have a personal stake in the research you have done. So, the practical result of these three scenarios, where a perfect experiment does not give you a correct result, or you incorrectly design an experiment in the first place, or, even worse, you are personally biased towards what you think the result should be, is that most published results tend to be biased and, in fact, cannot be reproduced. This brings up the issue of bias, so let us look a little further into the causes of bias. These are terms you can look up in detail a little later on; in fact, if you go to Wikipedia you will find good links on several of these possible issues with experiment design, but I will give you one or two quick examples, which will serve the purpose for today. Let us start with what is typically the most common reason for bias, which is called confirmation bias: you are personally biased towards a particular outcome having seen some data. But there are other reasons why bias exists, so let me list them all first and then take them up one by one with specific examples. The reasons we have bias include: first, confirmation bias, a preference towards a particular result; second, that we have badly designed an experiment in the first place, or are manipulating the analysis of the experiment; third, that our results are biased because we have started off with some incorrect model, or, not knowing better, are performing an incorrect analysis of our experiments; fourth, that the data we have collected is being incorrectly depicted in terms of plots and graphs; and fifth, that the results have been generated by you alone and have not been tested independently by somebody else, somebody who has no bias or stake in the outcome of your results. These are typically five reasons; there can be more, but these five I will take up one by one in discussing problems with published research. So, let us start with confirmation bias. Confirmation bias has to do with the tendency to favour information that you think agrees with your beliefs or hypothesis. So, how does this happen?
The typical situation where confirmation bias happens is, for example, in that lab experiment where you are supposed to generate a bunch of data points and then fit a straight line. You realize that some of the measurements you have just carried out have so much error in them that they are not going to fall on the straight line, and before somebody else sees it you quietly pretend that you never got those data points in the first place. That is of course dishonest, but it is basically a situation where you are trying to erase a data point from your notebook simply because you already believe that the line should take a certain shape with a certain slope. In other words, you think you know the result, you already have a mental picture of how the straight line should look, and relative to that line you think your measurements have no chance of falling on it. So you quickly pretend that you never did that particular measurement, because if you include it in your analysis you will end up with a result which no longer strongly agrees with your original model. This, then, is confirmation bias, and it turns out it is not just the undergraduate lab student who is guilty of it; many famous scientists have been guilty of this. For example, go back in history and look at somebody like Gregor Mendel. Mendel is famous for the laws of heredity he came up with after studying how pea plants hybridize and grow; he came up with the three-to-one rule of how traits are transferred from parents to offspring. It turns out that Mendel is nowadays accused of confirmation bias, and, to put it mildly, if he tried to publish his data at this point in our history, he would probably be rejected by a journal editor. So, let us look at this a little more closely and see precisely what is problematic about how Mendel did and published his research, and why there is bias in it. What Mendel was talking about is how genetic characteristics propagate through different generations. Look at the cartoon on the right, which has three levels. On the top you are looking at the colour of a particular flower: the flower is found either with white petals or with red petals. The colour of the flower, let us say, is a function of the genes coding for colour, and because of the way genes are inherited, you are going to have two copies of the gene coding for colour present in a particular flower, one copy obtained from each parent. The whole idea is that each parent contributes a gene, and depending on the final combination of genes you have, you are going to get either a white flower or a red flower. Further, Mendel came up with the insight that if you had the gene coding for red colour, that gene would dominate, and so, given a combination of white and red genes, you would end up with the red colour developing in the petals. Red is claimed to be dominant given a combination of white and red. That is Mendel's hypothesis, and given that hypothesis he goes about analyzing pea plants. Mendel was basically a monk working in a monastery; the monastery had a huge garden and he had the ability to systematically grow pea plants generation after generation.
He studied something like 16,000 plants over successive generations, crossing them to see what kinds of new plants would develop: short leaves, long leaves, short stems, long stems and, of course, the size and colour of the petals. In this particular example, the first generation involves one parent, over here, with two white genes, which is why its petals are white, and the other parent with two red genes, which is why its petals turn out red. When you look at the offspring from this first cross, you can end up with four possibilities, shown in the second level. Because each offspring inherits one gene from each parent, and one parent had two whites while the other had two reds, all four offspring in this first generation of descendants will have a combination of white and red, and because red is dominant, the petals develop a red colour. Given white and red, the flower shows up red. So, in this first generation you seem to have lost the white colour in the offspring. But Mendel's insight was that if you now let this middle generation hybridize among themselves to form the third generation, then what happens? If you start with two parents from this generation, each with one white and one red gene, you end up with four scenarios: one where you take the red gene from one parent and the red gene from the other, so you have two reds and the colour shows up red; two scenarios along the diagonal, where you get one red from one parent and one white from the other, in which case the flowers still show up red; and finally the scenario where you pull out the white genes from both parents, so the flower ends up white. Mendel's insight is that, if genes are inherited according to this model, you will end up with a ratio of 3 to 1: three red-petalled plants for every one white-petalled plant. Of course, we now accept this theory; it is taught to biology students all the time as Mendel's laws of segregation and inheritance. But when it comes to data analysis and presentation of results, there is one huge problem. If you work with thousands of plants and ask what fraction of them have white petals, that fraction, the ratio of white to overall, should be 1 out of 4, because the ratio is 3 to 1. So, 25 percent of the flowers in this generation should be white. But if you do the actual experiment, collecting thousands of plants and looking at the proportion of white to total plants, the result will not turn out to be precisely 25 percent; it will be a value close to 25. Why is that? The reason we do not get precisely 25 percent is random sampling error, the same reason we saw that we would essentially never get exactly 50 heads out of 100 tosses of a fair coin. It is very unlikely that you will get exactly 50 heads out of 100 tosses; it is more likely that you will get 48, 49, 51, 52 and so on, as the short simulation below suggests.
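Here is a minimal sketch in Python (not from the original lecture; the sample sizes and seed are made up for illustration) of how far a finite sample naturally drifts from the ideal 50 heads or 25 percent white flowers:

```python
# A minimal sketch: simulate the fair-coin experiment and Mendel's 3:1 cross
# to see how far a finite sample drifts from the "ideal" values.
import numpy as np

rng = np.random.default_rng(0)

# 100 tosses of a fair coin: the count of heads is Binomial(100, 0.5)
heads = rng.binomial(n=100, p=0.5)
print("heads out of 100 tosses:", heads)        # rarely exactly 50

# 1000 second-generation pea plants: each is white with probability 1/4
n_plants = 1000
white = rng.binomial(n=n_plants, p=0.25)
print("fraction of white flowers:", white / n_plants)   # close to, not exactly, 0.25

# Repeat the flower count many times to see the natural spread around 0.25
fractions = rng.binomial(n=n_plants, p=0.25, size=10000) / n_plants
print("typical spread (standard deviation) around 0.25:", fractions.std())
```

Running this a few times shows that values near, but not exactly at, 25 percent are the normal outcome of a correct theory plus random sampling.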
You will see values in a range; getting the precise value of 50 out of 100 is unusual, and similarly getting precisely 25 percent of the flowers to be white is also unusual, even though the theory is correct. In other words, the problem with the way Mendel presented his data in his notebooks, and finally as he published it, is that he reported this precise value of 25 percent without acknowledging that the actual experimental data was a proportion merely close to 25 percent. Therefore, if somebody else now repeats his experiment, it is very, very unlikely that they will get precisely 25 percent as the proportion of white flowers to the overall number of flowers. The moral of the story is that Mendel came upon the right theory and ultimately hit upon the right insight, but when he published his results he did not acknowledge that there could be experimental error and randomness in the data he collected. He should have pointed out that the true trend is 25 percent, but that values near 25 percent are not to be considered unusual; in fact, they confirm the true ratio of 25 percent. This is an example of confirmation bias, because Mendel was biased towards an outcome which would prove his theory of 1 to 3, and therefore 25 percent as the proportion of white flowers to total flowers. Now, there are other reasons for bias, and a major one is some financial interest or prejudice towards a particular result. I already gave you a hint of how people who have start-up companies around the research they are doing end up publishing relatively biased results, because they are trying to showcase their data and their technology in the best possible light, without acknowledging that other explanations might exist for whatever it is they are studying. A good example of bias and prejudice in how research is done goes back to the issue of whether tobacco is addictive. In the US, for example, in the late 1900s, a lot of research was published claiming that tobacco was not truly addictive. Of course, we do not accept that now, and we insist that packs of cigarettes carry warnings about addiction and possible health effects. But back around the 1950s to 1980s, the tobacco companies did a lot of research into what would make a perfect cigarette and to what extent nicotine is addictive, and if you look at most of those publications, they give you the impression that nicotine and tobacco use are not addictive. This is obviously a case where the researchers were biased in terms of the outcome they wanted to project, and if you look closely, it turns out that the bias arises because the publishers or sponsors of this research were themselves the tobacco companies. The tobacco company was sponsoring researchers to do research which would end up showing that tobacco was not addictive. When independent researchers later repeated the same experiments, it quickly became clear that not only are tobacco and nicotine dangerous, but in some cases the nicotine levels in cigarettes were actually enhanced over natural levels to maintain addiction. In fact, the tobacco companies knew long ago precisely what levels of nicotine would cause addiction, and going through the old notebooks has revealed most of this hidden or unpublished research.
So, in other words, where there is money involved, and there was big money in selling tobacco products, it is very likely that the researcher is biased. The researcher becomes personally biased because of the need for money to carry out research, and if somebody is offering you large sums of money to come up with one particular outcome, it is possible that you will end up falling for that and doing bad science. Another scenario where you see lots of issues with the design of experiments, and with how the data or the analysis itself gets manipulated, is when the area of investigation is new and hot. If there is the prospect of a boom in a particular field, particularly, say, in medicine, where there is a prospect of making new drugs for a particular disease because of a breakthrough technology, then you will find a lot of people rushing into that area to do research. A lot of money is pumped in, and because a lot of money is pumped in, there is a lot of pressure to come up with results which seem to prove the possibilities of that particular area of research. For example, in India of late we have spent a lot of time trying to figure out whether we can replace petroleum with biodiesel, in particular biodiesel produced from sources like oil seeds, jatropha and so on. A lot of research and a lot of money has been sunk into this, and at the end of it all, it turns out that there is no prospect of making large amounts of diesel by squeezing oil out of seeds; in other words, we cannot fully replace petroleum with biodiesel. This could have been systematically analyzed before the investment was made in such research, but it was not done. I am not saying that biodiesel should not be made from seed oils; all I am saying is that biodiesel made from oil seeds will not fully replace petroleum as a fuel source, and this could have been established by systematic computational analysis. So, when a field is hot, because petroleum prices are rising and there is a need for alternative fuels, people rush in and publish without regard to the actual need or proper experiment design. Here is an example of manipulation of data; the precise domain the data comes from does not matter. A large number of data points have been collected, a linear model between y and x is being claimed, and the r-squared value for the fit is reported: you get a value of 0.56, which at first sight does not seem so great. If you remember my discussion on regression, what you want is an r-squared value closer to 1, so 0.56 does not look great. Something that several researchers end up doing, and it is not necessarily known to the rest of the world, is to remove some of the data from the data set and showcase this reduced data set, which promises a better fit to the line. If I throw out the middle part of my data set, the r-squared now gives me a value of 0.85, and 0.85 seems to claim a better linear model, which of course means that whatever hypothesis you are claiming between y and x now looks stronger because you have got a better fit. But naturally this is a very dishonest way to showcase your data; the little sketch below shows how easily it can be done.
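Here is a minimal sketch in Python (not from the original slide; the data and the exact r-squared values are made up) showing how quietly discarding the "inconvenient" middle of a data set inflates r-squared without improving the science:

```python
# A minimal sketch: selectively dropping scattered points inflates r-squared.
import numpy as np

rng = np.random.default_rng(1)

def r_squared(x, y):
    """r-squared of an ordinary least-squares straight-line fit."""
    slope, intercept = np.polyfit(x, y, 1)
    residuals = y - (slope * x + intercept)
    return 1 - residuals.var() / y.var()

# A weakly linear trend with a noisy middle section
x = np.linspace(0, 10, 200)
noise_scale = np.where((x > 3.3) & (x < 6.7), 3.0, 0.8)
y = 0.5 * x + rng.normal(scale=noise_scale, size=x.size)

print("r-squared, all data     :", round(r_squared(x, y), 2))

# "Clean up" the plot by silently discarding the scattered middle third
keep = (x < 3.3) | (x > 6.7)
print("r-squared, middle removed:", round(r_squared(x[keep], y[keep]), 2))
```

The fitted relationship has not become any more real; only the reported number has improved.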
So, the moral of this particular slide is that you really ought to showcase all the data you generate in your publication. You cannot be selective about which data points you keep or leave out when it comes to your plots. Again, there is a possibility that you are selectively biased towards a particular result, and in doing so you leave out data which is irrelevant, at least to your way of thinking. Another problem which hurts the interpretation of much published research is the use of an incorrect model or analysis. It is quite often the case that, for a particular phenomenon, you presume that the data needs to be fit by a model of a certain type, perhaps a polynomial of a certain order, and then ask what model parameters show up when you fit that model to your data. The problem with this line of thinking is that you should not randomly search for a curve which passes through your collection of data points. What is important is to ask which physical model or physical expression should describe the data. In other words, if your theory says that a straight-line model must apply to the data, then fit the straight line, and do not start hunting around for polynomial models as supposedly better fits to your data. In this particular cartoon I am showing, from left to right, four different attempts to fit a data set. Let me explain how this data set itself was created, and then you will realize how people tend to get fooled by what are claimed to be better fits to data. You take a sine curve, shown in green in each of these subplots. Take a perfect sine curve, pull out points from it, and add a little scatter to these points, so that they now fall near the sine curve, on either side of it. So you have a collection of points which approximate the sine curve. But if I now ask you to fit, instead of the sine curve, a polynomial to just the data points, ignoring the green curve for a moment, you can fit polynomials of different order: order 0, which is a horizontal line; order 1, a straight line with some nonzero slope; order 3, a cubic; and finally a polynomial of a much higher order, in this case order 9. So you now have four different models which you could fit to just the data points, and you can quickly see that as I increase the order of the polynomial, the red curves I get from my regression pass more and more closely through the points. At the extreme left I have a flat model which does not pass through most of the points, and as I increase the order of the polynomial from 0 to 1 to 3 to 9, more and more points fall closer and closer to the red line, and therefore I would claim a better and better fit of my model to my data; a small sketch of this effect follows below.
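Here is a minimal sketch in Python (not the lecture's actual figure; the sample size, noise level and seed are assumptions for illustration) of the same overfitting effect, where higher-order polynomials track the noisy points ever more closely while drifting away from the underlying sine curve:

```python
# A minimal sketch: noisy samples from a sine curve fit with polynomials of
# increasing order. High order hugs the data but misses the true curve.
import numpy as np

rng = np.random.default_rng(2)

x = np.linspace(0, 2 * np.pi, 12)                     # a dozen sample locations
y = np.sin(x) + rng.normal(scale=0.2, size=x.size)    # sine curve plus scatter

x_dense = np.linspace(0, 2 * np.pi, 500)              # where we judge the fit
truth = np.sin(x_dense)

for order in (0, 1, 3, 9):
    coeffs = np.polyfit(x, y, order)
    err_data = np.abs(np.polyval(coeffs, x) - y).mean()
    err_truth = np.abs(np.polyval(coeffs, x_dense) - truth).mean()
    print(f"order {order}: error at the data points = {err_data:.3f}, "
          f"error against the true sine curve = {err_truth:.3f}")

# Typically the first number keeps shrinking as the order goes up, while the
# second is smallest near the cubic -- which is the point of Occam's razor.
```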
So, the problem with this kind of analysis is that while the order-9 model on the extreme right seems to be the best model fitting the data, because it passes through nearly every point, it is actually the poorest approximation to the sine curve; remember, I started off with a sine curve from which I generated a few data points to test how well my regression models work. If you look at the cubic plot, the third from the left, you will realize that the cubic is a far better approximation of the green sinusoid than the higher-order polynomial. In other words, in many publications we are guilty of fitting a much higher-order polynomial than is necessary and then claiming that it fits our observations very well, when in reality a much simpler model would probably have been a better fit, despite not passing precisely through all the data points. This idea is known as Occam's razor; it goes back several hundred years to William of Ockham, who is thought to have said that, given competing models which might explain the same observations, you should choose the simplest model which does the job of explaining the results. In other words, do not unnecessarily over-complicate your model; there is merit to going with the simplest model. In my cartoon above, the simplest model which does the job is the cubic. So, go with the simplest model which explains the observations you have with your particular data set. A fourth problem with representation of data has to do with incorrect plotting or incorrect data representation. I am showing you here a couple of scenarios. In the first scenario you see a couple of bars, and hopefully you can make out that there is an error bar drawn on top of each bar: on top of the blue bar you might see a faint black error bar, and there is a similar error bar on top of the green bar. This is normally the kind of plot you create when you want to compare two scenarios: you are comparing, let us say, one control situation, the blue, with one new test condition, the green. A conclusion from this sort of experiment would be that the green condition is significantly different from the blue condition, and you would publish this kind of plot to showcase that the green scenario is totally different from the blue scenario. But the problem is that the moment you plot your data like this, you have not actually represented all your data. The true data which went into the making of these plots is shown on the right, and if you look at it carefully you will realize that the blue bar was constructed from only two data points, which was enough to give you an estimate of the mean and an error bar, and similarly the green bar was also constructed from just two data points with an error bar. In other words, one does not have much confidence in the precise values estimated during the course of this experiment. Plotting it with the bars on the left gives the impression that a lot of measurements have been taken and you are looking at an average of a number of measurements, whereas the truth is that only two measurements went into each bar, as seen with the data on the right; a short plotting sketch below contrasts the two representations.
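Here is a minimal sketch in Python with Matplotlib (not the lecture's actual figure; the two-point samples are invented for illustration) that draws the same tiny data set once as bars with error bars and once with the raw measurements overlaid, so the reader can see how few measurements actually exist:

```python
# A minimal sketch: bars with error bars versus the same summary with the
# raw measurements shown as individual points.
import numpy as np
import matplotlib.pyplot as plt

control = np.array([4.8, 5.6])        # only two measurements per condition
treatment = np.array([6.9, 7.9])

means = [control.mean(), treatment.mean()]
sds = [control.std(ddof=1), treatment.std(ddof=1)]

fig, (left, right) = plt.subplots(1, 2, figsize=(8, 3))

# Left: bars plus error bars hide how little data there is
left.bar(["control", "treatment"], means, yerr=sds, color=["tab:blue", "tab:green"])
left.set_title("bars only")

# Right: same mean and spread, but every raw measurement is visible as a dot
right.errorbar([0, 1], means, yerr=sds, fmt="_", color="black", capsize=6)
right.scatter(np.zeros(control.size), control, color="tab:blue")
right.scatter(np.ones(treatment.size), treatment, color="tab:green")
right.set_xticks([0, 1])
right.set_xticklabels(["control", "treatment"])
right.set_title("raw points + mean and spread")

plt.tight_layout()
plt.show()
```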
So, this is typically a problem with how data is constructed and presented. A similar situation is shown in the two subplots at the bottom. Again there is an attempt to compare two situations, the purple bars in the lower-left subplot, and again the intention is to say that the purple bar on the right is significantly different from the purple bar on the left. But the moment you showcase your data as bars with a certain height and an error range around the top, you are not actually conveying as much information as you could. If you look at the subplot on the lower right, it showcases all your individual measurements, which are the dots you see in the plot. So not only are you showing your individual measurements, and therefore conveying to the reader what spread of measurements one is likely to see, but you are also indicating what you think the average value is for each situation, along with an error bar around this average. What is superior about the representation on the lower right is that you are showing in one shot both the individual measurements and what you think the mean and variation in your data are; the purple bars on the left do not bother to do that. The idea behind showing the raw measurements is that the reader ought to have a chance to interpret the data on their own, and the representation on the right gives the reader that chance. If, on the other hand, you do not wish the reader to interpret this, and that is actually a bad thing, then you end up trying to mask or hide a lack of significance in your results by presenting your data as it is shown on the lower left. Finally, one of the issues which hurts much published research is that research is done in isolation. Experiments are done by whoever thinks up the experiment, in isolation, and it is very rare for critical experiments to be repeated by different investigators. I will come back to this a little later with a case study, but for now I will move on to pointing out that, in terms of testing how robust the conclusions from your data are, one of the things you really want to do is to cross-validate. Rather than get into an elaborate explanation of cross-validation, for a practical situation I will stick to a simple regression example. Given a collection of measurements, the blue points in this plot, normally one would use all of the measurements to fit a straight line, which you see as the red line on this plot. But you are invariably left with the question of how you know that your straight-line model is true. A slightly better way to fit models to data is, rather than using all your data in one shot, to take half of the measurements from your complete data set, chosen randomly, and fit a line through just this half. So you are using only half your data at a time to build the straight line, and once you have done that, you go back and ask how the remaining data points behave relative to the line you have just created; a rough sketch of this train-and-test idea follows below.
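Here is a minimal sketch in Python (not from the lecture; the synthetic line, noise level and number of repeats are assumptions) of the idea just described: fit a line on a random half of the points, check it on the other half, repeat, and average the fitted lines:

```python
# A minimal sketch of the train-and-test (cross-validation) idea for a
# straight-line fit.
import numpy as np

rng = np.random.default_rng(3)

# Synthetic measurements that roughly follow a straight line
x = np.linspace(0, 10, 40)
y = 2.0 * x + 1.0 + rng.normal(scale=1.5, size=x.size)

n_repeats = 200
slopes, intercepts, test_errors = [], [], []

for _ in range(n_repeats):
    train = rng.choice(x.size, size=x.size // 2, replace=False)
    test = np.setdiff1d(np.arange(x.size), train)

    slope, intercept = np.polyfit(x[train], y[train], 1)   # build the line on half
    test_errors.append(np.abs(y[test] - (slope * x[test] + intercept)).mean())

    slopes.append(slope)
    intercepts.append(intercept)

print("average slope over repeats     :", np.mean(slopes))
print("average intercept over repeats :", np.mean(intercepts))
print("average error on held-out half :", np.mean(test_errors))
# A line built on half the data should still describe the held-out half well;
# if it does not, the straight-line model itself is suspect.
```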
So, in other words, you have a training set of data to build a line, and then a separate test data set to test the line, and what should really happen is that the line you built using your training data continues to behave well with your test data. If it turns out that your test data does not fall near your line, you have probably got a bad line. Now, of course, there are many different ways in which you can construct a subset of data points, so one does this in random fashion several times, gets several possible lines from using only a limited subset of measurements at a time, and then takes an average of those lines. That average is probably a superior estimate of the true model which should fit your collection of data points. Cross-validation of your results is something that is very rarely done, because we are in a hurry to present every single measurement we make as something which conforms to the model we have built. Now, the hypothesis testing procedure which we defined before actually suggests several things when it comes to research findings. I am not going to get into the mathematical details, but I will simply highlight issues which can be reasoned out from a rigorous understanding of hypothesis testing. First, research findings are less likely to be true when the studies you carry out are small, in other words when you have very few measurements in your experiment. The fewer the measurements, the more doubtful your results and conclusions; to some extent that is common sense, but it can be rigorously validated computationally. Second, your results are also less likely to be true when the difference between the two situations you are trying to contrast by experiment is small. What is the point of this? The moment you have a model to test, you are going to have to ask how the model behaves, or what it predicts, in a certain circumstance. If you are going to take two scenarios and hope that the model clearly allows you to discriminate between them, the two situations must have quite significant differences in what is expected of them. If the two situations are themselves very similar, in other words if the effect size is small, there is a good chance that whatever you are claiming is false, because you are not able to distinguish between the two scenarios. A third issue which can be problematic in a hypothesis testing procedure is this: if you are out to test whether y is related to x, you have a good chance of figuring out the precise relationship between y and x only if there is at most one other variable z out there which might influence or confuse your interpretation. This goes back to the discussion of latent or hidden variables from my previous lecture. These hidden variables always run the risk of explaining whatever you are seeing, rather than y and x themselves turning out to be the causes of that particular phenomenon.
So, now what I claim is that if there are a very large number of hidden variables which you are unaware of, then what you are observing in your experiment is probably an outcome strongly influenced by all these hidden variables, as opposed to being influenced by the particular variable you are trying to control in your experiment. Therefore, the better controlled your experiment, with more variables under your control, the more likely your results are true. But in science, particularly for example in biology, where we do not fully understand the workings of the components of a cell, it is very unlikely that you have actually carefully controlled an experiment. You therefore run a large risk that what you are seeing is a conclusion influenced by some unknown hidden variable, and not by what you controlled in your own experiment design. The final problem relating to hypothesis testing goes back to the fact that when you have large flexibility in the way your experiment is done, too many things can change on you. When you have large flexibility in the outcomes possible with your particular experiments, or in the approaches you are using to analyze and measure the components of your experiment, then again it is quite likely that your research findings are not trustworthy. The point, very simply, is that if there is too much variation in everything that goes into your experiment, for example your reagents are of variable quality and your instruments are of variable precision, then the chances are that whatever you are trying to conclude is probably incorrect. So, let me go to this Amgen study, which is a very intriguing example of how published research can end up being incorrect. Amgen is a pre-eminent biotechnology company, a multinational based out of California in the United States. It makes a very large number of biopharmaceutical drugs, with a particular interest in cancer drugs, and because of this it has been following research coming out of the major research labs, in the United States in particular but also in Europe. As an attempt to identify new drugs, a few years back Amgen decided to do a study looking into the results coming out of some top-notch research labs in the United States. I am talking of labs in places like MIT, Stanford and Caltech, the famous research universities. Here is a situation where lots of novelties in cancer science were being proposed by labs in these universities, lots of famous landmark papers being published, particularly in oncology and hematology, and Amgen decided there was enough reason to investigate whether the new research coming out of these labs was reliable and likely to give a hint as to where new drugs might come from. They therefore took 53 famous landmark papers published recently and decided to repeat the research in their own labs, and they ended up with a staggering result: out of the 53 papers whose experiments were repeated, only 6 could be replicated.
And we are not talking of labs with no reputation for good research; these are top-notch universities and top-notch research, places whose research has gone on to win Nobel prizes in medicine, and yet here is a company saying that it was able to replicate only 6 results out of 53 when repeating the findings from these university research labs. This, of course, is problematic, and if this is the case with some of the best research labs, you wonder what the rate of replication is with other research labs. So, why is this so? When the final interpretation of the study is done, why do the 6 cases work and why do the remaining 47 cases turn out to be not reproducible? In the 6 cases, it turns out that the investigators behind those particular papers had paid close attention to a few factors in experiment design, and had paid attention to how to systematically control for an experimental hypothesis. In other words, if the objective of the research was to see whether a variable had a particular effect, they made sure they controlled for it by also looking at scenarios where you do not expect an effect. They paid attention to controls, whereas others had not. They paid attention to reagents: in biology and chemistry it is quite often the case that the reagents you are using are not reproducible in terms of quality; it is known that one batch of a reagent might have a slightly different activity compared to a different batch. In this case the researchers had made sure to repeat their experiments using different batches of reagents and had worked out what the average outcome would be; they had not done the experiments with only one quality of reagent, which later on would not be reproducible simply because that grade or quality of reagent is no longer available for purchase. So they took care to make sure that the quality of reagents was not influencing the outcome. Third, they correctly acknowledged and accounted for the possibility of bias in their investigations; as I told you earlier, nowadays most journals require that you acknowledge up front any financial benefit you might obtain in the context of how you are carrying out and designing your research. And finally, the other feature of the correct studies was that they had used the complete data set in coming to conclusions: they had not been selective in throwing out data points, and they had not been selective in using a subset of data points to fit a conclusion they had a preference for. So, it turns out that the issues I have raised so far affect not just the common researcher; they affect research at the very top of the research pyramid. As we now look ahead at good practices in the carrying out of experiments, what makes for a good experiment design? I will give you a set of tips which are generally recommended. The first point is to worry about whether the model makes good predictions, because remember, you really want to set up experiments to test a model, and that depends on what predictions the model comes up with.
If it turns out that the model does not predict a distinguishable range of outcomes, you are probably not giving yourself a chance to do a good experiment which can discriminate between different scenarios. Next, you have got to ask whether your model is being developed to the appropriate degree. Go back to that business about the polynomial model: do you really need a complicated model to explain what might turn out to be a simple phenomenon? Does your model have the appropriate complexity? Are you over-complicating things, or, at the other extreme, over-simplifying your model? Is the model appropriate given the kind of data you hope to get, and in turn, is the data that you hope to get, with all the error in it, likely to differentiate between different types of models? Do you think you are going to be in a position to predict and measure the effects of a particular cause? Are you carrying out the appropriate negative and positive controls in your experiment? If you claim to have a positive effect, how do you demonstrate it? Are there independent ways of demonstrating the positive effect, and conversely, if you remove the particular variable which causes the positive effect, do you indeed end up with a negative result? Do you have the appropriate negative controls? Are your experiments or measurements available and repeatable so that the model assumptions can be checked? In other words, there is no point coming up with a model for which you cannot do an obvious experiment to test it. Does your model still hold if you change the conditions under which you carry out your experiments, or if you marginally change the assumptions behind the model? In other words, how sensitive is the model to the basic assumptions that went into its creation in the first place? My insight here is that a simple, limited model which you can actually test is in many ways preferable to a very complicated model which you are unable to test because it does not have good predictive power. You basically do not want to spend a lot of your energy, and for that matter money and time, on a model which cannot be experimented with, or which you cannot put yourself in a position to verify. It is essentially a waste of time speculating on models which you cannot evaluate in any fashion whatsoever. Now, another issue which some of you have raised is what to do with negative data. Negative data relates to the situation where you do an experiment in the hope of confirming some hypothesis, and it turns out that the data does not support that hypothesis. That is what we call negative data, and quite often, in fact, most of the data one generates in research probably falls into the category of negative data, because it does not confirm something strongly enough for you to claim a new theory beyond a shadow of doubt. If you have got negative data, the tendency is not to publish it, and in fact, unfortunately, journals prefer new conclusions and therefore positive results with positive data. Journals would like to publish that you have discovered a new cause for a disease, or that you have invented a new, more efficient way of communication. So, there is a large body of negative data that one collects which never finds a place for publication.
But the fact of the matter is that a lot of good science happens as a consequence of collecting negative data, because it is the act of collecting negative data which ultimately allows you to refine your idea of what the true underlying model or hypothesis is. So, one of the recommendations now is that in a publication you should make an effort to include all your measurements, including the negative data, because when you publish and try to convey a story to somebody else, it is important not only to show the glamorous results arising from positive data; you should also show to what extent you got negative data. It is the combination of the amounts of negative and positive data which gives somebody else a true reflection of the scope to which a particular theory might explain an observation. I will wrap up by listing the guidelines that several journals now prescribe when it comes to publishing your data. Most of this is focused on the presentation of statistical data, but it is true in general for other domains as well. Where you are in doubt about the design of an experiment or the presentation of data, journals require that you actually consult a statistician; in fact, in several journals, as articles come in for review, one of the things an editor will ask of a reviewer is whether a particular manuscript needs to be reviewed not just for the theory or the science behind it, but also whether the data being presented needs to be looked at by a statistician. Knowing this, it is good practice for you to consult a statistician as you go about putting together a manuscript. You need to define and justify the significance level to which you hope to make a claim about a new observation. If you are claiming something new, it is your job to show that the something new is indeed significant and not a random observation that you might have seen by sheer chance; there is a significance level that you must define and justify, and that is usually a function of the domain you are in. Where you have used statistics, you need to clearly identify the methods used and refer to the methodology using either textbooks or review papers, basically so that you do not get tangled up in how you have done your actual calculations. Where there are multiple events happening, or multiple variables involved in your analysis, demonstrate that you have controlled as best you can for the influence of these other variables; show that if you are going to do multiple comparisons, you have exerted control in the design of your experiments. Report variability using a standard deviation, report your measurements to the appropriate precision, and report variability and not just the average value of an observation. Indicate that variability, if necessary, in terms of error bars on your data, using the confidence intervals we saw during hypothesis testing. It is also important to report a precise p-value; the p-value is that measure in statistics which gives the probability, assuming your null hypothesis is true, of seeing a result at least as extreme as the one you observed, and it therefore tells the reader how easily your conclusion could have arisen by chance; a small sketch of reporting these quantities follows below.
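Here is a minimal sketch in Python with SciPy (not from the lecture; the measurements are invented and the degrees-of-freedom formula is a simple approximation) of reporting the quantities the guidelines ask for: means, standard deviations, a confidence interval and a precise p-value for a two-group comparison:

```python
# A minimal sketch: report mean, standard deviation, confidence interval and
# an exact p-value for a simple two-group comparison.
import numpy as np
from scipy import stats

control = np.array([5.1, 4.8, 5.6, 5.0, 5.3])
treatment = np.array([6.2, 5.9, 6.8, 6.1, 6.5])

print("control  : mean = %.2f, sd = %.2f" % (control.mean(), control.std(ddof=1)))
print("treatment: mean = %.2f, sd = %.2f" % (treatment.mean(), treatment.std(ddof=1)))

# Welch's t-test: report the exact p-value, not just "p < 0.05"
t_stat, p_value = stats.ttest_ind(treatment, control, equal_var=False)
print("t = %.2f, p = %.4f" % (t_stat, p_value))

# Approximate 95 percent confidence interval for the difference in means
diff = treatment.mean() - control.mean()
se = np.sqrt(treatment.var(ddof=1) / treatment.size + control.var(ddof=1) / control.size)
dof = treatment.size + control.size - 2        # simple approximation of the degrees of freedom
half_width = stats.t.ppf(0.975, dof) * se
print("difference = %.2f, 95%% CI = (%.2f, %.2f)" % (diff, diff - half_width, diff + half_width))
```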
So, the p-value is important to convey to the reader, because if you do not provide it, the reader is left with just two possibilities in terms of interpreting the data: either your publication is absolutely true or it is absolutely false. But if you provide an estimate of the p-value, the reader now has the possibility of personally interpreting the significance of your data, and therefore the ability to agree or disagree with your interpretation. So, it is now required practice to report a p-value. Something else which has been brought out in several sessions is that when you present your data, make sure that the number of digits you present matches the scientific relevance of whatever it is you are measuring. Just because your calculator or computer is capable of generating numbers with 10 or 15 decimal places does not mean that all those digits are meaningful. There must be a discussion of how many significant digits your measurements have, which might come about based on the least count of whatever device you are using for your measurements; use that insight to restrict your analysis, your measurements and your conclusions to just the right number of significant digits. Finally, interpret your results, your final conclusions regarding the hypothesis, given the confidence interval bounds and the p-value. In other words, it is your duty not just to report an average value, a standard deviation and a p-value; it is also your duty to write down what you think the interpretation of all of these numbers is. So, I provide here a collection of articles which will be of use to you as you try to understand some of these issues further. I will start with the first article, which has been a very provocative article claiming that most published research findings are false. At its heart is a simple statistical argument, and I will simplify it down to this. Suppose there is a certain probability of any one research paper being correct, and, simplifying tremendously, take 20 people publishing 20 papers on some theme. Let us say each paper has a 90 percent probability of being correct, so there is a 10 percent probability that each paper is false. When I now look at the collection of 20 papers, what is the likelihood that there is a paper out there which is false? That is actually easy for you to work out, and I leave it to you to reason out, but it is easy to show that there is a very high probability that at least one of those 20 papers is false, and I say this without even looking at the science in the papers. So, if I have an estimate of the average proportion of bad papers, that is enough for me to predict, in a collection of papers, how many bad papers might be published in a particular issue of a journal, and this is not even taking into account the possibility of bad experiment design or actual bias in how the experimenters went about their work. A short worked version of the at-least-one-false calculation follows below.
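Here is a minimal worked version in Python of the simple argument above (assuming, as in the lecture, 20 independent papers each correct with probability 0.9):

```python
# A minimal sketch: if each of 20 papers is independently correct with
# probability 0.9, how likely is it that at least one of them is false?
p_correct = 0.9
n_papers = 20

p_all_correct = p_correct ** n_papers            # every single paper is correct
p_at_least_one_false = 1 - p_all_correct
expected_false = n_papers * (1 - p_correct)      # expected number of false papers

print("P(all 20 papers correct)       = %.3f" % p_all_correct)         # about 0.12
print("P(at least one paper is false) = %.3f" % p_at_least_one_false)  # about 0.88
print("expected number of false papers = %.1f" % expected_false)       # 2.0
```

So even with every paper individually 90 percent reliable, the chance that a collection of 20 contains at least one false result is close to 90 percent.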
So, this is an article that is worth going through; it is an online publication in the open-access journal PLOS, and you not only want to go through the article, you also want to go through the discussion that has followed it. This has obviously been a provocative article, there have been many follow-up articles, and you will gain a lot of insight into why people are concerned about how their experimental designs could turn out to be false; that has been the value of this particular journal article ever since it was published in 2005. The next link is about how to display data badly. Go to Google and look this up; you will end up with many hits, and if you look at the top few you will quickly realize in how many different ways you could end up badly displaying your results. You have had several talks on presentation of data, which normally focus on how to display your data well, but it is also useful to figure out how to display your data badly; you can approach the topic from the reverse direction and see in what typical ways data is presented badly, which are obviously things you want to avoid. The third entry here is a good book which provides insight into how to go about creating appropriate and accurate presentations of your data when it comes to conveying the particular scientific concept you wish to convey. And there are several cartoons you can find at that final website; in particular I am giving you a link to one cartoon. Visit that website and you will find a lot of interesting cartoons which give you insight into how statistics can be abused when it comes to drawing conclusions. So, with that I will end the discussion of how published research tends to be typically biased and false. Before I address some of your questions, I have one final slide where I wish to introduce to you yet another MHRD programme. In addition to the various Talk to a Teacher and Spoken Tutorial kinds of activities, and the workshops that you are currently doing, MHRD has also sponsored a programme on the creation of virtual labs, particularly for engineering students in the various AICTE-recognized engineering colleges around the country. The basic idea is this: several of the engineering labs in some of these colleges have relatively poor infrastructure, and so the question has come up whether we can come up with simulators and remote-triggered hardware which would allow students at remote locations to access content; these are what are called virtual labs. Professor Kandan has probably already introduced you to a particular virtual lab called the single board heater system. But in addition to that, there are now several experiments which have been created in various engineering disciplines and also in computer science, and these are available for use now. You can visit the website vlab.co.in, and for those engineering colleges out there acting as centres, and for the engineering faculty, it would be very useful to us if you visit these sites, go through the experiments and start providing us feedback. We also provide a forum where some of these things can be discussed and where your feedback can be relayed back to us. The idea is that, as with the Scilab activities, one of the things we want to do is to start including more and more remote centres, both in terms of creating new simulator experiments,
but also in helping us put together more remote-triggered hardware experiments. This is planned to be an expanded activity, where we take a few simulators that we have developed and, patterned on these simulators, we look to remote centres to help us create more content. Essentially, we are looking for students and faculty at remote centres to come up with ideas and concepts which are worthy of being simulated, and we are looking for developers who can help us out with Scilab, Python and other open-source programming languages for the creation of these simulators. So, to learn more about these, visit these websites, get a feel for the kinds of activities that have already been simulated and for the scope of a particular experiment, and then get in touch with me, either by email at this particular email ID or on this discussion forum. As and when you have specific ideas and you think you can coordinate with us, we will get back in touch with you and help you develop new simulators of your own. The idea, finally, is that MHRD will request or require AICTE to have its engineering students take these experiments as substitutes for the actual hardware experiments that you would have as part of a curriculum. In other words, we are looking to develop a set of experiments which would match the AICTE syllabus of your colleges. So, go through the content available at this site and get back in touch with us with any comments or feedback on the experiments that we have developed. This is Kakinada Institute of Engineering and Technology. Is there a question? My name is Rukiran, I am from GITAM University, Visakhapatnam. My question is that there are two kinds of journals available nowadays, the paid journals and the unpaid journals. If a paper has already been published in an unpaid or a paid journal, can we consider that as a source for the completion of the PhD? I missed the last part of your question. So, you are saying that you have an article in a paid journal and you are asking whether that can be used to complete a PhD. I do not understand the connection between the paper being published in a paid journal and your concern about whether it can be used for a PhD or not, so could you repeat and clarify your question? Yes, that is correct, sir. Because the data could be different, right? Even the paid journals could accept data which is incorrect. So, again, let me try and clarify this. Are you saying that you have some experimental results which you are trying to publish in a journal, and you happen to have chosen a paid journal, and you are worrying whether it will be accepted? Yes, that is correct. Yes, a paid journal will likely accept whatever it is you wish to publish with them. Is it any less likely to be reliable? There is no good study that I know of which says that publications from a paid journal are less reliable than publications from an open source journal, so at this point I would refrain from saying that one is better than the other. There is an obvious advantage to publishing in an open source journal; actually, there are two. One is, of course, that you are not paying. But second, there is a much larger community of people trying to publish in an open source journal, and one would think that it is therefore harder to get a publication in some open source journals than in a paid journal.
Therefore, an article that you publish in an open source journal, particularly one of the better ones, is likely to be more prestigious over time.

Angadi Institute of Technology, Belgaum. I have one query. You said we can use virtual labs for conducting an experiment, maybe for an FBE today. The question I am going to ask is this: what is the validity of the results that we obtain from these virtual labs, and can these results be used for publishing a paper, by comparing them with other software?

Okay, so I will try and repeat that question. The question had to do with the virtual labs that I just briefly described in my last slide, and it is this: if the virtual labs provide a simulator for the generation of some results, could these results in turn be used to ultimately publish research-type information? The answer is probably no, because what is being conveyed and clarified in these virtual labs are the very simple basic concepts that most engineering students need. At this time we are not looking at very sophisticated simulators which you could then use for research purposes as well. We are looking at simulating very simple, elementary, textbook-level concepts in engineering; but because these are animations or interactive simulators, they give a student a better chance of understanding these concepts than simply studying them from a textbook. So, are you likely to get good quality output from such a simulator that you can take into research? At this stage I would say no.

How about using some other simulators, maybe something like ANSYS, and comparing the ANSYS results with another statistical method? Can those kinds of articles be published?

So, the follow-up question is about whether results derived from more sophisticated packages like ANSYS can be used in a comparison with other statistical packages. I am not sure why you are comparing results from ANSYS to a statistical package, but the point is that ANSYS is an acknowledged package in its domain, and when it comes down to the scientific question that you are asking, I think you should be paying more attention to the scientific question than to the software package. For that particular domain, ANSYS is considered one of the top packages, so results coming out of it, assuming they correctly address some hypothesis, can be considered worthy of publication. So, thank you, Angadi.

So, this is Vijay T. I. Mumbai. Sir, my question is this. There are two approaches: one is the full factorial design of experiments and the other is the fractional factorial design of experiments. Let us say the number of factors and levels is large, say four factors at three levels, which means 3^4 = 81 treatments need to be done; then many people ask for fractional factorial experiments like the L9 array or the L27 array. Are the results obtained from the fractional factorial experiments considered to be equally useful or equally reliable compared to the full factorial experiment?

So, you are asking about factorial experiment design and whether a factorial design of one type is better than a factorial design of another type.
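To make the contrast concrete before answering, here is a minimal sketch in Python of the difference in experimental effort between the two designs. It is illustrative only and not from the lecture: the factor names are hypothetical, the full factorial simply enumerates all 3^4 = 81 treatments, and the nine rows shown are the standard Taguchi L9 three-level orthogonal array.

    # Minimal sketch: full factorial vs. a fractional (Taguchi L9) design
    # for four hypothetical factors at three levels each.
    from itertools import product

    factors = ["A", "B", "C", "D"]   # hypothetical factor names
    levels = [0, 1, 2]               # three levels per factor

    # Full factorial: every combination of levels, 3**4 = 81 treatments
    full_design = list(product(levels, repeat=len(factors)))
    print("full factorial runs:", len(full_design))   # 81

    # Fractional design: the standard Taguchi L9 orthogonal array, 9 runs,
    # arranged so that every pair of columns contains each level pairing once
    L9 = [
        (0, 0, 0, 0), (0, 1, 1, 1), (0, 2, 2, 2),
        (1, 0, 1, 2), (1, 1, 2, 0), (1, 2, 0, 1),
        (2, 0, 2, 1), (2, 1, 0, 2), (2, 2, 1, 0),
    ]
    print("fractional (L9) runs:", len(L9))           # 9

The fraction buys a nine-fold reduction in experimental effort, but it can only resolve main effects cleanly; whether that trade-off is acceptable is exactly the domain specific question taken up in the answer that follows.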
I have not introduced the concept of a factorial design to the larger audience, but I will address the question anyway. Whether one design approach, and therefore the conclusions from that design, is superior to another actually comes down to the phenomenon that you are studying. Since I do not know the context in which you are setting out to do a factorial design in the first place, you have got to look at the non-linearity in your models, and therefore at the extent to which you really need to account for the multiple factor levels at which you are going to practice the factorial design; that is the primary concern when it comes to doing a multi-level factorial design. So, really it comes down to a domain specific problem, and I do not think there is any general recommendation one can make that one statistical approach is better than another. At the heart of it, you have got to set yourself up with the ability to predict at the end of the day, compare your two methodologies, and then ask whether one is superior to the other; and one of the things we learn in statistics, as you do the advanced theory, is that there is usually no way to predict that one approach will be superior to another. So, in short, my answer would be that if you think you have two alternate ways of doing something, you have got to try out both, and then, since it is factorial design we are talking about, it comes down to your tolerance for generating additional experimental results, because fractional factorial design is about minimizing the number of experiments you do to come to a particular final conclusion, let us say about the optimal operation of a particular process. Not knowing the specifics of your particular problem, I cannot address it in any further detail. If you have a further question, I will take that offline; you can follow up on Moodle.

Okay, this is SGSITS, Indore. I am Saurabh Maru from the Department of Pharmacy. I have a question about reporting negative results. I have come across many journals, but I did not see any journal that publishes only negative results; they at least need some positive results along with which we can publish some negative results. So, is there any journal out there that publishes this sort of negative results only?

No, there is no journal that I know of which currently publishes negative results alone, and that is for a commercial reason: journals obviously try to make profits, and they make profits by publishing research articles that promise positive results, which by and large seem to be of more interest to the wider audience. The net result is a culture where we do not have a good forum for the publication of negative results. So, the only way you can put out your negative results is by writing technical reports of your own and publishing them yourself, on your own website.

So, has it been felt earlier also that we should start some journal which can report negative results, or should it always be in reports?

So, the question is about whether one should have a journal which publishes negative results, and my perspective is that it is not a profitable activity for anybody who wants to come up with a journal which publishes only negative results.
So, that is unfortunate, which is why we default back to publishing on our own as technical reports within institutes. Some of the bigger institutes will end up having an archive of technical reports generated from research within the institute. That is a good culture, where all the data generated ends up being stored; otherwise you end up having to go through theses and dissertations to try to access such negative results.

Can you give me an idea of which institutes in India are doing this type of archiving of negative results or reports?

I am not aware of an Indian university doing this.

Malaraddy College of Engineering. Before starting a PhD program, is it better to implement an existing technique and generate our own results, which can then be extended further, or can we directly take the results from existing work and compare my results with those?

So, the question was whether you could work with data from the literature and use that in your research, as opposed to collecting your own experimental results and then interpreting something from them. There is always a preference for doing an experiment on your own, because you are aware of all the design features of that experiment and you have been able to control all the possible issues which might mess up your experimental design. In other words, you know the mistakes you have committed; there is a good chance that you do not know the mistakes that others have committed in carrying out their experiments. So, where possible, and this comes down to whether your facilities and infrastructure permit it, it is ideal that you carry out your own experiments rather than go to the literature and pull out results. Of course, if it turns out that you cannot do your own experiments, then you are limited to working with literature results. So, thank you for that question.