I cannot really justify that kind of yes-or-no cutoff based on any good methodological resource. Authors often cite Nunnally's book on psychometric theory for the rule of thumb of 0.7 for Cronbach's alpha, which is coefficient alpha, a reliability statistic. The problem is that he doesn't make that kind of claim in his book. First, reliability is something that must be interpreted. If your measures are 80% reliable, is that enough? You have to say what it means. What would you do if they were 70% reliable? What if they were 95% reliable? It is not really a yes-or-no question. It is a matter of understanding what reliability means for your results and then explaining that to your readers. So, before we talk about the actual statistics, it is important to understand what kind of assumptions the reliability statistics are based on, and what the principle of assessing reliability is. Take the bathroom scale example: you weigh the same person again with the same scale, and if you get the same result, your measure is reliable. When we measure people or organizations through surveys, things are a bit more complicated.
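Coefficient alpha itself is simple to compute from raw item responses. A minimal sketch in Python; the function name `cronbach_alpha` and the toy data are my own illustration, not taken from any particular package:

```python
import numpy as np

def cronbach_alpha(items):
    """Coefficient (Cronbach's) alpha for an (n_respondents, k_items) array.

    alpha = k/(k-1) * (1 - sum of item variances / variance of total score)
    """
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    item_variances = items.var(axis=0, ddof=1)      # per-item sample variance
    total_variance = items.sum(axis=1).var(ddof=1)  # variance of the sum score
    return (k / (k - 1)) * (1 - item_variances.sum() / total_variance)

# Three respondents answering two items that agree perfectly:
print(cronbach_alpha([[1, 1], [2, 2], [3, 3]]))  # 1.0
```

Note that the number the function returns still has to be interpreted against your research question; the 0.7 rule of thumb is not a substitute for that interpretation.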
The reason is that if we ask a person whether they like, for example, the United Nations, and then we ask them again whether they like the United Nations, the second answer is influenced by the previous answer. If we ask a person the same question over and over, they will give us the same answer simply because that's how they answered the last time. A bathroom scale doesn't remember what the previous measurement was; people do, and that's a problem. Classical test theory addresses this with the concept of parallel tests. A parallel test is a hypothetical scenario where we would measure the same person again without that person having any recollection of the previous measurement. For example, if we ask Mr. Brown whether he likes the United Nations, then to make a second asking of the same question a truly independent test of the same attribute, we would have to brainwash Mr. Brown in between the two questions. This is of course a counterfactual argument, because we cannot brainwash our subjects. Our subjects will know what they answered the last time, so how they answer the next question will be influenced by how they answered the first one. We simply cannot ask the same question over and over. There are two workarounds for the problem that we cannot really run parallel tests, that is, test the same attribute of the same person on the same occasion without the person having any recollection of being tested before. The first is to do actual replications and assume that they are parallel. That can work if we have a time delay. For example, if we ask a person now whether they like the United Nations and ask the same question a week later, they may no longer remember their original answer, in which case we could argue that the repeated measures mimic the parallel tests scenario.
The other workaround is to use two distinct measures: we measure the same attribute in a different way and assume that the two different ways of measuring the same thing are parallel. Instead of asking the person whether he likes the United Nations, we could ask him whether he thinks the United Nations is the best thing that has ever happened to mankind, for example. We measure the same thing again, but slightly differently. That way we can argue that the second measurement is not influenced by the first as much as it would be if we just repeated the same question over and over. The first approach, repeating the exact same measurement with a time delay, is called test-retest reliability. The idea is that if the attribute we are measuring is relatively stable over time, then the only reason a person tests differently on a different occasion is unreliability, because the trait itself is stable. We also have to assume that the errors are independent, which is justified by the time delay: you don't remember what you answered the last time. The delay has to be appropriate, though. If we weigh a child twice within a matter of seconds, the true weight does not change between the measurements. But we cannot argue test-retest reliability with, for example, a one-year delay: we can't weigh a child at five years and again at six years and then claim that the difference between the two measurements is evidence of unreliability, because we cannot assume the trait is stable over such a long period. So you have to consider how quickly the trait being measured changes over time, and how quickly people reset by forgetting that they were tested and how exactly they answered the question the first time.
So that's test-retest reliability. Let's look at an example of test-retest reliability from the Yli-Renko paper. They say that they asked slightly different questions again, with a two-year delay, on the key constructs. The study was about small companies, and for a two-year delay to be valid, you would have to assume that nothing changes in small companies in two years. That is of course not a valid assumption, so we can't assume here that the trait doesn't change, and this would not be a valid test-retest estimate. It would be valid if you surveyed a business organization with, say, a two-week or one-month delay; then you could reasonably assume that there are no major changes. With a two-year delay like in this paper, it is not a very good test-retest reliability estimate. So test-retest means measuring the same thing again with a time delay that is appropriate for your measure and the trait being measured: long enough that people reset between measurements, but short enough that the trait doesn't change substantially. This approach is not very commonly used, because doing two rounds of a survey study is more expensive than doing just one. More commonly we use the other approach, distinct tests. The reason for having multiple survey questions that look like they measure the same thing is that we actually intend them to be distinct tests. That is the most common reason for using multiple survey questions to measure the same thing. For example, we could ask a person to rate whether their company is innovative, whether the company is a technological leader in its industry, and whether it is the first to bring new product concepts to market.
We could argue that these are distinct questions: you don't answer the second question the same way as the first simply because you remember your previous answer, because these really are different questions, yet they all measure the same trait. That is the argument we have to make. So the idea of distinct tests is that we generate tests that are not the same, tests that are sufficiently different, while we can still argue that they all measure the same thing. How we use the data from these multiple distinct tests produces different ways of assessing reliability: the internal consistency method, the alternative forms method, and the split-half method. Understanding exactly what each of these does is not important. It is important to understand the principle, and then to understand a couple of statistics that you can calculate from the data and how to interpret them. The really important part is that the tests really have to be distinct. If you are just asking the same question over and over with slightly different wording, for example "our firm is very innovative", "our company is very innovative", and "our business organization is very innovative", these are not distinct tests; it is just the same question reworded. This is something you see very commonly as a reviewer: authors write questions that are effectively the same without paying much attention to their distinctiveness. That is a big problem that I see in management research.
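Of the methods just listed, split-half is the easiest to illustrate: split the items into two halves, correlate the half scores, and step the correlation up with the Spearman-Brown formula, because each half is only half as long as the full scale. A sketch; the odd/even split and the function name are my own choices, not a standard from any particular package:

```python
import numpy as np

def split_half_reliability(items):
    """Split-half reliability with the Spearman-Brown correction.

    items: (n_respondents, k_items) array; items are split into
    odd- and even-numbered halves.
    """
    items = np.asarray(items, dtype=float)
    half1 = items[:, 0::2].sum(axis=1)  # sum score over odd-numbered items
    half2 = items[:, 1::2].sum(axis=1)  # sum score over even-numbered items
    r = np.corrcoef(half1, half2)[0, 1]
    return 2 * r / (1 + r)  # Spearman-Brown step-up to full scale length

# Two items that agree perfectly give a split-half reliability of 1.0:
print(split_half_reliability([[1, 1], [2, 2], [3, 3]]))  # 1.0
```

Note that the statistic is only meaningful under the assumption argued above: the items are distinct tests of the same trait, not the same question reworded.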