Hi, this is Dr. Don. I want to go over the first quiz in my stat lab with you and hopefully give you some ideas and information that will help you on the midterm.

This first question asks whether the data set is a population or a sample. What you need to do with problems like this is not read too much into what they're asking. Read the statement and don't add or subtract anything. The statement is "the age of each resident in an apartment building." The answer is that it's a population, because it's a collection of ages for all people in the apartment building. Let's break that down. We know it's not a sample because the statement doesn't say we're taking a sample, or that we've got 25 of the residents in the building. It says each resident, so it cannot be a sample. That leaves two choices. We've got B, population because it's a subset. Well, that's a key word right there: a population is never a subset. And we're only talking about an apartment building; we're not talking about it as one of the buildings in the city. So that rules out option B, which gives us C: it's a collection of ages of all people in the apartment building.

The second question is again about populations and samples. It says a polling organization contacts 1357 teenagers who are 13 to 17 years of age and live in the United States, and asks whether or not they've attended a concert this past year. What is the population? Well, the first option says teenagers who are 13 to 17, live in the United States, and have attended a concert. We know that's not correct because that was the question being asked: have they or have they not attended a concert? The second one is wrong for the same reason; it's got the question in there, "have attended a concert." The last one, teenagers who are 13 to 17, is not specific enough, because the stem says "and live in the United States." So the population is clearly teenagers who are 13 to 17 and live in the United States.
What is the sample? Well, the sample is the 1357 teenagers contacted who are 13 to 17 and live in the United States. Out of all the teenagers in the country, they sampled 1357, so that is the answer there. It's pretty straightforward: they contacted 1357.

I'm going to reload that one and we're going to look at a similar question. A polling organization contacts over 2000 adults who are 20 to 90 years of age and live in Europe, and asks whether or not they voted in the last federal election. Again, that's the question, the last part of the stem: did they or did they not vote? So the population is the adults 20 to 90 who live in Europe, and we find that one down here. The sample is the 2218 adults contacted who are 20 to 90 and live in Europe. So I hope that helps.

The next question asks whether something is a statistic or a parameter. I like to use a mnemonic device: population is to parameter as sample is to statistic. So let's look. A sample of employees is selected, and it is found that 45% own a computer. It says a sample, therefore that has to be a statistic. Let's look at another. Here it says a sample of students is selected and found that 45%. If it's a sample and they've got a quantification there, that's got to be a statistic. Let me look at another one. A sample of students, and found 55%: another statistic. In a study of all seniors at a college, it is found that 50% own a television. "All" implies population, and population says it's got to be a parameter. So I hope that helps: population, parameter; sample, statistic.

This next problem is about samples and how we make inferences about the targeted population based on what we find in the sample. Here we're given a sample from a targeted population that shows that professional basketball players are taller than people who are not professional basketball players. So from our sample we could infer that professional basketball players are taller than people who are not, since that's what the sample showed.
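If it helps to see the parameter-versus-statistic distinction in code, here is a minimal Python sketch. The population of 1,000 seniors, the 50% ownership rate, and the sample size of 100 are made-up numbers for illustration, not values from the quiz.

```python
import random

# Hypothetical population: 1 = owns a television, 0 = does not.
# These values are invented for illustration; they are not quiz data.
population = [1] * 500 + [0] * 500  # stands in for "all seniors at a college"

# Parameter: computed from every member of the population.
parameter = sum(population) / len(population)

# Statistic: computed from a sample drawn from that population.
random.seed(1)  # fixed seed so the run is repeatable
sample = random.sample(population, 100)
statistic = sum(sample) / len(sample)

print(f"parameter (all {len(population)} seniors): {parameter:.2f}")
print(f"statistic (sample of {len(sample)}): {statistic:.2f}")
```

The statistic will usually land near the parameter but not exactly on it, which is why any inference drawn from a sample carries some risk of missing.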
But how can this inference be wrong? Well, just as with any sample, it may not capture everything about the targeted population. Therefore the inference that professional basketball players are taller than people who are not professional basketball players could be wrong.

Let's look at another example. In this one the sample from a targeted population shows that students in first grade are shorter than students in fourth grade. Again, the whole purpose of getting a sample is to make an inference about the population. In this case we can say that students in first grade are shorter than students in fourth grade in our targeted population. As for the other options: that students have not taken the vitamins, we don't know anything about that. That fourth grade students are shorter, no, our sample shows just the opposite. And students in first grade may be younger, but we didn't find out their ages. So those are not inferences we can make. And of course any inference we make can be wrong. In this case they've reworded the inference a little bit, and you've got to read it very carefully. It says the inference may incorrectly imply that if you switched from fourth grade to first grade, you would be shorter than you would be if you were in fourth grade. Of course that's ridiculous; that inference is totally wrong. So just read these very carefully.

This next question has to do with the type of data, which people seem to get right pretty well, but they miss the level of measurement. Here we have data in degrees centigrade of air samples, and that looks pretty quantitative to me, so quantitative is the first part. What is the data set's level of measurement? Remember, in order to be ratio, the highest level, you've got to have a true zero. Degrees centigrade, or Celsius, just like degrees Fahrenheit, have no true zero, so you can't really say that one temperature is double another. Of the common temperature scales, only Kelvin has a true zero, and this is not Kelvin.
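The point about a true zero can be checked with a little arithmetic. This is only a sketch: the two temperatures, 10 and 20 degrees Celsius, are example values chosen here, not data from the quiz.

```python
def celsius_to_kelvin(c: float) -> float:
    """Convert degrees Celsius to kelvins."""
    return c + 273.15

# Example readings picked for illustration only.
cold_c, warm_c = 10.0, 20.0

# Differences survive the change of scale, so interval arithmetic is fine.
diff_c = warm_c - cold_c
diff_k = celsius_to_kelvin(warm_c) - celsius_to_kelvin(cold_c)

# Ratios do not: 20 looks like "double" 10 only because of where zero sits.
ratio_c = warm_c / cold_c
ratio_k = celsius_to_kelvin(warm_c) / celsius_to_kelvin(cold_c)

print(diff_c, round(diff_k, 2))    # 10.0 10.0
print(ratio_c, round(ratio_k, 3))  # 2.0 1.035
```

The 10-degree difference is the same on both scales, but the apparent "twice as warm" disappears once the readings are measured from a true zero.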
Therefore this cannot be ratio. The next level down is interval, and that's what this data is. It's not just ordinal, because we can subtract values from one another to know that something is 20 degrees warmer than something else, so it's more than just the order. And of course it's not nominal. Let's look at another one here. This is the length in inches of a species of fish, so again it's quantitative. Because lengths can have zero length, a true zero, this would be ratio instead of interval. So I hope that helps.

The next question is also about levels of measurement, and we're given allergies, temperature, age, and happiness on a scale from zero to ten. What is the level of measurement for allergies? I'm guessing they're going to answer "I'm allergic to grass," "I'm allergic to peanuts," so that would be nominal. What is the level of measurement for age? Age is time, and that can be kind of fuzzy, but there can be a zero age, so age would be ratio. What is the level of measurement for temperature? Again, unless it's Kelvin, and generally when we take human temperatures it's going to be in Fahrenheit or Celsius, there's no true zero. Therefore it's interval, which is one level down from ratio. Last, what is the level of measurement for happiness? The scale is zero to ten, and here we don't really know if the interval from zero to one is the same as from one to two. We don't generally assume it is, but we don't know, so really it's just order: we are happy, we're more happy, we're very, very happy. Just order, so ordinal.

The next question is about sampling techniques, and we're supposed to discuss potential sources of bias. We assume the population of interest is the student body at a university. Students are questioned as they leave a student union building, and the researcher asks 338 students about their dating habits. So what is wrong with that sample? What kind of sample is it? Well, it's a convenience sample.
The researcher is just outside the student union, and as students come out he asks, "Will you answer some questions?" They may not answer, but it's just convenience. It's not cluster; there's no rhyme or reason to what he's doing, he's just taking the easiest ones. It's not simple random, because he's taking students as they come out; they're not randomly chosen. It's not stratified, and it's not systematic sampling.

What potential sources of bias are present? Well, one option says university students may not be representative of all people in their age group, but the population of interest is the student body, not everyone in their age group, so that's not the issue here. Another says there are no potential sources of bias, and that's not true. Because it's a convenience sample, and in a convenience sample it's only the people who are easy to get, they will not necessarily be representative of the target population. And because of the personal nature of the question, they may not answer, or they may not answer honestly. So those are two forms of bias.

This is a similar question, a little bit different. In 1965 researchers used random digit dialing to call 800 people and ask what obstacles kept them from eating healthier. What type of sampling is that? Well, it didn't tell us anything about a design here, so it's not systematic and it's not cluster. And it's not convenience, because it is random digit dialing, so at least you're getting a simple random sample. What kind of bias? One option says the sample only consists of the members of the population that are easy to get. Well, that's not true; it's a random sample of people who have telephones. So let's look further down. Telephone sampling only includes people who have telephones. People who owned telephones may have been older, wealthier, just different, and therefore not representative, so that's definitely a bias problem. Again, because it's a personal question, they may or may not answer the question, or answer it truthfully, so there's some bias there, and
finally, it's only individuals who were there to answer. Again, people may have different work schedules than the period when the person was calling, so there's some bias there. As for the option that there are no potential sources of bias, definitely there are, so that's not a good answer. So I hope that helps.

This question talks about the design of survey questions and how they can be leading, which would lead to biased results. The question is "Why is eating cake bad for you?" Well, just the fact that they say "why is it bad for you" would lead you to think, maybe it's bad for me. So it's definitely a biased question. Then how can we make it better? "Do you think that eating cake is good for you?" Well, that's just the flip side; putting "good for you" would make people think, well, maybe it's good for me. "Do you think eating cake is bad for you?" That's saying essentially what we have there. "Why is eating cake good for you?" That's the flip side. "The original question is not biased": well, that's not true. "How do you think eating cake affects your health?" That doesn't lead you to say it's good or bad, so of all the choices that's the best question.

Let's look at another one here. "Why is eating carrots good for you?" Well, that's the same kind of question. Let's look at one more real quick. "Why is eating eggs good for you?" I think all of these are going to be, yeah, they're all going to be the same type of question. So really, just read the question and see if it leads you to lean one way or the other. If it makes you lean one way or the other, then it's biased.
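Going back to the student union sampling question, a small simulation can show why a convenience sample can mislead where a simple random sample would not. This is a rough sketch under loud assumptions: the student body of 2,000, the "dates per month" values, and the idea that union visitors date more often are all invented for illustration. Only the sample size of 338 comes from the quiz.

```python
import random

# Hypothetical student body. Every number here is made up for illustration,
# including the assumption that students who visit the union date more often.
students = (
    [{"visits_union": True, "dates": 6} for _ in range(500)]
    + [{"visits_union": False, "dates": 2} for _ in range(1500)]
)

def mean_dates(group):
    """Average dates per month for a list of student records."""
    return sum(s["dates"] for s in group) / len(group)

# Parameter for the whole student body: (500*6 + 1500*2) / 2000 = 3.0
population_mean = mean_dates(students)

# Convenience sample: the first 338 students who walk out of the union.
convenience = [s for s in students if s["visits_union"]][:338]

# Simple random sample: every student has the same chance of being chosen.
random.seed(0)  # fixed seed so the run is repeatable
simple_random = random.sample(students, 338)

print("population mean:", population_mean)
print("convenience sample mean:", mean_dates(convenience))
print("simple random sample mean:", round(mean_dates(simple_random), 2))
```

Under these assumptions the convenience sample reports 6.0 when the true figure is 3.0, because it only ever reaches the students who are easy to get, while the simple random sample should land much closer to the population value.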