We're also bombarded with statistical data, and often with a single number called the average. What is the average? We might be reading the newspaper, watching a news show, listening to the radio, or looking at something on Facebook, and they're talking about the average: the average income, the average cost of a house, the average crime rate, the average weight of something. Well, that's just one single number, and there are many ways we might interpret it. One thing we want to look at is how the data is clustered around the average. That's something we're going to talk about in the next couple of lessons.
In this one, imagine you're going to apply for a job at a cupcake company. You see an advertisement in the want ads that says the average salary at this cupcake company is $43,000 a year. That seems like a pretty good salary to you, so you think you might apply. But the ad or the interview might not mention how the incomes are spread out. How is the data spread? Who's making close to $43,000, and how many people are? Or is the data skewed in some way? So why might this be useful? As you take in information, think about how the spread of the data might be useful to you as a consumer.
So let's take a closer look at what those salaries might be to give an average income of $43,000 a year. All of these are listed in thousands. At the low end we have a couple of people at $15,000 a year and some at $18,000. Up here we have what the CEO and the people at the top of the company might make. Of course they're going to make more money: they've invested their money and resources, they're in the leadership positions, and they're going to make more than those at the entry level. But how the data is spread out is a very good question. Notice that these are already listed in order, so that's convenient.
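If we want to check the advertised figure ourselves, a short Python sketch will do it. The slide with the salary list is not part of this transcript, so the list below is an assumption: it is reconstructed from the values read aloud during the lesson, in thousands of dollars. The variable names are only for illustration.

```python
from statistics import mean

# Salaries in thousands of dollars. ASSUMPTION: reconstructed from the
# values read aloud in the lesson, since the slide itself is not shown.
salaries = [15, 15, 18, 18, 18, 18,
            20, 25, 25, 26, 27,
            30, 30, 32, 32, 35, 35, 35,
            50, 51, 80, 85, 95, 99, 150]

# The single number the advertisement reports.
average = mean(salaries)

# How many employees actually earn less than that average?
below_average = sum(1 for s in salaries if s < average)

print(f"employees: {len(salaries)}")
print(f"average salary: ${average * 1000:,.0f}")
print(f"earning less than the average: {below_average}")
```

With this list the average rounds to the advertised $43,000, yet most of the employees earn less than that, which is the reason to look at the spread and not only the average.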
Because I've listed these in order, it's easy to see what the minimum and maximum salaries are, and the list gives us a quick glance at how the data might look. But there are a couple of other ways to visualize the salaries more conveniently. A stem and leaf plot is pretty easy to construct, and it gives us a quick look at how the incomes are distributed in groups of 10. So I'm going to list how many people make between $10,000 and $19,999.99, then how many are in the 20,000s, the 30,000s, the 40,000s, and so on. My top income earner makes $150,000, so the stems need to go all the way up to 15, for the 150,000s. At the very bottom the stem is 1, for the people making $15,000.
So now I'm going to list how many people are making something in the tens of thousands. There's one person who makes $15,000, another person who makes $15,000, and four people who make $18,000 a year. So I can see that there are six people in the tens. Okay. How about the 20s? Someone makes $20,000, two people make $25,000, one person makes $26,000, and one person makes $27,000. In the 30s, there are two people at $30,000, two at $32,000, and three at $35,000. Nobody makes in the 40s, so I'm not going to put any leaves on that stem. I have someone who makes $50,000 and someone who makes $51,000. Nobody makes in the 60s or the 70s, so I won't put any leaves there either. In the 80s we have someone who makes $80,000 and someone who makes $85,000. Then someone makes $95,000, someone makes $99,000, and then we skip all the way to $150,000.
So this gives us a visual. If I read in the paper that the average salary is $43,000 a year, that number alone doesn't show me how the data is clustered the way a stem and leaf plot does. Look at all of these people making an income in the 10s, 20s and 30s. But way down here are a few people making much more money.
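The same stem and leaf construction can be sketched in Python. This is only one way to do it, and it assumes the salary list reconstructed from the values read aloud above (in thousands); the function name `stem_and_leaf` is made up for this example. Stems are the tens digits and leaves are the ones digits, and stems with no salaries are kept so the gaps stay visible.

```python
from collections import defaultdict

# Salaries in thousands of dollars. ASSUMPTION: reconstructed from the
# values read aloud in the lesson, since the slide itself is not shown.
salaries = [15, 15, 18, 18, 18, 18,
            20, 25, 25, 26, 27,
            30, 30, 32, 32, 35, 35, 35,
            50, 51, 80, 85, 95, 99, 150]

def stem_and_leaf(values):
    """Group values by tens: returns {stem: [leaves]}, empty stems included."""
    leaves = defaultdict(list)
    for v in sorted(values):
        leaves[v // 10].append(v % 10)   # stem = tens, leaf = ones digit
    # Walk every stem from the lowest to the highest so that stems with
    # no data (the 40s, 60s, 70s, ...) still appear as empty rows.
    stems = range(min(values) // 10, max(values) // 10 + 1)
    return {stem: leaves[stem] for stem in stems}

plot = stem_and_leaf(salaries)
for stem, row in plot.items():
    print(f"{stem:>2} | {' '.join(str(leaf) for leaf in row)}")
```

Printing the empty stems is a deliberate choice here: the long run of blank rows between the 90s and 150 is exactly what shows how far the top salary sits from everyone else.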
A stem and leaf plot is a convenient way to look at the data because if I turn it on its side, as we will later on when we look at a curve, I can see that the data is heavily concentrated at the lower incomes in that company. We'll look at that later. But a stem and leaf plot helps us see some of the distribution.
Another very convenient way to look at the data is a box and whiskers plot. It's also pretty easy to construct. It's built from a five number summary: the minimum, the three quartiles, and the maximum. We will find these values and then graph our box and whiskers plot on the next slide.
So what's the minimum income? Well again, the numbers are listed from least to greatest, so I can see the minimum value is 15,000. What's the maximum income? 150,000. Now what's the median value? These are conveniently lined up from smallest to greatest, so to find the median I want the number in the very center. The median value happens to be $30,000. In this five number summary the median is also known as the second quartile.
You may or may not include the median of the data set when you find the median of the lower half of the data. It's commonly accepted to leave it out, as long as we're consistent in how we treat our data. I don't think there's a set rule, but we're not going to use that number 30 to find the median of the lower half. So the median of the lower half of the data is right here between these two values, 18 and 20, and that is $19,000. The median of the upper half of the data is between these two values, 50 and 51, and that's $50,500.
So we have a minimum value, a maximum value, and a picture of how the incomes in between are divided. Our minimum value was at 15, so I'm going to mark 15 as the very minimum income, and the maximum is at 150, for $150,000. Our first quartile was at 19, and I'm scaling this by 10.
So if this is 15, our first quartile is at 19,000, right, almost at 20. The median, if you remember, was at 30,000; that's the second quartile. And the third quartile was at about 51,000. So here's my box, which shows how the middle of the data is clustered about the median. And these whiskers show how far the extreme values reach below and above that cluster. So this box and whisker plot, which is very simple to sketch, gives me an idea of how the data is spread out when I read $43,000 in the newspaper as the average income. I see that some people are making much more than $43,000 and that the median is only $30,000, and I might ask in my interview what the range of salaries is, so that I have more information about the incomes at that company.
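The five number summary behind the box and whisker plot can be checked with a short Python sketch. It assumes the salary list reconstructed from the values read aloud in the lesson (in thousands), and it follows the convention chosen above: when the number of values is odd, the median itself is left out of both halves. The function name `five_number_summary` is made up for this example, and statistics software may use a different quartile rule and give slightly different values.

```python
from statistics import median

# Salaries in thousands of dollars. ASSUMPTION: reconstructed from the
# values read aloud in the lesson, since the slide itself is not shown.
salaries = [15, 15, 18, 18, 18, 18,
            20, 25, 25, 26, 27,
            30, 30, 32, 32, 35, 35, 35,
            50, 51, 80, 85, 95, 99, 150]

def five_number_summary(values):
    """Min, Q1, median, Q3, max, excluding the median from the halves."""
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]
    # With an odd count, skip the middle value so it is in neither half.
    upper = data[half + 1:] if n % 2 else data[half:]
    return {
        "min": data[0],
        "q1": median(lower),      # first quartile: median of the lower half
        "median": median(data),   # second quartile
        "q3": median(upper),      # third quartile: median of the upper half
        "max": data[-1],
    }

summary = five_number_summary(salaries)
for name, value in summary.items():
    print(f"{name:>6}: ${value * 1000:,.0f}")
```

The box runs from the first quartile to the third quartile with a line at the median, and the whiskers reach out to the minimum and the maximum.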