Statistics and Excel. Calories, data, statistics, sample, example. Got data? Let's get stuck into it with statistics and Excel.
Well, actually we're using OneNote here, but we will be talking about Excel too. You're not required to, but if you have access to OneNote, we're in the icon on the left-hand side, OneNote presentation 1360, Calories Data Statistics Sample Example tab. We're also uploading transcripts to OneNote so you can use the immersive reader tool. Change the language if you so choose, and either read or listen to the transcripts in multiple different languages, using the time stamps to tie in to the video presentation.
OneNote desktop version here. Data on the left-hand side related to counting calories. We have the date on the left, the calorie count on the right. So for example this first one, 312, 2016, calorie count 2,990; 312, 16, calorie count 1,777; and then 313, 16, calorie count 2,480.
Now we're going to treat this data on the left-hand side as though it's the entire population of data, so that we can run a few statistical analyses on it. Then imagine that we're going to take samples from this data set, so that we can run statistical tests on the sample to see if that information can tell us something about the entire population. So it's a similar type of strategy to what we did when we looked at the heights of individuals, but we're going to use slightly different methods when we get to the sampling. Our goal is to think about the statistics involved, as well as how we might use tools such as Excel to help us practice with these concepts.
Also just realize that if you want to look up some of these data sets, Kaggle.com, K-A-G-G-L-E.com, might be a place to look. Let's start off by taking the information from the entire data set, so this is the population data set.
We can calculate the average, or mean, which is taking the sum of all of the numbers and then dividing by the count, one, two, three, four and so on. Or we can use the average function, which is =AVERAGE, and then we just average this set of data, and that gives us the 2,189.
We might also take the median, which is picking the one in the middle. So if we listed this from top to bottom, lowest to highest or highest to lowest, and then picked the one in the middle, that would be the median. Just like Rocky the boxer's coach told him: when he sees three of them out there, hit the one in the middle, hit the one in the middle.
The max is the highest one. So if we were to sort the column over here and pick the highest amount, that's the maximum. We don't even have to sort them, though, because we have the formula of =MAX to pick the max. And then the min is the lowest one, so we could sort by the lowest one to see the min, or I can simply use my =MIN formula. So we had zero calories one day. We were locked in a closet or something, I don't know. Not sure that's exactly healthy. Maybe we were fasting. It's just one day, not a big deal.
All right, so then if we were to take this data and put it into a histogram, we just select this entire data set and make a histogram. Here are the categories: from 0 to 370, from 370 to 740, and so on. And it looks like the middle or biggest area, where most of the results are falling, is between 1,850 and 2,220.
Now calories is another one of those areas where, because we tend to stay at a similar weight, you know, within a range of a few pounds, you would expect that our calories would also be within a pretty reasonable range. So this is another one of those areas where you would expect that on most days your calorie counts are pretty much within a range,
and so it would look kind of bell-shaped, with the counts being higher or lower on certain days.
We don't have as extensive a data set here as we had with the heights, and therefore we don't have as much detail as you might expect if we had a whole lot of data, but we're going to assume this is basically our entire population.
So then we're going to think about how we can create a sample of that population. If I'm going to create a sample, what I want to do is take these numbers and, in essence, shuffle them up. So once again we're going to use the technique of using the random number generator. The random number generator is this one, just =RAND(), and if you just put in a random number it's going to give you a decimal. I've then added decimal places, so it's a pretty long decimal. All of these randomly generated numbers should therefore be unique, so if we sort by them, the rows will end up in a shuffled order.
So now I've added all of the calories and these random numbers.
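For anyone following along outside of Excel, here is a minimal Python sketch of the same idea: summarize the population, then shuffle it by pairing each value with a random decimal and sorting on that decimal. It is only an illustration under stated assumptions. The first three calorie counts are the ones read out above; the rest of the list, the sample size of 5, the fixed seed, and the helper names `summarize` and `shuffle_by_random_key` are made up for the example and are not from the presentation.

```python
import random
import statistics

# The first three values are quoted in the transcript; the remaining values
# are made-up placeholders for illustration, not the real data set.
population = [2990, 1777, 2480, 2100, 0, 1950, 2230, 2050, 1890, 2310]


def summarize(values):
    """Return the four summary figures discussed for the population."""
    return {
        "mean": statistics.mean(values),      # comparable to =AVERAGE(range)
        "median": statistics.median(values),  # comparable to =MEDIAN(range)
        "max": max(values),                   # comparable to =MAX(range)
        "min": min(values),                   # comparable to =MIN(range)
    }


def shuffle_by_random_key(values, rng):
    """Pair each value with a random decimal, sort by it, then drop the key."""
    keyed = [(rng.random(), value) for value in values]  # helper column of =RAND()
    keyed.sort(key=lambda pair: pair[0])                 # sort rows by the random column
    return [value for _, value in keyed]


rng = random.Random(0)  # fixed seed is an assumption, used only for repeatability
shuffled = shuffle_by_random_key(population, rng)
sample = shuffled[:5]   # a sample size of 5 is arbitrary

print("population:", summarize(population))
print("sample:", sample, "sample mean:", statistics.mean(sample))
```

Sorting on a random key gives the same effect as sorting the calorie column by the random-number column in the worksheet: every value is kept exactly once, only the order changes, and taking the top rows afterwards yields a random sample.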