C1: WHAT IS STATISTICS?


Uploaded by on Oct 23, 2010

WHAT IS STATISTICS?
o The mathematics of the collection, organization, and interpretation of numerical data, especially the analysis of population characteristics by inference from sampling
o The subject of statistics can be divided into descriptive statistics (describing data) and inferential statistics (drawing conclusions from data). (Source: dictionary.com)

WHY SHOULD WE STUDY STATISTICS?
Descriptive Statistics: To describe a phenomenon
o Summary and presentation of data

Inferential Statistics: To draw conclusions
o Making statements or predictions about the population based on information from a sample


POPULATION & SAMPLE
POPULATION: the group of all objects or individuals of interest
o All York Students
o Canadians
SAMPLE: a subset of the population
o 40 York students chosen at random
o People interviewed for the latest election poll
o We refer to the individual components of a sample as "observations"

PARAMETERS AND STATISTICS
Very generally we can say that:
o Populations are described by PARAMETERS
o Samples are described by STATISTICS

For example:
Parameter: the average hair length of all domestic cats (reflects the true value for the population)
Statistic: the average hair length of cats in my sample (it's an estimate)

Statistical inference: the process of drawing a conclusion about the population based on the sample (at stated levels of confidence and significance)
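
The difference between a parameter and a statistic can be sketched in a few lines of Python. The population below is simulated; the cat hair lengths, their mean and spread, the population size of 10,000 and the seed are all made-up assumptions for illustration, not values from the lecture.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Hypothetical population: hair length (cm) of 10,000 cats (invented numbers).
population = [rng.gauss(3.0, 0.8) for _ in range(10_000)]

# PARAMETER: describes the whole population (usually unknown in practice).
parameter = statistics.mean(population)

# STATISTIC: describes a sample and serves as an estimate of the parameter.
sample = rng.sample(population, 40)
statistic = statistics.mean(sample)

print(f"parameter = {parameter:.3f} cm, statistic = {statistic:.3f} cm")
```

The two numbers will be close but not identical; quantifying that gap is the job of statistical inference.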

FINAL DEFINITIONS
A variable is a characteristic of a population or sample.
o student grades, height, income, etc.
Variables have values
o student marks (0..100)
Data are the observed values of a variable.
o student marks: {67, 74, 71, 83, 93, 55, 48}
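
As a small illustration of descriptive statistics, the student marks above can be summarised with Python's standard library. The choice of summary measures (mean, median, standard deviation) is ours, not something the notes prescribe.

```python
import statistics

# Data: the observed values of the variable "student marks" from the notes.
marks = [67, 74, 71, 83, 93, 55, 48]

mean_mark = statistics.mean(marks)      # arithmetic average
median_mark = statistics.median(marks)  # middle value once sorted
spread = statistics.stdev(marks)        # sample standard deviation

print(f"n = {len(marks)}, mean = {mean_mark:.2f}, median = {median_mark}")
print(f"min = {min(marks)}, max = {max(marks)}, stdev = {spread:.2f}")
```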

OBTAINING THE DATA
We have a phenomenon of interest and we would like to collect data to study it further
o We can directly collect the data: this is called PRIMARY DATA.
o We can use data collected by others (e.g. Statistics Canada; market research companies; etc.): this is called SECONDARY DATA
HOW DO WE COLLECT PRIMARY DATA?
1. By observation
2. By experiment
3. By survey
The methods differ mainly in how much control the researcher exercises and, as a result, in how strong an inference the data can support

DECISIONS INVOLVED IN SAMPLING
Sample Population
o From which population do we sample?
o Why is this important? What do we have to consider?
Sample Size
o How large should the sample be?
Sampling Method
o How should we pick the sample out of the population?

SAMPLE SIZE DEPENDS ON
o The size of the population
The sample size will INCREASE with the population size, but only slightly once the population is large

o The variation in the population
The sample size will INCREASE with the variation

o The amount of error that can be tolerated
The sample size will DECREASE as the tolerated error gets larger

o The amount of resources available
The sample size will INCREASE with resources
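
The first three relationships can be seen in one common textbook formula for the sample size needed to estimate a mean: n0 = (z * sigma / E)^2, followed by a finite population correction. The notes do not give a formula, so the sketch below is an assumption on our part; the function name, the default z = 1.96 (roughly 95% confidence) and the example numbers are hypothetical. Resources are not modelled; in practice they act as an upper limit on n.

```python
import math

def required_sample_size(sigma, margin, z=1.96, population_size=None):
    """Approximate sample size for estimating a mean to within +/- margin.

    sigma: assumed standard deviation of the population (the variation)
    margin: the amount of error that can be tolerated
    population_size: if given, apply the finite population correction
    """
    n0 = (z * sigma / margin) ** 2
    if population_size is not None:
        n0 = n0 / (1 + (n0 - 1) / population_size)
    return math.ceil(n0)

base = required_sample_size(sigma=15, margin=3)
more_variation = required_sample_size(sigma=30, margin=3)
more_error_ok = required_sample_size(sigma=15, margin=6)
small_pop = required_sample_size(sigma=15, margin=3, population_size=500)
large_pop = required_sample_size(sigma=15, margin=3, population_size=50_000)

print(base, more_variation, more_error_ok, small_pop, large_pop)
```

Doubling the variation raises the required size, doubling the tolerated error lowers it, and a larger population raises it only up to the uncorrected value.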

HOW TO CREATE THE SAMPLE
There are several statistical sampling methods you can use:
1. Simple Random Sample
2. Stratified Random Sample
3. Cluster Sample

SIMPLE RANDOM SAMPLE (SRS)
Each subject is equally likely to be chosen
o Like raffles, drawing from a hat, etc.
o Subject choice is determined by random numbers
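
A minimal sketch of a simple random sample using Python's random module. The list of 5,000 student IDs is a hypothetical sampling frame, and the seed is fixed only to make the run repeatable.

```python
import random

rng = random.Random(42)  # fixed seed for a repeatable draw

# Hypothetical sampling frame: one ID per student in the population.
students = [f"student_{i:04d}" for i in range(1, 5001)]

# Draw 40 students without replacement; every student is equally likely.
srs = rng.sample(students, k=40)

print(srs[:5])
```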

STRATIFIED RANDOM SAMPLE
The population is divided into mutually exclusive subgroups called strata
o e.g. age, gender, home type
Within strata, the sampling is random (simple)
Advantages:
o Ensures the sample has the same structure as the population (when each stratum is sampled in proportion to its size)
o Inferences can also be made about the subgroups
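
One way to sketch stratified sampling is proportional allocation: take the same fraction from every stratum. The helper name stratified_sample, the two strata ("undergrad", "grad") and their sizes are hypothetical choices for illustration.

```python
import random
from collections import defaultdict

def stratified_sample(units, stratum_of, fraction, rng):
    """Simple random sample of the same fraction within every stratum."""
    strata = defaultdict(list)
    for unit in units:
        strata[stratum_of(unit)].append(unit)

    chosen = []
    for members in strata.values():
        k = max(1, round(fraction * len(members)))  # at least one per stratum
        chosen.extend(rng.sample(members, k))
    return chosen

rng = random.Random(1)

# Hypothetical population: 800 undergraduates and 200 graduate students.
population = [("undergrad", i) for i in range(800)] + [("grad", i) for i in range(200)]

sample = stratified_sample(population, lambda unit: unit[0], fraction=0.05, rng=rng)

counts = defaultdict(int)
for level, _ in sample:
    counts[level] += 1
print(dict(counts))
```

The 80/20 split of the population is preserved in the sample, which is the structural guarantee the notes describe.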

CLUSTER SAMPLING
The population is divided into groups, called clusters
o e.g. geographical regions, classrooms in a school
Each cluster ideally has the same characteristics as the population
We use simple random sampling to select only a few clusters
We then use either simple random or stratified sampling within each cluster
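
The two stages of cluster sampling can be sketched as follows. The 20 classrooms of 30 students, and the choice of 4 clusters with 10 students each, are invented numbers for illustration.

```python
import random

rng = random.Random(7)

# Hypothetical population: 20 classrooms (clusters) of 30 students each.
clusters = {
    f"class_{c:02d}": [f"class_{c:02d}_student_{s:02d}" for s in range(30)]
    for c in range(20)
}

# Stage 1: simple random sample of a few clusters.
chosen_clusters = rng.sample(sorted(clusters), k=4)

# Stage 2: simple random sample of students within each chosen cluster.
sample = []
for name in chosen_clusters:
    sample.extend(rng.sample(clusters[name], k=10))

print(chosen_clusters, len(sample))
```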

SAMPLING ERRORS
A sampling error refers to the difference between the sample statistic and the population parameter
Example: a survey shows 51% of students work when in fact only 50.42% work, a sampling error of 0.58 percentage points
We will learn how to deal with this error in later classes
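
Sampling error can be illustrated with a simulated population in which exactly 50.42% of students work, matching the example above. The population size (10,000), the sample size (500) and the seed are assumptions made for this sketch.

```python
import random

rng = random.Random(3)

# Hypothetical population of 10,000 students: 1 = works, 0 = does not work.
population = [1] * 5042 + [0] * 4958
parameter = sum(population) / len(population)  # true proportion, 0.5042

# Survey a simple random sample of 500 students.
sample = rng.sample(population, k=500)
statistic = sum(sample) / len(sample)

# Sampling error: sample statistic minus population parameter.
sampling_error = statistic - parameter

print(f"parameter = {parameter:.4f}, statistic = {statistic:.4f}, "
      f"sampling error = {sampling_error:+.4f}")
```

Changing the seed gives a different sample and therefore a different error, even though the population has not changed.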

NON-SAMPLING ERRORS
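A non-sampling error is an error that does not come from the random choice of the sample
o Errors in data acquisition: inaccuracies and mistakes; less-than-truthful responses
o Non-response bias: only people with a certain agenda respond to the survey
o Selection bias: some members of the population are more likely to be picked than others, so the sample does not represent the population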

Category: Education

License: Standard YouTube License


