R Statistical: KMeans Clustering Part 1 of 2

Uploaded on Nov 5, 2010

Unsupervised learning is a data mining technique that groups data points into clusters based on shared similarities. In real-world applications there is usually no "right" label or classification for each observation. If there were, a supervised learning method (a classification algorithm) would be used instead. However, to get a feel for unsupervised learning it is useful to start with a dataset where you already have a sense of how you'd like it to be classified. For this exercise we'll therefore create our dataset from scratch.

In this part 1 of 2, we'll set up a fictitious dataset consisting of 3 groups, with 100 observations in each group (300 total). The dataset will have 3 attributes: one generated using random deviates from the Gaussian distribution, one from the binomial distribution, and one from the chi-square distribution.
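
The setup described above could be sketched in R roughly as follows. The description does not give the distribution parameters, so the means, binomial probabilities, and degrees of freedom here are assumed values chosen only so the three groups differ, and the helper name make_group is hypothetical.

```r
# Sketch only: all distribution parameters below are assumptions,
# not the values used in the video.
set.seed(42)   # make the random draws reproducible
n <- 100       # observations per group

# Hypothetical helper: builds one group of n observations.
make_group <- function(label, mu, prob, df) {
  data.frame(
    group = label,                              # the label we hope clustering recovers
    gauss = rnorm(n, mean = mu, sd = 1),        # Gaussian deviates; sd is the standard deviation
    binom = rbinom(n, size = 10, prob = prob),  # binomial deviates (10 trials each)
    chisq = rchisq(n, df = df)                  # chi-square deviates
  )
}

# Stack the three groups into one 300-row data frame.
dataset <- rbind(
  make_group("A", mu = 0,  prob = 0.2, df = 2),
  make_group("B", mu = 5,  prob = 0.5, df = 6),
  make_group("C", mu = 10, prob = 0.8, df = 12)
)

str(dataset)           # structure: 300 observations, 4 columns
table(dataset$group)   # row count per group
```

Keeping the group column alongside the three attributes makes it possible to compare the clusters found later against the groups the data was generated from.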

In part 2 we'll apply kmeans clustering to this dataset.
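
As a rough preview of that step, the sketch below runs R's built-in kmeans() on a stand-in dataset with the same shape. The distribution parameters, the scaling step, and the nstart value are all assumptions for illustration, not necessarily what part 2 does.

```r
# Sketch only: stand-in data with assumed parameters (3 groups x 100 rows).
set.seed(1)
x <- data.frame(
  gauss = c(rnorm(100, mean = 0), rnorm(100, mean = 5), rnorm(100, mean = 10)),
  binom = c(rbinom(100, 10, 0.2), rbinom(100, 10, 0.5), rbinom(100, 10, 0.8)),
  chisq = c(rchisq(100, df = 2), rchisq(100, df = 6), rchisq(100, df = 12))
)
truth <- rep(c("A", "B", "C"), each = 100)   # the groups the data was generated from

# Standardize the columns (an assumed choice) so no attribute dominates
# the distance calculation, then ask for 3 clusters.
fit <- kmeans(scale(x), centers = 3, nstart = 25)

# Cross-tabulate generated groups against assigned clusters.
table(truth, fit$cluster)
```

The cluster numbers themselves are arbitrary, so the cross-tabulation is read by checking whether each generated group falls mostly into a single cluster.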

All Comments (1)

  • sd is the value of the standard deviation, not the "standard distribution"... But anyway: nice video!