In part 1 we set up a fictitious dataset. In part 2 we use kmeans to find 3 clusters in the data. We then compare the clusters it found against the groups we know we built into the dataset.
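The workflow described above can be sketched roughly as follows. This is not the dataset or code from the video: the group labels, centers, sample size, and variable names (`truth`, `dat`, `fit`, `comparison`) are assumptions made up for illustration. Only `kmeans()` and `table()` are standard R functions.

```r
# Hypothetical stand-in for the part 1 dataset: three known groups in 2-D.
# Group names, centers, and sizes are illustrative assumptions.
set.seed(42)
n <- 50
truth <- rep(c("A", "B", "C"), each = n)
dat <- data.frame(
  x = rnorm(3 * n, mean = rep(c(0, 5, 0), each = n)),
  y = rnorm(3 * n, mean = rep(c(0, 5, 5), each = n))
)

# Ask k-means for 3 clusters; nstart reruns it from several random starts
# and keeps the best solution.
fit <- kmeans(dat, centers = 3, nstart = 20)

# Cross-tabulate the known group against the assigned cluster.
# Cluster numbers are arbitrary, so read the table row by row:
# a well-recovered group puts most of its row into a single column.
comparison <- table(truth = truth, cluster = fit$cluster)
print(comparison)
```

A cross-tabulation like this has the same layout as a confusion matrix, with the caveat that the cluster numbers carry no meaning and have to be matched to the known groups by eye.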
What version of R is this?
pierosraffa89 1 week ago
Very informative video! Can the errortable be used to generate a confusion matrix?
persist911 6 months ago
The problem is not the clustering but your synthetic dataset. You have a big overlap between the classes...
ricckli 1 year ago
When plotting, you should plot the first column of the data frame against the third column to show the "distances" between these points. With just the observations on the x-axis, the red group looks split up, but the values of the red group are still close to each other!
ricckli 1 year ago