Added: 3 years ago
From: StanfordUniversity
Views: 17,341
All Comments (16)

  • The first 40 min or so is about theoretical bounds on the amount of training data required. The more interesting and practical stuff comes after that, where he covers practical issues in applying classifiers to real-world problems. Here is my quick summary:

    40:10 - Model selection, avoiding under/over fitting, Hold out cross validation

    44:40 - K-fold cross validation

    53:25 - Rule of thumb for training set size vs. number of features, for logistic regression

    54:30 - Feature selection
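
For readers who want to see the two validation schemes from the summary in code, here is a minimal sketch in plain Python. It is not taken from the lecture: the function names `hold_out_split` and `k_fold_indices`, the 70/30 default split, and the fixed seed are all assumptions made for illustration.

```python
import random


def hold_out_split(n_samples, train_fraction=0.7, seed=0):
    """Hold-out validation: shuffle once, train on one part, validate on the rest."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    cut = int(n_samples * train_fraction)
    return indices[:cut], indices[cut:]


def k_fold_indices(n_samples, k, seed=0):
    """k-fold cross validation: every sample lands in the validation set exactly once."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # The first (n_samples % k) folds get one extra sample so sizes differ by at most 1.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        val_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, val_idx
        start += size


if __name__ == "__main__":
    for train_idx, val_idx in k_fold_indices(10, 3):
        print(len(train_idx), len(val_idx))
```

In use, you would fit each candidate model on `train_idx`, measure its error on `val_idx`, average the errors over the k folds, and keep the model with the lowest average. Hold-out is cheaper (one fit per model), while k-fold uses the data more efficiently at the cost of k fits per model.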

  • @nghiaho12 I don't get one thing. He says VC dimension bounds are loose, so he doesn't use them for model selection. But then he says that SVMs are the best machine learning algorithm, when the very thing that is supposed to make them better is the fact that they are based on the idea of minimizing those theoretical bounds. Isn't that kind of contradictory?

  • @darfunkelidas I actually got a bit lost in the theoretical stuff, kinda dozed off, so can't help you there :) As a side comment, I have used SVMs briefly, via libsvm, and have found them to be very slow in training and classification. It seems to pick way too many support vectors (in the 100s and 1000s) in general (at least for my problems), which slows down classification considerably. This issue wasn't discussed in the 2011 online Machine Learning course, and it should have been.

  • @nghiaho12 I know, VC theory is very complex! SVMs are usually fast (depending on your speed standards!) but I guess you took time optimizing the parameters for the kernel and error penalty, etc.? What was the size of your dataset? If it's a large dataset then I guess it's "normal" that it's picking up lots of SVs.

  • @darfunkelidas I was thinking of logistic regression and decision trees. Both are blazingly fast compared to SVM, to train and predict, and great for quick evaluation when you are frequently changing features :) Libsvm recommends you do cross-validation to find good parameters (C and gamma for the Gaussian kernel), which for me was painfully slow to watch. My data was in the 10,000s with dimensions around 12 maybe. I don't have the dataset on my computer, so I can't give definitive numbers.
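
A rough way to see why that parameter search feels slow is to count the training runs it implies. The sketch below is an illustration only: the power-of-two ranges for C and gamma and the choice of 5 folds are assumptions (a coarse exponential grid of the kind commonly used with libsvm), not numbers from the comment above.

```python
from itertools import product


def exp_grid(lo_exp, hi_exp, step):
    """Powers of two from 2**lo_exp to 2**hi_exp, stepping the exponent by `step`."""
    return [2.0 ** e for e in range(lo_exp, hi_exp + 1, step)]


# Assumed coarse search ranges (hypothetical, for illustration).
C_values = exp_grid(-5, 15, 2)       # 2^-5, 2^-3, ..., 2^15
gamma_values = exp_grid(-15, 3, 2)   # 2^-15, 2^-13, ..., 2^3
k_folds = 5                          # assumed number of cross-validation folds

grid = list(product(C_values, gamma_values))
n_fits = len(grid) * k_folds         # one SVM training run per (C, gamma, fold)

print(len(C_values), len(gamma_values), len(grid), n_fits)  # 11 10 110 550
```

Every one of those 550 runs trains a kernel SVM on most of the data, so with tens of thousands of samples the search takes far longer than a single fit of logistic regression or a decision tree.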

  • Comment removed

  • These lectures are excellent. I appreciate the clarity of his concepts.

  • Comment removed

  • SHATTERED! Pretty brutal math term...

  • |>

    | << flag. Im here

  • I love how everyone (in the comments) disappeared after the support vector lecture :)

  • @VancouverData yea, fair weather machine learning fans!

  • @VancouverData: Your statement is probably approximately correct... But yeah, I'm sure many didn't realize how hard the class is.
