Kind of reminds me of Turbo C/C++.
grunder20 1 month ago
More power to Prof. Andrew!
grunder20 2 months ago
The first 40 minutes or so are about theoretical bounds on the amount of training data required. The more interesting and practical material comes after that, where he covers the issues that arise when applying classifiers to real-world problems. Here is my quick summary:
40:10 - Model selection, avoiding under/overfitting, hold-out cross-validation
44:40 - K-fold cross-validation
53:25 - Rule of thumb for number of training examples vs. number of features, for logistic regression
54:30 - Feature selection
nghiaho12 4 months ago 4
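For readers who want to see the k-fold idea from the 44:40 mark in concrete form, here is a minimal sketch in plain Python. It is not taken from the lecture: the function names and the `fit_and_score` callback are hypothetical, and the interleaved fold assignment is just one simple way to get nearly equal folds.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Split range(n_samples) into k shuffled, nearly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Fold i takes every k-th shuffled index starting at i,
    # so fold sizes differ by at most one.
    return [indices[i::k] for i in range(k)]


def k_fold_splits(n_samples, k, seed=0):
    """Yield (train_indices, validation_indices), one pair per fold."""
    folds = k_fold_indices(n_samples, k, seed)
    for i, validation in enumerate(folds):
        # Training set = every fold except the one held out.
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, validation


def cross_validate(n_samples, k, fit_and_score):
    """Average a score over k folds.

    fit_and_score(train, validation) is a hypothetical callback that
    trains on the train indices and returns a score on the validation
    indices.
    """
    scores = [fit_and_score(tr, va) for tr, va in k_fold_splits(n_samples, k)]
    return sum(scores) / len(scores)
```

Hold-out cross-validation (40:10) is the special case where only a single train/validation split is used instead of averaging over all k.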
@nghiaho12 I don't get one thing. He says VC dimension bounds are loose, so he doesn't use them for model selection. But then he says that SVMs are the best machine learning algorithm, when the very thing that is supposed to make them better is that they are based on the idea of minimizing those theoretical bounds. Isn't that kind of contradictory?
darfunkelidas 3 weeks ago
@darfunkelidas I actually got a bit lost in the theoretical stuff, kinda dozed off, so I can't help you there :) As a side comment: I have used SVMs briefly, via libsvm, and found them very slow in both training and classification. It seems to pick way too many support vectors (in the 100s and 1000s) in general (at least for my problems), which slows down classification considerably. This issue wasn't discussed in the 2011 online Machine Learning course, and it should have been.
nghiaho12 2 weeks ago
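To make the support-vector point concrete: a kernel SVM evaluates the kernel once per support vector for every prediction, so prediction time grows linearly with the number of support vectors. Below is a minimal sketch of the standard decision function with a Gaussian (RBF) kernel. It is an illustration only, not libsvm's implementation, and all names are hypothetical.

```python
import math


def rbf_kernel(a, b, gamma):
    """Gaussian kernel: exp(-gamma * ||a - b||^2)."""
    sq_dist = sum((x - y) ** 2 for x, y in zip(a, b))
    return math.exp(-gamma * sq_dist)


def svm_decision(x, support_vectors, dual_coefs, bias, gamma):
    """f(x) = sum_i coef_i * K(sv_i, x) + bias, with coef_i = alpha_i * y_i.

    One kernel evaluation per support vector, so the cost of a single
    prediction is proportional to len(support_vectors) * len(x).
    """
    total = sum(
        coef * rbf_kernel(sv, x, gamma)
        for sv, coef in zip(support_vectors, dual_coefs)
    )
    return total + bias


def kernel_evaluations(n_support_vectors, n_queries):
    """Number of kernel evaluations needed to classify n_queries points."""
    return n_support_vectors * n_queries
```

With 1,000 support vectors and 10,000 points to classify, `kernel_evaluations(1000, 10000)` comes to ten million kernel evaluations, whereas logistic regression needs only one dot product per point.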
@nghiaho12 I know, VC theory is very complex! SVMs are usually fast (depending on your speed standards!), but I guess you spent time optimizing the parameters for the kernel, the error penalty, etc.? What was the size of your dataset? If it's a large dataset, then I guess it's "normal" that it's picking up lots of SVs.
darfunkelidas 1 week ago
@darfunkelidas I was thinking of logistic regression and decision trees. Both are blazingly fast compared to SVMs, both to train and to predict, and great for quick evaluation when you are frequently changing features :) Libsvm recommends cross-validation to find good parameters (C and gamma for the Gaussian kernel), which for me was painfully slow to watch. My dataset had samples in the 10,000s with maybe around 12 dimensions; I don't have it on my computer, so I can't give definitive numbers.
nghiaho12 1 week ago
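A rough count shows why a cross-validated search over C and gamma feels slow: every grid point is trained once per fold. The sketch below assumes exponentially spaced grids (C from 2^-5 to 2^15, gamma from 2^-15 to 2^3, exponent step 2), which is my recollection of the coarse grid suggested in the libsvm practical guide, so treat the exact ranges as an assumption. The `cv_score` callback is hypothetical and stands in for a full cross-validation run.

```python
def exponent_grid(start, stop, step):
    """Return [2**start, 2**(start + step), ..., 2**stop]."""
    return [2.0 ** e for e in range(start, stop + 1, step)]


# Assumed coarse grids; the exact ranges are an assumption, not from the thread.
C_GRID = exponent_grid(-5, 15, 2)      # 11 values
GAMMA_GRID = exponent_grid(-15, 3, 2)  # 10 values


def grid_search_fits(c_values, gamma_values, n_folds):
    """Total number of SVM trainings for a cross-validated grid search."""
    return len(c_values) * len(gamma_values) * n_folds


def grid_search(c_values, gamma_values, cv_score):
    """Return (best_score, best_c, best_gamma).

    cv_score(c, gamma) is a hypothetical callback returning the mean
    cross-validation score for one parameter pair (higher is better).
    """
    return max((cv_score(c, g), c, g) for c in c_values for g in gamma_values)
```

Under these assumptions, `grid_search_fits(C_GRID, GAMMA_GRID, 5)` is 550 separate SVM trainings for 5-fold cross-validation, each on 80% of the data.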
These lectures are excellent. I appreciate the clarity of his concepts.
rahulsevakula31 6 months ago
SHATTERED! Pretty brutal math term...
Veered207952 6 months ago
|>
| << flag. I'm here
szproxy 7 months ago
I love how everyone (in the comments) disappeared after the support vector lecture :)
VancouverData 8 months ago 2
@VancouverData yeah, fair-weather machine learning fans!
handdancin 7 months ago in playlist Course | Machine Learning
@VancouverData: Your statement is probably approximately correct... But yeah, I'm sure many didn't realize how hard the class is.
MegaCrazyTaxiDriver 6 months ago
excellent work!
1888junkteam 2 years ago