We continue our work with sentiment analysis from Lecture 2. I go over common ways of preprocessing text in machine learning: n-grams, stemming, stop words, WordNet, and part-of-speech tagging. In part 2, I introduce a common approach to k-nearest neighbor classification with text (it is very similar to the vector space model with tf-idf encoding and cosine distance).
Code and other helpful links:
http://karpathy.ca/mlsite/lecture3.php
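For readers who want a feel for the part 2 idea before watching, here is a minimal sketch of k-nearest neighbor text classification over tf-idf vectors compared by cosine similarity. It is not the lecture's code: the function names, the toy reviews, the log(1 + count) term weighting, and the log(N / df) idf form are all assumptions chosen for illustration, and the lecture may define these terms differently.

```python
import math
from collections import Counter


def tokenize(text):
    # Naive whitespace tokenizer (assumption: no stemming or stop-word removal).
    return text.lower().split()


def vectorize(tokens, idf):
    # Sparse tf-idf vector as a dict. log(1 + count) squashes raw counts so
    # that a word seen 10 times does not weigh 10x a word seen once.
    counts = Counter(tokens)
    return {w: math.log(1 + c) * idf[w] for w, c in counts.items() if w in idf}


def build_index(docs):
    # Compute idf = log(N / document frequency) and one vector per document.
    tokenized = [tokenize(d) for d in docs]
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))
    n = len(tokenized)
    idf = {w: math.log(n / df[w]) for w in df}
    return [vectorize(tokens, idf) for tokens in tokenized], idf


def cosine_similarity(a, b):
    dot = sum(v * b.get(w, 0.0) for w, v in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def knn_predict(query, docs, labels, k=3):
    # Rank training documents by similarity to the query, then majority-vote
    # among the k most similar ones.
    vectors, idf = build_index(docs)
    q = vectorize(tokenize(query), idf)
    ranked = sorted(
        zip(vectors, labels),
        key=lambda pair: cosine_similarity(q, pair[0]),
        reverse=True,
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Toy training set (made up for this sketch).
docs = [
    "great movie loved it",
    "wonderful acting great plot",
    "terrible movie hated it",
    "awful plot boring acting",
]
labels = ["pos", "pos", "neg", "neg"]

print(knn_predict("loved the great acting", docs, labels, k=3))
```

Words that never appear in the training set (such as "the" in the query above) are simply dropped, which is one simple way to handle unseen vocabulary.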
Andrej does an _excellent_ job of motivating us as to why log(x) would be better. I have not come across a better explanation of why log works as a "squasher" function.
Wish he'd make more videos!
oneparagraphanswers 2 months ago
waiting for the next lecture... :)
koolsatan 4 months ago
very nice and simple explanation. I hope for more videos from you, thanks very much for this video
ashamrad 5 months ago
ty. waiting for your SVM video.
chetan528 7 months ago
very nice explanation of tf-idf, thanks
avneetchugh 7 months ago
This is one of the best videos I have seen so far, and it is clear and concise too. Can't wait for your SVM videos. Hope you post them soon. Cheers
persist911 8 months ago
Good pace and easy to understand. Thanks!
kwilliaa 9 months ago