15:10
hylandsjgcn 16 hours ago
Kindly check the link. I can't view the video; for the past two days it has said that some error has occurred.
rahulsanal1 1 month ago
should add trace to x'x to make it equal to \sum X_{ii}
wpeng001 1 month ago in playlist Course | Machine Learning
I know matrices, but I was lost once he started talking about proving it.
ByThe1Way 1 month ago
The calculus part is easy enough, but he skipped most of the linear algebra details, which are a big part of his math. So... you can go to the MIT website for another FREE education from Prof. Strang, and the final equation will become very clear.
harrycook111 3 months ago
watch after 9:00
shrit110 3 months ago
(see prev comment 1st) I often find these leaps in the math used by engineers to "prove" things work. I don't think it's just my lack of background either - he shows the operations he's using, then expects you to take it as read that the math really does work, and expects you to be able to follow right away. You don't usually need to understand the proofs to pass the course though, or to use the results. It's just not as satisfying.
robertxxx74 3 months ago
That proof at the end (from about 1hr - 1hr 15min) is hard to follow. If you download the lecture notes from the site, it's written down in very few lines. I had to really work to put it all together. It's taken me about an hour to understand it, and even then I'm not fully satisfied that I have proven it to myself. He makes many leaps that are not obvious to me.
robertxxx74 3 months ago
This lecture is interesting, but I don't get the math at all. What kind of math is this, and what should I read in order to understand it? (My background is linguistics, so I have no math training.) Is it enough to learn calculus? And is it possible to really learn math at an older age, or is it like playing the violin, where you have to start at a very young age in order to become professional?
astroboomboy 3 months ago
@astroboomboy that is linear algebra, you could check this out on Khan Academy
ramoncaldeira22 3 months ago
@astroboomboy On the course website (google it) it says you need linear algebra and probability theory; more precisely, it says you need basic linear algebra and probability and a little programming experience.
tessb 3 months ago
@astroboomboy Mainly calculus and linear algebra. You may pick up the two in 2-3 months if you're intent on learning, as they're usually freshman-level courses and have no prerequisites themselves (you may also learn them concurrently, as they are independent at the basic level and will intertwine easily as necessary). For calculus, I recommend the James Stewart textbook; for linear algebra, I recommend the text by Otto Bretscher. Both are illustrative, thorough and easy to follow.
turkiym2 1 month ago
The question at 42:32 made him sad...
a1a2a3skurr1l 4 months ago in playlist Course | Machine Learning
@a1a2a3skurr1l The asker was clearly sleeping off his calculus classes.
turkiym2 1 month ago
A couple of lectures in, it's surprisingly easy to get your head around this shit. Guess it all gets very tricky and intricate soon after, though.
chvan2335 4 months ago
In my linear systems course we used the same normal equation to estimate the parameters of a discrete model of a continuous system.
The thing is, it can be derived in a much simpler way than the one shown in the lecture (without the use of traces, let alone the algebra of traces). :)
So besides that, great lecture and certainly motivating.
Jacob011 4 months ago
Hmm... so ALVINN looks at the road ahead and records the steering direction. So what if the road ahead is a curve, but since I'm on a straight patch for the moment my steering direction is still straight? Seeing where the cam was placed and that there was no bonnet in the pictures, it must have been calculated for a few metres ahead. Does that affect anything? In the video it seems like ALVINN's response is about 0.5 seconds behind a typical human response, especially in the live tests.
kiriappeee 4 months ago
This is a very good intro course to AI! It's unfortunately easier to say from the outside that this is "something awesome" than if you had to take that class at Stanford in that classroom, and your work was graded in THAT class.
jazzrockr 5 months ago
This is soo much faster than anything we have here at Auckland University. You can definitely tell this is an Ivy League course.
axeld93 5 months ago
Please take it easy on the UM's!
muhammadshafei 5 months ago in playlist Course | Machine Learning
This lecture is a great one. I would like to apply to Stanford next year. It is really a great place of learning.
icommand 7 months ago
I was completely ecstatic to learn that Stanford offered an entire introductory machine learning course online for free. And by completely ecstatic, I mean that, literally, I was jumping up and down in my chair. I can't wait to work through the exercises over the course of this summer!! Thank you so much Professor Ng, and thank you so much Stanford for affording me and others such an amazing opportunity!
jcf139er 7 months ago in playlist Course | Machine Learning 4
agree with caesiume. this type of lecture is great. both free and good.
abramswee 7 months ago
I love how useless the questions are in the grand scheme of things :P
emok31 8 months ago in playlist Course | Machine Learning
thank you very muuucccchhhhh :-)
pmsutube 8 months ago
nice informative lecture ... thanks for uploading
gorilah1 8 months ago 2
"Umm"
Darvon 8 months ago 2
Lecture 2 is done Sir (1:13 am).
See u 2morrow on lecture 3.
Thank you Professor. Thank you Stanford.
armanrainy 9 months ago
The lecturer should be from Beijing, China based on my brain pattern recognition. :D
ouoh1 9 months ago
@ouoh1 nope, he was born in England
hvutrong 9 months ago
@hvutrong Oh, my brain pattern recognition failed.
ouoh1 9 months ago
Teaching abilities a bit on the weak side..
ozkansafak 9 months ago
this lecture makes more sense when you hit the 1911 button
jsymons1985 10 months ago
learning a whole new concept easily in one hour is fantabulous.......thanx...
rakeshprab1 11 months ago
My popcorn and pepsi are ready! Bring it on!
AnanyaVilas 1 year ago 7
@AnanyaVilas
My coffee and cigarettes too :D
amanteo 10 months ago
See 3.24 for a nice movie on one of the first "DARPA" race-like challenges.
MrQuincle 1 year ago
Really helpful! Andrew makes everything look so easy :)
Those who have trouble understanding, should really blame their lame colleges.
Cryto 1 year ago
I think Andrew missed the point of the last question, namely that (X^T X)^(-1) X^T indeed is a formula for the pseudoinverse of X (when m > n and X^T X is invertible).
vinkhe 1 year ago
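A short check of the pseudoinverse remark, as a sketch: assuming X is a real m-by-n matrix with X^T X invertible, the matrix X^+ = (X^T X)^{-1} X^T satisfies the four Moore-Penrose conditions. The symbol X^+ is notation introduced here, not taken from the lecture.

```latex
\begin{align*}
X^{+} &:= (X^{T}X)^{-1}X^{T} \\
X^{+}X &= (X^{T}X)^{-1}X^{T}X = I_{n}
  && \text{so } X^{+} \text{ is a left inverse of } X \\
X X^{+} X &= X\,I_{n} = X \\
X^{+} X X^{+} &= I_{n}\,X^{+} = X^{+} \\
(X^{+}X)^{T} &= I_{n}^{T} = X^{+}X \\
(X X^{+})^{T} &= X\big((X^{T}X)^{-1}\big)^{T}X^{T} = X(X^{T}X)^{-1}X^{T} = X X^{+}
  && \text{since } X^{T}X \text{ is symmetric}
\end{align*}
```

With this notation the least-squares solution reads theta = X^+ y.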
1:04:05 Should the y be y^(i) instead, with (i) as the superscript?
phamnamlong 1 year ago
@phamnamlong
Yes, I think so (but I'm no expert).
davethemovie 1 year ago
Uhm. It's really not necessary to spend that much time on arriving at the normal equations... Basic linear algebra will get you there very quickly. Just reason about the subspaces of the matrix X (e.g. the error vector should be perpendicular to the column space of X => the error is in the left nullspace of X => done).
Otherwise great lecture.
MikaelUmaN 1 year ago
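The subspace argument above can be written out in a few lines. This is a sketch under the usual assumptions (X is the real design matrix, y the target vector, and X^T X is invertible); the symbols e and \hat{\theta} are introduced here for illustration.

```latex
\begin{align*}
e &= y - X\hat{\theta}
  && \text{error of the least-squares fit} \\
X^{T}e &= 0
  && e \perp \text{every column of } X \text{, i.e. } e \text{ lies in the left nullspace of } X \\
X^{T}\,(y - X\hat{\theta}) &= 0 \\
X^{T}X\,\hat{\theta} &= X^{T}y
  && \text{the normal equations} \\
\hat{\theta} &= (X^{T}X)^{-1}X^{T}y
\end{align*}
```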
@MikaelUmaN
I also think that the conventional way to do OLS regression is simpler than using a gradient descent method. But perhaps the point here is the gradient method, not regression (so that they can use this method later for more complex models).
qiuxiaoqiu 11 months ago
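To compare the two routes mentioned here, the following is a minimal Python sketch that fits a line to toy data twice: once by solving the normal equations directly, once by batch gradient descent. The data, the function names `closed_form` and `batch_gradient_descent`, and the settings alpha = 0.05 and 2000 steps are all assumptions chosen for illustration, not values from the lecture.

```python
def closed_form(xs, ys):
    # Solve the 2x2 normal equations (X^T X) theta = X^T y by Cramer's rule,
    # where each row of X is [1, x].
    m = len(xs)
    sx = sum(xs)
    sxx = sum(x * x for x in xs)
    sy = sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    det = m * sxx - sx * sx
    theta0 = (sy * sxx - sx * sxy) / det
    theta1 = (m * sxy - sx * sy) / det
    return theta0, theta1


def batch_gradient_descent(xs, ys, alpha=0.05, steps=2000):
    # Minimize J(theta) = 1/2 * sum((theta0 + theta1 * x - y)^2)
    # using the full data set for every update.
    theta0, theta1 = 0.0, 0.0
    for _ in range(steps):
        errors = [theta0 + theta1 * x - y for x, y in zip(xs, ys)]
        g0 = sum(errors)                              # dJ/dtheta0
        g1 = sum(e * x for e, x in zip(errors, xs))   # dJ/dtheta1
        theta0 -= alpha * g0
        theta1 -= alpha * g1
    return theta0, theta1


if __name__ == "__main__":
    xs = [0.0, 1.0, 2.0, 3.0]
    ys = [1.0, 3.0, 5.0, 7.0]  # toy data lying on the line y = 1 + 2x
    print("normal equations:", closed_form(xs, ys))
    print("gradient descent:", batch_gradient_descent(xs, ys))
```

On this toy problem both routes should agree closely; the closed form is shorter, while the iterative version only needs the gradient of J, which is what carries over to models with no closed-form solution.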
The assumption that someone with an analytic understanding of learning would make a good teacher has yielded some disappointment.
daftrhetoric 1 year ago
Can someone please explain the algebra fact [Gradient_A tr ABA^C = CAB + C^AB^], where ^ denotes the transpose operator? Thanks in advance.
killingenola 1 year ago
Why does it need to be so drab and boring? Bad lecturer. Make it interesting at least in the first lecture!!!
ceokevin 1 year ago
For batch and stochastic gradient descent, is alpha (learning rate) usually the same size?
taketaxisky 1 year ago
he is young for a professor
matharoofmaths 1 year ago
@matharoofmaths; Yes ... and that's why he makes so many mistakes in these lectures and has a hard time answering his students' questions (and occasionally evades student questions) in later lectures ... but if his research papers are any indication, he will definitely be an outstanding teacher in the future.
All criticism aside, this is much better than what we had before - nothing. Thank you Dr. Ng and Stanford for letting us in. This is making Machine Learning that much more accessible.
joshuaburkholder 1 year ago
Wonderful lectures. I was a Stanford undergrad in 1969-1973 and this is so much better than what we had to deal with before. Not so much that the teachers are better (though Andrew Ng is great), but that the pedagogical tools are so advanced.
liuedison 1 year ago 3
The best thing about video learning is the ability to rewind the course.
Juefawn 1 year ago 6
It's just like studying at Stanford! Although I am not there physically, it really lets me gain more knowledge of machine learning than I would from my university alone. And he is a really good lecturer!
Thank you to the people who proposed this to Stanford University and uploaded it!
wizztjh 1 year ago
Could someone explain how to get Gradient_A tr ABA^TC = CAB + C^TAB^T? I can't see how you can get an addition on the right-hand side, at least not from within the rules he described in the lecture. Could one use the chain rule for differentiation?
juludd 1 year ago
@juludd It works like this: there are two A's in ABA^TC, so when we take the gradient we get a sum of two terms: Gradient_A tr ABA^TC = (BA^TC)^T + (gradient of tr CABA^T with respect to the A^T factor, transposed). Continuing, = C^TAB^T + (Gradient_{A^T} tr CABA^T)^T = C^TAB^T + ((CAB)^T)^T = C^TAB^T + CAB.
redearth1861 1 year ago
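For anyone who wants the steps spelled out, here is one way to reach the identity with differentials, as a sketch: it assumes A, B, C are real matrices of compatible sizes and uses the convention that df = tr(G^T dA) means the gradient with respect to A is G. The symbol f is introduced here for illustration.

```latex
\begin{align*}
f(A) &= \operatorname{tr}\!\left(A B A^{T} C\right) \\
df &= \operatorname{tr}\!\left(dA\, B A^{T} C\right)
    + \operatorname{tr}\!\left(A B\, dA^{T}\, C\right)
  && \text{product rule, one term per occurrence of } A \\
\operatorname{tr}\!\left(dA\, B A^{T} C\right)
  &= \operatorname{tr}\!\left(B A^{T} C\, dA\right)
   = \operatorname{tr}\!\left((C^{T} A B^{T})^{T}\, dA\right)
  && \text{cyclic property of the trace} \\
\operatorname{tr}\!\left(A B\, dA^{T}\, C\right)
  &= \operatorname{tr}\!\left(C A B\, dA^{T}\right)
   = \operatorname{tr}\!\left((C A B)^{T}\, dA\right)
  && \text{using } \operatorname{tr}(M) = \operatorname{tr}(M^{T}) \\
\nabla_{A} f &= C A B + C^{T} A B^{T}
\end{align*}
```

The sum on the right-hand side comes from A appearing twice in the product.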
I am so interested in machine learning. I can program in Java and I have the MATLAB student version. What previous math would I need to better understand the tutorial?
Cheers.
Fusionicon 1 year ago
@Fusionicon Basic Calculus. Other than the weird stats stuff he brings into play when formulating the error function ("J"), you don't need anything else, so long as you really pay close attention.
Gaiacarra 1 year ago
excellent work!
1888junkteam 2 years ago
I'm getting a Stanford education for free B) awesome
caesiume 2 years ago 71
@caesiume how?
kapildalwani 1 year ago
@caesiume The true Stanford education happens on campus, especially for PhD research. I mean, you have three to five visitors every week presenting some of the best research work ever made. I regret that I did not get my PhD from Stanford. If you are a PhD student, just move to MIT or Stanford; the rest is junk.
abuhajara 1 year ago
These videos are brilliant!! Andrew is super cool at teaching, thanks Stanford!!
praneeta133 2 years ago 6
Awesome lectures. I realise how good Stanford and its professors are. Thanks a lot for believing in open education. May the force be with you.
kanobi14 2 years ago 34
What on earth is this...?!
munmuwang 2 years ago
It's safe to skip the first 9:30 min ...
harkes12 2 years ago
@harkes12
Actually, I found the intelligent vehicle quite interesting...
Mootsterdotcom 1 year ago
thank you very much
vinhbt123456 2 years ago 4
Thank you stanford ...really great work ...The lectures are great
hnomier 2 years ago 8
He's really good at teaching.
Foolean 2 years ago 12
He's not just good... he is excellent.
outfile 2 years ago 5
Thank you Stanford
realmadridvideos 2 years ago 9
Stanford. Thanks for posting these lectures! Big thank you!
arran5498 2 years ago 90
Really nice. Well taught. I am really enjoying listening to these lectures. A true service to public.
mayaahmed 2 years ago 6
Where are the lecture notes posted?
stanza2200 2 years ago
stanza2200, look in the video description.
ninjakannon 2 years ago
good
bidexue 2 years ago
The gradient of f(x), with x a vector, gives the direction of maximum rate of increase. Moving along its negative is what is called gradient descent.
tcyue 3 years ago
Around time = 43:00, Dr. Ng again gave the wrong description of the gradient.
Example: Let f(x,y) = x^2 + y^2. Hence, the gradient is ( 2x, 2y ). At the point (1,1), the gradient is (2,2). Since the only local minimum of f(x,y) is at (0,0) and since (1,1)+(2,2)=(3,3), then the gradient at (1,1) points away from the only local minimum of f(x,y); therefore, the gradient does not point toward the direction of steepest descent. The gradient points in the direction of steepest ASCENT.
joshuaburkholder 3 years ago 4
What you are saying is right, but it depends on your update rule as well. The update rule for descent has a negative sign built in:
theta_{i+1} = theta_i - alpha * gradient
so your example above would be
x_{i+1} = x_i - alpha * 2
y_{i+1} = y_i - alpha * 2
where x_i = y_i = 1.
With an alpha of 0.5 your parameters become
x_{i+1} = 0, y_{i+1} = 0.
But yeah, you are correct. He may have had the update rule in mind when he said that.
siddharthbatra 3 years ago 4
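To make the sign convention in this exchange concrete, here is a minimal Python sketch of the update rule theta := theta - alpha * gradient applied to f(x, y) = x^2 + y^2, starting from (1, 1). The function names `grad_f` and `descent_step` are made up for illustration and do not come from the lecture.

```python
def f(x, y):
    return x ** 2 + y ** 2


def grad_f(x, y):
    # Gradient of f; it points in the direction of steepest ASCENT.
    return (2 * x, 2 * y)


def descent_step(x, y, alpha):
    # Subtracting alpha * gradient moves in the direction of steepest descent.
    gx, gy = grad_f(x, y)
    return (x - alpha * gx, y - alpha * gy)


if __name__ == "__main__":
    x, y = 1.0, 1.0
    print("gradient at (1, 1):", grad_f(x, y))
    print("one step with alpha = 0.5:", descent_step(x, y, 0.5))
    # A smaller step size approaches the minimum at (0, 0) gradually.
    for _ in range(50):
        x, y = descent_step(x, y, 0.1)
    print("after 50 steps with alpha = 0.1:", (x, y), "f =", f(x, y))
```

The gradient at (1, 1) is (2, 2), pointing away from the minimum; it is the minus sign in the update that turns it into a descent direction.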
Around time = 28:00, Dr. Ng noted that to go in the direction of steepest descent from a point, ( theta1, theta2, J(theta1, theta2) ), we should go in the direction of the gradient of J at that point; however, this is incorrect. The gradient always points in the direction of steepest ascent, not descent; therefore, the direction of steepest descent from ( theta1, theta2, J(theta1, theta2) ) is opposite of the gradient: -Del( J( theta1, theta2 ) ).
joshuaburkholder 3 years ago 2
Around time = 28:00, Dr. Ng said that if we want to go in the direction of steepest descent from a point J( theta1, theta2 ), then we should go in the direction of the gradient of J( theta1, theta2 ); however, this is incorrect. The gradient always points toward the direction of steepest ascent, not descent; therefore, if we want to go in the direction of steepest descent from a point J( theta1, theta2 ), then we should go in the direction that is opposite of the gradient ... -J( theta1, theta2 ).
joshuaburkholder 3 years ago
Some of this comment didn't come through properly. Please disregard this post. I will repost this comment with corrections.
joshuaburkholder 3 years ago
Yeah, that's why there's a negative sign.
Scutchris 3 years ago
This course is outstanding. Stanford should make more courses like this available online.
tbouloutas 3 years ago 7