So, here we are, with the claim that the column span of A is the image of P. Suppose, what vector should I consider, some exotic symbol maybe: let ξ belong to the image of P. Then there exists a ζ such that ξ = Pζ. What do I have to show? I am trying to show containment: I started with something in the image of P, and I want to show that it belongs to the column span of A. But that is really almost trivial, because ξ = A(AᵀA)⁻¹Aᵀζ. Focus your attention on the object (AᵀA)⁻¹Aᵀζ and call it by a not-so-exotic name, y; it is a vector of the appropriate size, of course. Then ξ = Ay, and if ξ can be written as Ay, it obviously belongs to the column span of A.

The other side might look a little long-winded, but it is even simpler to check. Suppose now, let us just say v, I am tired of those symbols: let v belong to the column span of A. It means v = Aw for some w. I am not even talking about the sizes; you already know from the size of A, which is m × n, what the sizes of v and w have to be: w is an n-tuple and v is an m-tuple. Now look at Pv. Can you guess the intuition behind this? I already know that a projection maps vectors within its subspace to themselves. So if I can show that anything in the column span is projected back to itself, I will be done; that is the premise behind looking at this object. Indeed, Pv = A(AᵀA)⁻¹Aᵀ(Aw), and this AᵀA gets pulverized by the inverse sitting next to it, so Pv = Aw = v, just as required. Therefore v definitely belongs to the image of P. From the first argument I have the image of P contained inside the column span of A, and from the second the column span of A contained inside the image of P; based on these two observations I might as well conclude that they are equal: U = colspan(A) = im(P).
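Since we are in tangible Euclidean spaces, you can also watch both containments happen numerically. Here is a minimal sketch, assuming NumPy; the matrix A below is just a made-up 4 × 2 example with full column rank, not anything from our experiments.

```python
import numpy as np

# A hypothetical tall matrix with full column rank (m = 4, n = 2).
A = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0]])

# The projection matrix P = A (A^T A)^{-1} A^T.
P = A @ np.linalg.inv(A.T @ A) @ A.T

# Containment 1: anything of the form P @ zeta equals A @ y
# with y = (A^T A)^{-1} A^T zeta, so im(P) sits inside colspan(A).
zeta = np.random.default_rng(0).standard_normal(4)
y = np.linalg.inv(A.T @ A) @ A.T @ zeta
assert np.allclose(P @ zeta, A @ y)

# Containment 2: anything A @ w in the column span is fixed by P,
# i.e. P(Aw) = Aw, so colspan(A) sits inside im(P).
w = np.array([2.0, -1.0])
assert np.allclose(P @ (A @ w), A @ w)
```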
The final thing is the kernel. What is the kernel, in terms of the column span? Every vector that is perpendicular to the columns. How do we represent such a vector? Once we have the matrix with its columns stacked up, we write, let us not call it x, let us call it z: zᵀA = 0, or in other words Aᵀz = 0; that is, z belongs to the kernel of Aᵀ. So what we essentially have to check is that the kernel of Aᵀ is equal to what? The kernel of Aᵀ is exactly the orthogonal complement of the column span of A. Do you agree? This is a pivotal step. What am I saying? I am saying that zᵀA = 0 if and only if z belongs to the kernel of Aᵀ; just take the transpose on both sides and you get Aᵀz = 0. And I am also saying that such a z is perpendicular to the column span of A. By the very definition, zᵀA collects the inner products of z with the individual columns of A, and each of them must be 0 for that row vector to vanish. Every such z belongs to the kernel of Aᵀ; therefore the kernel of Aᵀ must be the orthogonal complement of the column span of A. If this is clear, then verifying this last property of the projection reduces to just checking that ker(P) = ker(Aᵀ).

So how do we go about this? It is actually very straightforward, even though it might appear a little daunting at first glance. Again, let z belong to the kernel of Aᵀ; it means Aᵀz = 0. But it also means that A(AᵀA)⁻¹Aᵀz = 0: if Aᵀz is already 0, pre-multiplying it keeps it 0. This implies z belongs to the kernel of P, because that is, after all, my P. So one part of the inclusion is very straightforward: ker(Aᵀ) is contained inside ker(P).

The other part might appear a little tricky, but only until you observe certain things. Let, what should I call this now, w belong to the kernel of P, which means A(AᵀA)⁻¹Aᵀw = 0. But what do we know about A? Full column rank. Concentrate on the vector (AᵀA)⁻¹Aᵀw and call it p, so that Ap = 0. When is that possible? Since A has full column rank, it has a trivial kernel, so this is only possible if p = 0. And if p = 0, then, calling q = Aᵀw, we have (AᵀA)⁻¹q = 0. But (AᵀA)⁻¹ is an invertible matrix, of course, it is written in terms of an inverse, so this obviously leads to the conclusion that q = 0. And what is q? q is nothing but Aᵀw. So even though this projection matrix came as a result of several pre-multiplications, because of the full column rank of A and the invertibility of the corresponding matrices, only those vectors which belong to the kernel of Aᵀ manage to be in the kernel of the projection matrix, and nothing else.

These kinds of tools, I am pretty sure, you will find handy in several other arguments, not just in this context. When you have multiple matrices multiplying one another, of course, if the right-most factor's kernel contains something, then that belongs to the kernel of the overall product. But when is that the only sort of vector contained in it? If the factors to the left all have full column rank, then at no point can you split the product up and say, oh hang on, there is some non-zero vector for which this can vanish; it cannot, because they all have trivial kernels. So if you keep multiplying full-column-rank matrices one after another, and the last factor is perhaps short of full column rank, then it is only the kernel of that last factor that is the kernel of the overall matrix: a very general observation, just from this. Which means that w belongs to the kernel of Aᵀ. So if something is in the kernel of Aᵀ it is in the kernel of P, and if something is in the kernel of P it must be in the kernel of Aᵀ; even this property is verified.
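Again, purely as a sanity check and not a proof, here is a small sketch of this kernel property, assuming NumPy; the example matrix is the same hypothetical 4 × 2 one, and the kernel of Aᵀ is extracted via the singular value decomposition, a tool we will only meet properly later.

```python
import numpy as np

A = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0]])
P = A @ np.linalg.inv(A.T @ A) @ A.T

# Kernel of A^T: the rows of Vt beyond the rank span ker(A^T).
# A^T is 2 x 4 with rank 2, so its kernel inside R^4 is 2-dimensional.
_, s, Vt = np.linalg.svd(A.T)
kernel_basis = Vt[len(s):]

# ker(A^T) inside ker(P): P annihilates every kernel basis vector.
for z in kernel_basis:
    assert np.allclose(P @ z, 0.0)

# The dimensions match as well: rank(P) = rank(A) = 2, so
# dim ker(P) = 4 - 2 = dim ker(A^T), and the inclusion is an equality.
assert np.linalg.matrix_rank(P) == np.linalg.matrix_rank(A)
```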
What about the orthogonality, something we have not spoken about? We call this a projection, but is it really doing what we wanted it to do, that is, is it really cooking up this b̂ in a clever way, so that b − b̂ is indeed in the orthogonal complement of the column span? We will just go ahead and take that inner product to check. How can you represent an object that lives inside the column span of A? As some Av. So we compute ⟨b − b̂, Av⟩; recall that b̂ = A(AᵀA)⁻¹Aᵀb. This is vᵀAᵀb − vᵀAᵀA(AᵀA)⁻¹Aᵀb, and once again the AᵀA and (AᵀA)⁻¹ lead to the identity, so you have vᵀAᵀb − vᵀAᵀb, which is 0, as required. So you could have come at it from this approach as well, to show that this is indeed the best approximation.
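If you would rather see this cancellation happen in floating point than on the board, here is a minimal sketch, assuming NumPy; b is a hypothetical right-hand side chosen to lie outside the column span.

```python
import numpy as np

A = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0]])
b = np.array([1.0, 2.0, 2.0, 4.0])   # hypothetical b, not in colspan(A)

# Best approximation b_hat = A (A^T A)^{-1} A^T b.
b_hat = A @ np.linalg.inv(A.T @ A) @ (A.T @ b)

# Orthogonality: <b - b_hat, A v> = v^T A^T (b - b_hat) for every v,
# so it suffices that A^T (b - b_hat) is the zero vector.
assert np.allclose(A.T @ (b - b_hat), 0.0)
```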
But we also wanted to reveal to you that this matrix, which you might have come across earlier in some preliminary courses on linear algebra when you talk about pseudo-inverses and such, is also an example of a projection map. The projection map is, of course, the generalization of this notion over abstract vector spaces; but when you are talking about tangible Euclidean spaces, the projection map becomes a matrix of this form.

Before we conclude, because this really puts the seal on the first part of our course, the solution of Ax = b, I am just going to quickly summarize the possibilities you might encounter. One: Ax = b is a square system. In that case, what do you do? If A is invertible, just go ahead and invert it, something you have been doing since your plus-two days or even before. Next: if you have, as in this case, more experiments than unknowns to be determined, you have a tall matrix, an over-determined system. If it has full column rank, this is great. Suppose you do not have full column rank, do you stop? Maybe it is an ill-designed experiment, there is some dependence among the parameters. What do you do? Well, you can always go back to the physics and rectify it, and get a tall matrix with full column rank, that is one way; but let us say what is done is done. What do you do then? The idea should be very obvious now. You still go ahead and look for a basis for the column span. Maybe all the columns are not linearly independent, but the column span is a vector space after all, a finite-dimensional vector space sitting inside Rᵐ, so it definitely has a basis. Suppose the rank of the matrix is r; then you pluck out r linearly independent columns, and that gives you a basis for the column span. What is the next idea? Once you have a basis for the column span, orthogonalize it, using Gram-Schmidt. I am talking about a computationally efficient technique here; if I were to work it all out as an example on the board, it would take maybe a full one-hour lecture.

But the idea is simply this: you do not stop. You take the r linearly independent columns that span the column span. If you have any basis in an inner product space, we have seen that you can run Gram-Schmidt orthogonalization to get an orthogonal basis corresponding to it, and once you have an orthonormal basis it is beautiful, because now you just take this vector b, take its inner products with the orthonormal set, and that gives you the best possible projection, the best approximation. That is it. There is also a very beautiful way of doing this which you will see later, maybe not in this course, I will see if we have time, called the singular value decomposition, which gives you these orthonormal bases for the kernels and the images. And no, it does not matter which columns you choose: out of those n columns you can pluck out any r linearly independent ones, and you will still live inside the same space, because there are no more than r linearly independent columns; take your pick, whichever r you want. You may not be able to choose all possible n-choose-r subsets, because some of those will be trivial choices, if you have repeated columns, for instance; but whether you choose this one or that one in your set matters not. So that is the idea: pluck out the vectors that span the column span, do the Gram-Schmidt procedure to orthogonalize them, and then normalize them, dividing each by its norm. Once you have orthonormalized them, just go ahead and take the inner product of b, the vector on the right-hand side, with each of these basis vectors. No matter what basis you choose, or what basis your friend chooses, you will still come up with the same answer, because the best approximation, you see, is unique.

So all this is great: even if the matrix is not of full column rank, we have outlined what you can do. And for a square matrix: invertible, great; not invertible, what can you do? Again, the same idea, best approximation. If the rank of a square n × n matrix is r, you still have r of those columns that are linearly independent; again, go ahead and pluck out those r columns, it is nothing different. Then, based on those r columns that you picked out, you can do the Gram-Schmidt orthogonalization and take the inner products with the vector on the right-hand side, and that is the best possible thing you can do. You are asked to match this vector with the vectors on that table; you obviously cannot match it exactly, so the best you can do is take an orthonormal basis, take the inner products, and settle for that. A sketch of this recipe in code follows below.
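Here is one way that recipe might look, a sketch assuming NumPy; the matrix is a made-up rank-deficient example whose third column repeats the second, and QR factorization is used to do the Gram-Schmidt orthonormalization for us.

```python
import numpy as np

# A hypothetical tall matrix WITHOUT full column rank:
# the third column repeats the second.
A = np.array([[1.0, 0.0, 0.0],
              [1.0, 1.0, 1.0],
              [1.0, 2.0, 2.0],
              [1.0, 3.0, 3.0]])
b = np.array([0.0, 1.0, 1.0, 3.0])

r = np.linalg.matrix_rank(A)   # rank is 2 here
# Pluck out r linearly independent columns; here the first two happen to work.
A_indep = A[:, :r]

# QR factorization performs the Gram-Schmidt orthonormalization for us:
# the columns of Q are an orthonormal basis for colspan(A).
Q, _ = np.linalg.qr(A_indep)

# Best approximation: the sum over the basis of <b, q_i> q_i.
b_hat = Q @ (Q.T @ b)

# Any other choice of r independent columns yields the same b_hat,
# because the best approximation is unique.
print(b_hat)
```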
Now suppose you have a fat matrix, a wide matrix, maybe with full row rank. We know that in such cases solutions exist, but multiple solutions may exist, so the problem becomes a little more open-ended, depending on the application (a small numerical sketch of this non-uniqueness appears at the end of this discussion). Say the matrix is m × n with n greater than m: you have n parameters, and you performed m experiments, fewer, perhaps far fewer, than the n parameters. You might then want to say: oh, hang on, these are my unknown coefficients x₁ through xₙ, and I know from the physics that certain parameters are very small, negligible, close to 0, and so on. So you have to use your domain-specific knowledge there; that is why, even if this is applied linear algebra, I cannot go into every possible domain, it is domain-specific knowledge at work. You have to be smart about which solution you choose. For a full-row-rank wide matrix the solution is obviously going to be non-unique, but some solution is always going to exist, because the column span is then the entirety of Rᵐ. Any vector in Rᵐ that you choose, like this b vector, can always be matched; the problem is the non-uniqueness. Out of these infinitely many possible choices, you can add any constant times a vector in the kernel, and that will also be a solution, because there will always be free variables, remember, when you have a wide matrix. So then you have to optimize over that freedom.

If, on the other hand, you have really rotten luck and end up with a wide matrix which is not even of full row rank, I would say it is a very badly designed experiment and probably you should go back to the drawing board; but let us say you end up with it, then what do you do? Just because it is a wide matrix does not mean a solution is guaranteed, unless it has full row rank. So a full-row-rank wide matrix gives multiple solutions, that is one sort of problem; not full row rank, and we are back in the same domain as before. The rank is short of the number of rows, so the column span is not going to be the entirety of Rᵐ, there being m rows; it is some subspace of Rᵐ. So again you go back and try to approximate, and get the best possible approximation.

And in case this b vector is already there in the column span, despite the column span being deficient, then what do you do? It essentially means that if you go to the row-reduced echelon form, some of those equations devolve to 0 = 0; just get rid of those equations and you have full row rank again. See what I am saying: if you have a wide matrix with less than full row rank, but the b vector sitting on the right-hand side is still in the column span, then carry out the row-reduction operations, go to the RREF, and you will see some zero rows on the left-hand side and zero entries on the right-hand side exactly matching them. Those equations are immaterial anyway, and you now have an r × n system of equations, where r is the rank, so you are again in the domain of full row rank.

The reason why I lingered a bit on this discussion is this: I hope that with it, if you encounter any kind of situation with real or complex matrices, because those are exactly the kinds of things we call inner product spaces, the Rⁿ's and Cⁿ's, any real or complex problem of the form Ax = b that you are asked to solve, then, because you are living inside an inner product space, you should not just say there is a solution, there is a unique solution, there are multiple solutions, there is no solution, and that is it, I am done. If the application demands that you go ahead and push your luck and see what the best possible solution is, you should know what the best possible solution is.
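To make that non-uniqueness concrete, here is a small sketch, again assuming NumPy, with a hypothetical full-row-rank wide system: the pseudoinverse picks out one particular solution, the minimum-norm one, and adding any multiple of a kernel vector produces another perfectly valid solution.

```python
import numpy as np

# A hypothetical wide matrix with full row rank: m = 2 equations, n = 3 unknowns.
A = np.array([[1.0, 2.0, 0.0],
              [0.0, 1.0, 1.0]])
b = np.array([3.0, 2.0])

# One particular solution: the minimum-norm solution, via the pseudoinverse.
x0 = np.linalg.pinv(A) @ b
assert np.allclose(A @ x0, b)

# A vector in ker(A); free variables guarantee the kernel is non-trivial.
_, s, Vt = np.linalg.svd(A)
z = Vt[len(s):][0]             # basis vector of the 1-dimensional kernel
assert np.allclose(A @ z, 0.0)

# Adding any multiple of z gives another solution: the non-uniqueness in action.
assert np.allclose(A @ (x0 + 7.0 * z), b)
```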
So, that is the idea. With that, we bring to a close our discussions on Ax = b. A lot of the discussions we have had, these hyperplanes, these affine sets, are all predicated essentially on this understanding of Ax = b. Remember, we spoke about hyperplanes right at the beginning: each of those equations is like a hyperplane, a hyperplane with a bias. And what do you think that is? It is like a coset, like those cosets we discussed when we built the quotient space; each of those equations is like an object in the quotient space. It is in this light that you should see how the first isomorphism theorem also reflects in the rank-nullity theorem when you have finite dimensions, where those maps can be represented as matrices. In the next lecture we will probably discuss a few more applications of inner products before we close this topic entirely and make a paradigm shift into a completely different topic, eigenvalues and eigenvectors, and we will motivate the importance of that problem. Thank you.