I have not yet given a real-life example for this least-squares solution, so let me give one. The model comes from physics: Hooke's law, which states that the displacement of a spring is directly proportional to the force applied to it; a linear relationship. If L is the length the spring moves and W is the weight applied, then Hooke's law says that L and W are related by L = α1 W + α2 for constants α1, α2.

The relationship is known; the unknowns are α1 and α2. To determine them you do experiments in the lab: actually attach weights to the spring, measure the displacements, substitute into the formula, and determine α1, α2 for that spring. But what usually happens in lab experiments is that we make errors, so to compensate we do more experiments and then try to determine the relationship. With more experiments you typically get an overdetermined system of equations, meaning more equations than unknowns, and overdetermined systems are typically inconsistent. That is where the least-squares solution comes in.

Suppose the experiment has been done four times (there are two unknowns) and the measurements are tabulated, weight against displacement, in whatever units (grams and inches, say):

weight W:        1    3    4    6
displacement L:  2    6.5  8    11

From this table, for this particular spring, we must determine the equation. Substituting each pair into L = α1 W + α2 gives the following four equations in the two unknowns α1, α2:

2   = 1·α1 + α2
6.5 = 3·α1 + α2
8   = 4·α1 + α2
11  = 6·α1 + α2

You can verify that this system is inconsistent. For example, take equations 1 and 3 and subtract one from the other: α2 is eliminated, 3α1 = 6, so α1 = 2 and then α2 = 0. But α1 = 2, α2 = 0 does not satisfy the fourth equation (6·2 + 0 = 12, not 11), so the system is inconsistent. You can also verify this by the rank condition: the rank of A must equal the rank of the augmented matrix [A | b], and here these ranks differ.

But we need to solve this: you cannot discard any equation, since each is as important as the others, all the experiments having been done under the same conditions in the laboratory. They have to be taken as they are, yet they cannot be solved exactly. So what one does is look for the least-squares solution: the x that minimizes ‖Ax - b‖, where the norm is the 2-norm. I am going to leave the rest of the calculation for you to complete; it is Ax = b pre-multiplied by Aᵀ. Can you tell me, just by inspection, whether the least-squares solution is unique? We gave a condition last time: the columns of A are obviously linearly independent, so AᵀA is invertible and there is a unique least-squares solution for this problem. Please complete the rest.

Let us now get back to this problem of best approximation and look at the notion a little more deeply, as it deserves. But before that, let us look at the concept of orthogonality in a little more detail. Let V be an inner product space; this is my framework. Let S be a subset of V, not necessarily a subspace, just a subset. The orthogonal complement of S in V, denoted S⊥, is defined by

S⊥ = { x in V : <x, s> = 0 for all s in S }.

I collect all vectors that are orthogonal to each vector in S; that is S⊥. Why the name is "orthogonal" is clear; why it is the orthogonal "complement" is not yet clear, but it should be by the end of today's lecture. Whatever S is, S⊥ is always a subspace. The reason: the second argument of the inner product comes from S, so you can think of S⊥ as the null space of a linear map. I am going to leave the proof as an exercise. For one thing, 0 is orthogonal to any set of vectors, so 0 belongs to S⊥, and a subspace must at least contain the zero vector; you can in fact show that S⊥ is a subspace.

Let us look at two extreme examples. What is V⊥? The set of all vectors orthogonal to every vector in V: the singleton {0}, which is indeed a subspace. And what is
{0}⊥? It is all of V. Taking the orthogonal complement is a kind of complementation, and intuitively, when we take the double complement we feel we should get back the same set; that does happen in finite-dimensional spaces. So those are the two extreme examples.

This notion is important in the following context: if V is an inner product space and W a finite-dimensional subspace, then given any x in V there is a unique vector u in W such that x - u is perpendicular to W. So one needs to understand W⊥. In our case W is a subspace, though in general one can define S⊥ in this manner for any subset.

So let us go back to the best-approximation problem to make the following definition. Let V be an inner product space and W a finite-dimensional subspace. Fix x in V. We know there is a unique u in W such that x - u is perpendicular to W; associated with this x is that u. So let us define a mapping, which I will call E: E is a mapping on V defined by E(x) = u.
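Before developing E further, let me come back for a moment to the least-squares exercise from the Hooke's-law example. A minimal sketch of the computation in plain Python (my own illustration; the lecture leaves the calculation as an exercise): form the normal equations AᵀAx = Aᵀb, which for this data reduce to a 2x2 system, and solve it by Cramer's rule.

```python
# Data from the table: weights W and displacements L
W = [1, 3, 4, 6]
L = [2, 6.5, 8, 11]

# Model L = a1*W + a2, i.e. A x = b with rows (W_i, 1) and b = L.
# The normal equations A^T A x = A^T b reduce to a 2x2 system:
n = len(W)
Sw = sum(W)                              # sum of weights
Sww = sum(w * w for w in W)              # sum of squared weights
Sl = sum(L)                              # sum of displacements
Swl = sum(w * l for w, l in zip(W, L))   # sum of products

# Solve [[Sww, Sw], [Sw, n]] (a1, a2) = (Swl, Sl) by Cramer's rule.
det = Sww * n - Sw * Sw
a1 = (Swl * n - Sw * Sl) / det
a2 = (Sww * Sl - Sw * Swl) / det

print(round(a1, 4), round(a2, 4))  # → 1.7885 0.6154
```

With these values no single equation is satisfied exactly, but ‖Ax - b‖ is as small as possible; this is the x obtained by premultiplying Ax = b by Aᵀ, as in the lecture.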
Here u is the unique best approximation to x from W. Whenever we define a map, in order to define the image of an element x we must know that the image is unique; only then is it a function. The fact that the right-hand side is unique comes from the result we proved earlier: if W is a finite-dimensional subspace, then there is a unique such u. So u is the unique best approximation to x from W. This u is called the projection of x on W, and E is called the orthogonal projection of V on W: for every x there is a unique u, and the map assigning that u to x is E.

For one thing, it is not clear that E is linear; we will prove a little later that it is. We will also establish a relationship between the range of E and the subspace W. But before the proof that E is linear and those other properties, we look at a notion complementary to the orthogonal projection E and prove the following result. I will illustrate all of this by means of a numerical example after proving the theorem.
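The map E can be computed concretely from the formula proved earlier in the course: if u1, ..., un is an orthonormal basis of W, then E(x) = Σ <x, u_j> u_j. A small sketch (the particular subspace and vector here are my own choices for illustration), verifying also that x - E(x) is perpendicular to W:

```python
from math import sqrt

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def scale(c, v):
    return [c * x for x in v]

def sub(a, b):
    return [x - y for x, y in zip(a, b)]

# Orthonormal basis of a 2-dimensional subspace W of R^3
u1 = scale(1 / sqrt(2), [1, 0, 1])
u2 = [0, 1, 0]

def E(x):
    """Projection of x on W: the sum of <x, u_j> u_j over the orthonormal basis."""
    p = [0.0, 0.0, 0.0]
    for u in (u1, u2):
        p = [pi + ci for pi, ci in zip(p, scale(dot(x, u), u))]
    return p

x = [3, 1, 1]
u = E(x)        # best approximation to x from W
r = sub(x, u)   # x - E(x)
print([round(t, 6) for t in u])                         # → [2.0, 1.0, 2.0]
print(abs(dot(r, u1)) < 1e-9, abs(dot(r, u2)) < 1e-9)   # → True True
```

The second line of output checks exactly the defining property: x - E(x) is orthogonal to every basis vector of W, hence to W.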
Theorem. Let V be an inner product space, W a finite-dimensional subspace of V, and E the orthogonal projection of V on W. Then the mapping F : V → V defined by F(x) = x - E(x), that is F = I - E, is the orthogonal projection of V on, what would you expect the subspace to be, W⊥. E is the orthogonal projection of V on W; I - E, we will show, is the orthogonal projection of V on W⊥. Remember the name "orthogonal projection" comes from projecting the vector x orthogonally onto the subspace. These maps are in fact onto; we will prove that, but right now we simply say "of V on W" and "of V on W⊥".

Let us prove this. To show that F is the orthogonal projection of V on W⊥, I must consider the corresponding best-approximation problem: given x in V, seek v in W⊥ such that

‖x - v‖ ≤ ‖x - z‖ for every z in W⊥.

That is, I seek a v in W⊥ that approximates x best from among all vectors of W⊥; here z is an arbitrary element of W⊥. This is the problem of best approximation from W⊥, and I must show that x - E(x) is the best approximation to x from W⊥. So let us call v = x - E(x); given x in V, I must show this v is a best approximation to x from among vectors in W⊥. Is that clear? It then follows that F is the orthogonal projection of V on W⊥.

First of all, I must verify that v = x - E(x) actually belongs to W⊥. Now E(x) = u, where u is the unique best approximation to x from W; this is where the map E comes from, W being finite-dimensional. But we proved before that u is the best approximation to x if and only if x - u is perpendicular to W, that is, x - u belongs to W⊥. So the right-hand side vector v = x - E(x) does belong to W⊥, and it makes sense to ask whether this function is the orthogonal projection of V on W⊥.

Of course we still need to prove the approximation property: that among all vectors z in W⊥, ‖x - v‖ ≤ ‖x - z‖; that v belongs to W⊥ has been established. What I will do is show that ‖x - z‖² ≥ ‖x - v‖². Look at x - z: I add and subtract E(x), so x - z = E(x) + (x - E(x) - z). So for z in W⊥,

‖x - z‖² = ‖E(x) + (x - E(x) - z)‖².

Now I am going to apply the Pythagoras theorem. Look at the first term: E(x) by definition belongs to W. Look at x - E(x) - z: we have just shown x - E(x) belongs to W⊥, z comes from W⊥, and W⊥ is a subspace, so this vector belongs to W⊥. So the two vectors are orthogonal, and Pythagoras gives

‖x - z‖² = ‖E(x)‖² + ‖x - E(x) - z‖² ≥ ‖E(x)‖²,

the second term being non-negative. But we want to show it is greater than or equal to ‖x - v‖². What is v?
v is x - E(x), so x - v = E(x), and I can write the bound as ‖x - z‖² ≥ ‖x - v‖². It follows that ‖x - v‖ ≤ ‖x - z‖ for all z in W⊥. So that is the first result: if E is the orthogonal projection of V on W, then I - E is the orthogonal projection of V on W⊥.

We have not yet proved that E is a linear map, but before we prove that I want to look at an example; this example will act as a kind of sandwich between this result and the next. Take V = R³ with the usual inner product, and let W be the subspace spanned by the vector (1, 2, 1); for convenience I will write row vectors. W is given to me; I want to determine E and derive properties of I - E. I could look at a particular vector such as x = (1, -1, 1), but I can just as well take a general x = (x1, x2, x3) and determine the projection E for this problem.

What is the definition of E? E(x) = u, where u is the projection of x on W, and I have a formula for that. W is finite-dimensional, in fact one-dimensional, so I will simply get an orthonormal basis by dividing the spanning vector by its norm. Remember the formula for u: if u1, u2, ..., un is an orthonormal basis for W, then u = Σ_{j=1}^{n} <x, u_j> u_j. Here I have just an ordinary basis with one vector, which I divide by its norm. So can you tell me what u is?
It is the inner product of x with the normalized vector, times that vector. The norm of (1, 2, 1) is √6, and the factor 1/√6 comes in twice, once inside the inner product and once with the vector, so together 1/6:

E(x) = ((x1 + 2 x2 + x3) / 6) (1, 2, 1).

So this is E(x), and it is obviously linear. What I have done is take an orthonormal basis for W; W is one-dimensional, so I just divide by the norm √6, and that √6 appears twice because of <x, u> u.

Now, what is the null space of E, and what is the range of E? In general we cannot yet talk about the range of E as a subspace, because we have not shown E is linear, but in this example the right-hand side lies in W, so the range of E is W. The null space of E is the set of all x such that E(x) = 0, that is, all x for which the numerator vanishes: x1 + 2 x2 + x3 = 0. This is a single equation in three unknowns, so there are two independent solutions, and the null space is a two-dimensional subspace. What is that subspace?
I will just write down two independent solutions, not necessarily orthogonal to each other: (1, 0, -1) is one vector, and (0, 1, -2) is the other. Is there a relationship between these two vectors and the vector (1, 2, 1)? They are orthogonal to it: each of the two is orthogonal to (1, 2, 1), and the three together form a basis for R³. But what is important is that the null space is equal to W⊥. This does not happen for a general linear transformation, that the range is the subspace W and the null space is its orthogonal complement; but for an orthogonal projection it is true. You can in fact write down the formula for the orthogonal projection of R³ on W⊥: it is just I - E.

But I also want you to observe the following: E² = E. Can we check that quickly? E²(x) = E(E(x)) = E( ((x1 + 2 x2 + x3)/6) (1, 2, 1) ). Here (x1 + 2 x2 + x3)/6 is a scalar, so this is that scalar times the image of (1, 2, 1) under E. And what is the image of (1, 2, 1) under E? The scalar there is (1 + 4 + 1)/6 = 6/6 = 1, so E(1, 2, 1) = (1, 2, 1). Therefore E²(x) = ((x1 + 2 x2 + x3)/6)(1, 2, 1), which is E(x). So E² = E. And if E² = E, then (I - E)² = I - E: expanding, (I - E)² = I - E - E + E², and since E² = E the last two terms cancel one E, leaving I - E.
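The checks in this example are easy to carry out numerically. A small sketch of the running R³ example, with W spanned by (1, 2, 1) (the test vector x is my own choice):

```python
def E(x):
    """Orthogonal projection of R^3 on W = span{(1, 2, 1)}, from the example."""
    c = (x[0] + 2 * x[1] + x[2]) / 6
    return [c, 2 * c, c]

x = [1.0, 2.0, 3.0]

# E is idempotent: E(E(x)) agrees with E(x) (up to rounding)
assert all(abs(a - b) < 1e-12 for a, b in zip(E(E(x)), E(x)))

# The two null-space vectors are killed by E ...
for n in ([1, 0, -1], [0, 1, -2]):
    assert E(n) == [0.0, 0.0, 0.0]
    # ... and each is orthogonal to the spanning vector (1, 2, 1)
    assert n[0] + 2 * n[1] + n[2] == 0

# I - E lands in W-perp: x - E(x) is orthogonal to (1, 2, 1)
r = [a - b for a, b in zip(x, E(x))]
print(abs(r[0] + 2 * r[1] + r[2]) < 1e-9)  # → True
```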
Such operators are called idempotent: a transformation T is said to be idempotent if T² = T. So if T is idempotent, then I - T is also idempotent. I will leave the following problem for you to complete: what is the matrix of E with respect to the standard basis, and is there a structure to that matrix? Of course E² = E, so the matrix satisfies the same property, but there is another structural property of this matrix that you can think of. E is a linear transformation on R³; given a basis, say the standard basis, I can write down the matrix of E, and that matrix has a special property in this example. I am asking you to explore what it turns out to be. Any guesses? Never mind.

So we have proved certain things for the numerical example: E is linear, we have discovered that E is idempotent, and we have observed that the null space of E is W⊥. Let us now prove this in the general case, in a general inner product space; so far this was happening in R³ with the usual inner product.

Theorem. Let V be an inner product space, W a finite-dimensional subspace, and E the orthogonal projection of V on W. Then E is an idempotent linear transformation (idempotent meaning E² = E, as we verified in the example) whose null space is W⊥; the range of E is W, which is obvious from the definition. And what is more important, remembering that V need not be finite-dimensional while W is, is that V can be written as

V = W ⊕ W⊥,

where this sum is the direct sum, which means that W ∩ W⊥ = {0} and W + W⊥ = V. That is, any element x of V can be written as a sum of two vectors, one from W and one from W⊥, and in a unique way, because the intersection of the two subspaces is {0}.

Now go back to the definition: W⊥ was called the orthogonal complement of W, and this is the reason. If W is finite-dimensional, then there is always an orthogonal complementary subspace: V may be an infinite-dimensional inner product space, but it can always be written as the direct sum of W and W⊥ whenever W is a finite-dimensional subspace. If you allow W to be an infinite-dimensional subspace, the answer in general is no. This equation should also remind you of something we have studied before. No, not the rank or the dimension theorem.
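The decomposition V = W ⊕ W⊥ can be seen concretely in the running R³ example: every x splits as E(x) + (x - E(x)), with the first piece in W and the second in W⊥. A small sketch (the vector x is my own choice):

```python
def E(x):
    # Orthogonal projection of R^3 on W = span{(1, 2, 1)}, from the example.
    c = (x[0] + 2 * x[1] + x[2]) / 6
    return [c, 2 * c, c]

x = [5.0, -3.0, 2.0]
u = E(x)                           # component in W (a multiple of (1, 2, 1))
v = [a - b for a, b in zip(x, u)]  # component in W-perp, i.e. (I - E)x

# x = u + v, and v is orthogonal to the spanning vector (1, 2, 1)
assert all(abs((a + b) - c) < 1e-12 for a, b, c in zip(u, v, x))
assert abs(v[0] + 2 * v[1] + v[2]) < 1e-9
print(u, v)
```

Uniqueness is exactly the statement W ∩ W⊥ = {0}: if x = u' + v' were another such splitting, then u - u' = v' - v would lie in both subspaces and hence be zero.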
What am I asking with this question: what does this equation remind you of? Dimension counting will be a problem: V is possibly infinite-dimensional; we have not restricted our attention to finite-dimensional V, so we do not know whether the dimensions will add up. If V is finite-dimensional they do add up, but if V and W⊥ are infinite-dimensional such counting does not make sense. Still, does this remind you of something we have studied before? Forget about the perpendicular; say I write V as W1 ⊕ W2. Have we done something with such a direct sum decomposition? If V is written as W1 ⊕ W2 ⊕ ... ⊕ Wk, then we have shown that there exist k idempotent linear maps E1, E2, ..., Ek such that E1 + E2 + ... + Ek is the identity and Ei Ej = 0 if i ≠ j. That we have seen, and we have also seen a connection between this and the eigenvalues of a linear transformation. But forget the eigenvalues: what we saw earlier is a converse of what is happening now. Now we have defined a map via a finite-dimensional subspace W, and through this map we are getting a direct sum decomposition. What we did earlier is the converse: given a direct sum decomposition, we constructed maps which are precisely idempotent. Those did not correspond to an orthogonal direct sum decomposition but to a usual one. The difference between an orthogonal direct sum decomposition and a usual one will become clear a little later; it has to do with the question I asked earlier, about the structure of the matrix of E. These things are intimately connected; I just want to remind you that we have seen this converse question before, for ordinary projections rather than orthogonal ones.

So the question is: what is the difference between an orthogonal projection and an ordinary projection? Just think it over. I remember having used the word projection earlier too; E² = E happens there also, but there is something more to E here. Again it is that question of what is special about the structure of the matrix of E; that extra something is what connects it to the perpendicular. So we have studied the converse question in a somewhat more general sense: if V = W1 ⊕ W2 ⊕ ... ⊕ Wk, then we constructed idempotent maps E1, ..., Ek whose sum is the identity. When you learn a new concept you should relate it to what was studied earlier, so I am just reminding you that the converse question has been studied, not in this context of perpendicularity but more generally, for a direct sum of several subspaces.

Let me now prove part of the theorem, say that E is idempotent. Proof: let x belong to V. Then E(x) is the best approximation to x from W; in particular, E(x) belongs to W. Now if x itself belongs to W, what is E(x)? If x is in W, then the best approximation to x from W is x itself, so E(x) = x.
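To make the orthogonal-versus-ordinary question concrete, here is a small experiment (my own illustrative example, not from the lecture). In R², project onto span{(1, 0)} along span{(1, 1)}: writing x = a(1, 0) + b(1, 1) gives b = x2 and a = x1 - x2, so this ordinary projection has matrix [[1, -1], [0, 0]], while the orthogonal projection onto span{(1, 0)} has matrix [[1, 0], [0, 0]]. Both are idempotent; the experiment suggests the extra structural property the lecture is hinting at.

```python
def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

# Ordinary (oblique) projection on span{(1,0)} along span{(1,1)}
P_oblique = [[1, -1], [0, 0]]
# Orthogonal projection on span{(1,0)}
P_orth = [[1, 0], [0, 0]]

# Both are idempotent ...
assert matmul(P_oblique, P_oblique) == P_oblique
assert matmul(P_orth, P_orth) == P_orth

# ... and P together with I - P satisfies the Ei*Ej = 0 relation
I2 = [[1, 0], [0, 1]]
Q = [[I2[i][j] - P_oblique[i][j] for j in range(2)] for i in range(2)]
assert matmul(P_oblique, Q) == [[0, 0], [0, 0]]

# But only the orthogonal one equals its own transpose
is_symmetric = lambda A: all(A[i][j] == A[j][i] for i in range(2) for j in range(2))
print(is_symmetric(P_oblique), is_symmetric(P_orth))  # → False True
```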
So E acts like the identity on W. What is E(E(x)) then, for every x in V? E(E(x)) is E²(x). But look at E(x): it belongs to W, and E of something in W is that thing itself, so E(E(x)) = E(x). We have shown this for each x in V, so E² = E; E is idempotent. That is really straightforward.

We also need to show E is linear; let me do that. Let x, y belong to V and let z = αx + βy, with α, β coming from the field. I must show that E(z) = αE(x) + βE(y); it would then follow that E is linear, this being the definition, E(αx + βy) = αE(x) + βE(y). Consider x - E(x) and y - E(y). The mapping F : V → V, F(x) = x - E(x), is the orthogonal projection of V on W⊥, so these two vectors belong to W⊥; they are x - u and y - v in the earlier notation, orthogonal to W. W⊥ is a subspace, so the linear combination

α(x - E(x)) + β(y - E(y))

also lies in W⊥. What is this? It is αx + βy minus (αE(x) + βE(y)), that is, z - (αE(x) + βE(y)). Let us call v = αE(x) + βE(y); now z comes from V, while E(x) and E(y) belong to W, and W is a subspace, so this combination v belongs to W. So the difference z - v belongs to W⊥. What does that mean? <z - v, p> = 0 for every p in W: z - v is orthogonal to every vector in W. Is this the same as saying that v = E(z)? Yes: by the definition of E, if z - v is perpendicular to W and v belongs to W, then v is the image of z under E. So we are through: on the one hand v = αE(x) + βE(y), on the other hand v = E(z) with z = αx + βy. So E is linear. Is that clear? E(x) = u, where u satisfies the property that x - u is perpendicular to W; I am writing E(z) = v because z - v is perpendicular to W. The last part, that V is the direct sum, I will prove in the next class.
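As a quick numerical sanity check of the linearity just proved, using the concrete projection from the R³ example (the vectors and scalars are my own choices):

```python
def E(x):
    # Orthogonal projection of R^3 on W = span{(1, 2, 1)}, from the example.
    c = (x[0] + 2 * x[1] + x[2]) / 6
    return [c, 2 * c, c]

x, y = [1.0, 0.0, 2.0], [0.0, 3.0, -1.0]
alpha, beta = 2.0, -0.5

z = [alpha * a + beta * b for a, b in zip(x, y)]
lhs = E(z)                                              # E(alpha x + beta y)
rhs = [alpha * a + beta * b for a, b in zip(E(x), E(y))]  # alpha E(x) + beta E(y)

assert all(abs(a - b) < 1e-12 for a, b in zip(lhs, rhs))
print("E(alpha*x + beta*y) == alpha*E(x) + beta*E(y)")
```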