In the last lecture we talked about the static deterministic linear inverse problem that is well posed. Why is it well posed? Because we did not find any difficulty in the solution process, and that is largely because the matrix H is of full rank. Since H is full rank, the Gramian matrices H transpose H and H H transpose are symmetric positive definite, both of them are full-rank matrices, so we could define the generalized inverse of H both in the underdetermined case and in the overdetermined case. We did not have any problem; we sailed very smoothly by solving the normal equations. So it makes sense to ask what happens when H is not of full rank, and that corresponds to an ill-posed problem. The ill-posed version of the static deterministic linear inverse problem is what we are going to look at, and along the way we will also talk about what is called an imperfect model.

So z = H x is the linear least squares problem, where H is an m by n matrix. In module 3.1 we already talked about the well-posed linear least squares problem, and that treatment essentially banks on the assumption that H is of full rank, that is, the rank of H is the minimum of m and n. In the overdetermined case the rank of H is n; in the underdetermined case the rank of H is m. So we considered both cases under one rule, namely that H is a matrix of full rank. Now we consider the complementary case: what happens when H is rank deficient? That means the rank of H is not equal to the minimum of m and n but is strictly less than it; in the overdetermined case it is less than n, and in the underdetermined case it is less than m. Such problems are called ill-posed problems.

So well posed versus ill posed is largely determined by the properties of the matrix H. Please remember that z = H x defines the static model, and the properties of the model are defined by the matrix H. We considered one aspect of the properties of the model; now we are considering another aspect, full rank versus rank deficient. When H is rank deficient, the Gramian matrices H transpose H and H H transpose are symmetric, but they are not positive definite; they are singular, which means their determinants are 0. That means at least one of the eigenvalues is 0, and if at least one eigenvalue is 0 the matrix is not positive definite but only positive semi-definite. So the positive semi-definiteness of H transpose H and H H transpose leads to singularity, and that means I cannot simply compute the inverse of H transpose H or of H H transpose; these inverses do not exist. When they do not exist, I cannot follow the procedure we developed in the last lecture. This calls for newer methods, and we are going to talk about some of these newer techniques.
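As a quick numerical illustration of this point (not part of the lecture), a minimal NumPy sketch with a made-up rank-deficient H, showing that the Gramian H transpose H is singular and has a zero eigenvalue:

```python
import numpy as np

# Hypothetical rank-deficient H (3 x 2): the two columns are identical,
# so rank(H) = 1 < min(m, n) = 2.
H = np.array([[1.0, 1.0],
              [2.0, 2.0],
              [3.0, 3.0]])

print("rank of H:", np.linalg.matrix_rank(H))        # 1, i.e. rank deficient

G = H.T @ H                                           # Gramian H^T H (2 x 2)
print("eigenvalues of H^T H:", np.linalg.eigvalsh(G)) # one eigenvalue is 0
print("determinant of H^T H:", np.linalg.det(G))      # ~0, so no inverse
```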
To reinforce this further: when H is rank deficient, we cannot use the formulas for the generalized inverse of H, neither the formula for the overdetermined case nor the formula for the underdetermined case. These generalized inverses, which are used in computing x_LS, can no longer be computed, so I cannot use the old pathway to compute the solution. But the theory of generalized inverses still tells you that, while I cannot compute the generalized inverse using these formulas, there are other ways of computing H plus; I can still compute H plus using the method called singular value decomposition. We will talk a lot more about singular value decomposition in one of the modules coming later, and one can solve this problem using it, but at this juncture we are not going to use it; we are simply looking for an alternate formulation of the linear least squares problem.

So let me clarify where we are. We have a linear least squares problem z = H x in which H is rank deficient, and I cannot use the method I have described thus far. Rank-deficient cases can be handled by singular value decomposition, but we have not yet developed that method; we will do it later. So while we will have occasion to revisit the solution of this problem when we describe singular value decomposition, at this time I am still interested in solving it, and I am seeking an alternative to singular value decomposition. That method is called the method of regularization. It was introduced by the Russian mathematician Tikhonov, and it is meant to get around the rank deficiency of H. It is a very simple, elegant method to approximate solutions of ill-posed problems; I am using ill-posed and rank deficient synonymously here.

So what is Tikhonov's method? It is a modification of the method we used in solving the underdetermined case. In the underdetermined case we seek to minimize the norm of x while requiring z = H x to hold as a strong constraint, and we use a Lagrangian multiplier. What did Tikhonov say? He said: consider an objective that contains a term alpha times the square of the norm of x, where alpha is a parameter, plus the square of the norm of z minus H x. So alpha is a kind of penalty parameter, and this is what is called a penalty function approach. What are we looking for? I am looking for a solution that minimizes f(x); such a solution is one for which alpha times the square of the norm of x is as small as possible and z is as close to H x as possible, though the residual need not be exactly 0. The addition of the alpha times norm-of-x-squared term to the traditional sum-of-squares criterion helps to avoid the difficulties resulting from rank deficiency. Now please understand that in the overdetermined problem, f(x) was essentially just the residual term; to that f(x) I am now adding the alpha term, which is a kind of penalty term.
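The numbered equations on the slides are not reproduced in this transcript; based on the description above, the Tikhonov objective being referred to can be written as follows (a reconstruction, with alpha > 0 the penalty parameter):

```latex
f(x) \;=\; \alpha \,\lVert x \rVert^{2} \;+\; \lVert z - Hx \rVert^{2},
\qquad \alpha > 0 .
```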
So we mix the two terms together and concoct a new objective function, and our problem is to minimize this f(x); let us talk about the impact of this after we solve the minimization problem. Rewriting f(x) as in equation 2, we can readily compute the gradient and equate it to 0, and by setting the gradient to 0 I get the solution for x_LS. You can see that the least squares solution is a function of alpha: it is H transpose H plus alpha I, inverse, times H transpose z. When you set alpha equal to 0, this becomes the solution of the overdetermined case, so you can think of this as a generalization of the solution for the overdetermined case; in the overdetermined case I consider only the residual term, and the alpha term is the one that has been added to it.

Now look at this: H transpose H by itself is singular, but I am adding alpha times an identity matrix. This is called a diagonal perturbation; it changes only the diagonal elements of the matrix H transpose H. By adding a diagonal perturbation to a singular matrix I can make the whole matrix non-singular; if the matrix is non-singular I can compute the inverse, and if I can compute the inverse I have the solution. So what is happening is that we are not solving the original problem; we are solving a modified problem, the modification being a diagonal perturbation added to the Gramian H transpose H. Since H transpose H is singular, by adding the diagonal perturbation alpha I we ensure that H transpose H plus alpha I is non-singular; when it is non-singular I can compute the inverse, and then I have an expression for the least squares solution. The expression for the least squares solution of this ill-posed problem is given by equation 3 in slide 4.

Now you may ask the question: how do I pick that alpha? Let me give an intuitive feel. The alpha you choose should be the smallest alpha that will make the matrix H transpose H plus alpha I non-singular. H transpose H by itself is singular; I am adding a perturbation and I require the whole matrix to be non-singular. So I ask myself: what is the least alpha that I should use in order to render the resulting matrix non-singular? Such an alpha always exists; this can be proven, so it is a very nice generalization. When H is of full rank we simply set alpha equal to 0; when H is rank deficient you pick the least alpha that makes this matrix non-singular. Once you have picked that alpha, I can solve the linear least squares problem and I have a least squares solution. The existence of such an alpha is guaranteed by a theorem in matrix theory called the Gershgorin circle theorem; using the Gershgorin circle theorem one can estimate the least value of alpha that is needed to render this matrix non-singular. Such a value exists, and it is a very simple result. So by formulating the problem as a penalty function problem, we can solve even an ill-posed problem nicely.
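A minimal sketch of the Tikhonov-regularized solution on a made-up rank-deficient system (the matrix, data, and the value of alpha are illustrative assumptions, not from the slides):

```python
import numpy as np

# Hypothetical rank-deficient, overdetermined system (m = 3, n = 2).
H = np.array([[1.0, 1.0],
              [2.0, 2.0],
              [3.0, 3.0]])        # columns are equal -> rank 1
z = np.array([1.0, 2.0, 2.9])

alpha = 1e-3                       # penalty (regularization) parameter
G = H.T @ H

# H^T H is singular, so the ordinary normal equations cannot be solved,
# but the diagonally perturbed matrix H^T H + alpha*I is non-singular.
x_tik = np.linalg.solve(G + alpha * np.eye(G.shape[0]), H.T @ z)
print("Tikhonov solution:", x_tik)
print("residual norm    :", np.linalg.norm(z - H @ x_tik))
```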
Now I am going to talk about the use of a matrix identity. I hope you all remember that we talked about several matrix identities in the module on matrices. One very well known matrix identity takes the shape given by equation 4. I am going to start with this identity and specialize it with A = H, B = I and D inverse = alpha I. Why? Because I would like to use this identity on the Tikhonov solution to see what the relation is between the underdetermined and the overdetermined cases; that is our aim. With this substitution the identity becomes the statement that H transpose H plus alpha I, inverse, times H transpose is equal to H transpose times alpha I plus H H transpose, inverse. These two matrices are equal; that is the essence of equation 5, and a short numerical check of it is sketched after this passage. From here I can see lots of little things. If I set alpha equal to 0, the left hand side becomes the solution of the overdetermined system and the right hand side becomes the solution of the underdetermined system. So Tikhonov, by introducing this penalty parameter alpha, was able to unify the solutions for the overdetermined and underdetermined cases by invoking this very well known matrix identity. That is the beauty of Tikhonov's solution. It is important in two ways: one, it helps to solve the ill-posed problem; two, it helps to unify, by defining the relation between the underdetermined and overdetermined cases. You kill two birds with one stone; that is the beauty of Tikhonov's work. Tikhonov specialized in solving inverse problems of various types, and he wrote a marvelous book that deals with regularization methods for solving inverse problems. The method here is one of the simplest regularization methods that is often used in solving ill-posed linear least squares problems. I have already talked about the unified approach: when alpha is 0, equation 6 leads to the optimal solution for the full-rank problem when m is greater than n, and when alpha is 0 in equation 7, that leads to the optimal solution for the full-rank underdetermined problem. So all the previous solutions can be obtained as special cases of this unified approach, and hence the importance of Tikhonov's contributions.

Now I am going to talk about the role of the model, perfect versus imperfect, under a static constraint. Static models can be perfect or imperfect. As the saying goes, no model is perfect, but some models are useful. Often one assumes the model is perfect, but imperfections in a model come from various directions: incomplete physics, wrong parameterization, any combination thereof, or other reasons. Irrespective of whether the model is perfect or not, in the overdetermined case the problem is always inconsistent, in the sense we saw in the previous lecture that overdetermined systems are always inconsistent. In the underdetermined case, the choice of the method depends on whether the model is perfect or not.
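Returning to the matrix identity of equation 5, here is a quick numerical check (a sketch with a randomly generated H and an illustrative alpha; the identity holds for any H and alpha > 0):

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 5, 3
H = rng.standard_normal((m, n))    # illustrative H
alpha = 0.5                        # illustrative penalty parameter

lhs = np.linalg.inv(H.T @ H + alpha * np.eye(n)) @ H.T   # (H^T H + aI)^-1 H^T
rhs = H.T @ np.linalg.inv(H @ H.T + alpha * np.eye(m))   # H^T (H H^T + aI)^-1

print("max difference:", np.abs(lhs - rhs).max())        # ~ machine precision
```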
So if the model is perfect we formulate the problem one way; if the model is imperfect we formulate it the other way. You should therefore have a good feel for how good the model is. This brings us to the notion of different ways of formulating the solution of least squares problems when the model is perfect versus when it is imperfect, which gives rise to a new way of looking at it, called the strong constraint versus weak constraint formulation.

Consider the case when m is less than n and the model is perfect. So I am considering an underdetermined case: there are fewer observations than parameters, and the model is perfect, which is the assumption we had already been making. We never questioned the veracity of the model until now; only now are we asking whether the model is perfect. If the model is not perfect there is one way; if the model is perfect there is another way. If the model is perfect, I would like to enforce the model equation strictly, and that gives rise to the strong constraint: the model constraint is enforced strongly using a Lagrangian multiplier. This version, using a Lagrangian multiplier when you believe the model is perfect, is called the strong constraint formulation, and we utilized it to bring uniqueness to the underdetermined system. If, on the other hand, the model is not perfect, it is pointless to enforce it strictly. Why? We know the model is not perfect, so why would you enforce something you are not sure about? In this case we still want to respect the model equation, but not strictly, only approximately. This ability to require the model equation to be satisfied not perfectly but very closely is the concept of the weak constraint formulation. Strong constraint formulations are set up as Lagrangian multiplier problems; weak constraint formulations are set up as penalty function problems, which we have already seen in the optimization module.

So I am now going to quickly illustrate a version of the strong constraint formulation. Let z = H x. I am considering the underdetermined case; assume H is of full rank. Recall that there are infinitely many solutions; we seek the unique solution that minimizes the following cost functional. The model is a constraint: among all the solutions of the model, I want to find the one that minimizes the cost function. This cost function is a generalization of the one we have already utilized in the analysis of the underdetermined system. In the underdetermined case, what we did was to find the x whose norm is minimum; here I would like to find an x that minimizes J(x), which is a general quadratic function.
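The cost functional itself appears only on the slides; since the lecture later refers to a matrix A, a vector b and a constant c (setting b = 0, c = 0, A = I to recover the minimum-norm case), a plausible reconstruction of the strong-constraint problem is:

```latex
\min_{x}\; J(x) \;=\; \tfrac{1}{2}\, x^{\mathsf T} A\, x + b^{\mathsf T} x + c
\qquad \text{subject to} \qquad z = Hx .
```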
The strong constraint formulation for this problem is to build a Lagrangian: L(x, lambda) = J(x) plus lambda transpose times z minus H x. I now compute the gradient of L with respect to x and with respect to lambda; I get two sets of equations, and solving these two systems simultaneously I get the solution lambda_S and x_S, where S refers to the strong solution. Both lambda_S and x_S are given by equation 12, which can easily be obtained by solving equations 10 and 11. By setting b equal to 0, c equal to 0 and A equal to I, we recover the well known solution for the underdetermined case, which we have already seen. So you can see this is a kind of generalization of what we have done in the underdetermined case.

Now let us consider the weak constraint formulation. In this case I am going to build a penalty function P(x) = J(x) plus alpha over 2 times z minus H x, transpose, times z minus H x. The necessary condition for a minimum is that the gradient of P with respect to x must be 0; the gradient is given by the right hand side. Equating it to 0 and solving, you get the solution x(alpha) = x1(alpha) plus x2(alpha), where x1(alpha) takes one form and x2(alpha) takes another. So the solution has two components, both of which depend on alpha, and the sum of 17 and 18 is 16; 16 is the solution that minimizes the penalty function.

Earlier we talked about the relation between the weak solution and the strong solution. I now know the form of the weak solution and the form of the strong solution, and we also saw that the weak solution tends to the strong solution as the penalty parameter alpha tends to infinity. I am now going to show that result here. To this end we use the Sherman-Morrison-Woodbury formula: applying it, the inverse on the left hand side becomes one expression and the other inverse becomes another. We have already talked about the Sherman-Morrison-Woodbury formula and given a proof of it in the section on matrices. Multiplying both sides of 20 on the left by epsilon x transpose and simplifying, and here I would like to refer the reader to the details in chapter 17 of the book by Lewis, Lakshmivarahan and Dhall, our textbook, we obtain the matrix identity given by relation 21. I know there is a lot of computation and checking involved, and that checking is a homework problem; I am hitting on the major concepts, and once you have them you can follow these steps and verify the various computations. Now, by setting these, we can readily see the formula, and x1 star, the limit of x1(alpha), becomes the expression shown. So look at this: when alpha goes to infinity, the right hand side no longer has any dependence on alpha.
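The general forms of x1(alpha) and x2(alpha) refer to equations on the slides that are not reproduced here. As a small sketch of the convergence claim, restricted to the special case the lecture itself mentions (A = I, b = 0, c = 0, i.e. the minimum-norm problem), with an illustrative random H:

```python
import numpy as np

rng = np.random.default_rng(1)
m, n = 3, 6                              # underdetermined: m < n
H = rng.standard_normal((m, n))          # assumed full row rank
z = rng.standard_normal(m)

# Strong (minimum-norm) solution: x_s = H^T (H H^T)^{-1} z
x_strong = H.T @ np.linalg.solve(H @ H.T, z)

# Weak (penalty) solution for A = I, b = 0, c = 0:
#   minimize (1/2)||x||^2 + (alpha/2)||z - Hx||^2
#   => (I + alpha H^T H) x = alpha H^T z
for alpha in [1.0, 1e2, 1e4, 1e6]:
    x_weak = np.linalg.solve(np.eye(n) + alpha * H.T @ H, alpha * H.T @ z)
    print(f"alpha = {alpha:8.0e}  ||x_weak - x_strong|| = "
          f"{np.linalg.norm(x_weak - x_strong):.2e}")
```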
So that is the limit of one of the components of the solution, x1(alpha). Likewise you can compute the component x2(alpha), and using this same matrix identity that we have already described, as alpha tends to infinity the expression becomes independent of alpha; therefore x2 star, the limit of x2(alpha), is given by that. Therefore x1 star plus x2 star is given by this equation, and if you simplify it, it becomes the strong solution as in 12. So we have demonstrated that the weak solution, in the limit as alpha tends to infinity, becomes the strong solution; in other words, as the penalty parameter alpha increases without bound, the weak solution converges to the strong solution.

So where do I use the penalty function formulation? When you believe in the model but you also know the model is not perfect; whenever you have to use the model as a constraint in that situation, use the weak constraint. If you assume the model is perfect, use the model as a strong constraint. So which method you use depends on how strongly you believe in the goodness of the model.

We have now described how to take care of very many special cases, so I would like to summarize by observing the following. The linear static deterministic inverse problem can be broadly divided into well posed and ill posed. Well-posed problems are very straightforward; ill-posed problems in principle can cause headaches. There are multiple ways of solving ill-posed problems: one way is to use Tikhonov regularization, as we have talked about; a second way is to use singular value decomposition, which we have not done yet, and we will wait until that is covered to revisit this issue. We also incorporated the notion of perfect and imperfect models. So you can now consider quite a variety of situations: the model is static, deterministic, and linear; the model is perfect or imperfect; the problem is well posed or ill posed, overdetermined or underdetermined. You can consider quite a variety of formulations of even the simple linear least squares problem, and that brings out the beauty that underlies simple linear least squares methods in the context of linear inverse problems.

I would like to very strongly recommend that you all work out the exercises; they are simple extensions of the methods we have solved. I am particularly referring to exercise 7.3. Here I am considering a matrix whose columns are (1, 1, 1) and (1, 1, 1 + epsilon). When epsilon is 0 the two columns are linearly dependent; when epsilon is not 0 they are mathematically linearly independent. Therefore the eigenvalues of H transpose H are functions of epsilon. So what I would like you to do is this: H is given as above, and epsilon is small and greater than 0; compute H transpose H, compute the eigenvalues of H transpose H, and plot the variation of the eigenvalues, using MATLAB, as epsilon varies from minus 1 to plus 1. Then you can see the impact on the rank. When epsilon is 0 the matrix is rank deficient; when epsilon is not 0 it is strictly rank 2. But when epsilon is small, even though mathematically the rank is 2, it can still cause problems, and how it creates problems one can understand by computing the eigenvalues. Once you compute the eigenvalues, remember, you can compute the condition number.
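The lecture suggests MATLAB for this exercise; the minimal sketch below does the same computation in Python/NumPy, assuming H is the 3 by 2 matrix with columns (1, 1, 1) transpose and (1, 1, 1 + epsilon) transpose as described:

```python
import numpy as np

# Exercise 7.3 sketch: eigenvalues and condition number of H^T H versus epsilon.
for eps in [-1.0, -0.5, -0.1, 0.0, 0.1, 0.5, 1.0]:
    H = np.array([[1.0, 1.0],
                  [1.0, 1.0],
                  [1.0, 1.0 + eps]])
    eigvals = np.sort(np.linalg.eigvalsh(H.T @ H))
    # condition number = largest / smallest eigenvalue of H^T H
    cond = eigvals[-1] / eigvals[0] if eigvals[0] > 1e-12 else np.inf
    print(f"eps = {eps:+.1f}  eigenvalues = {eigvals}  condition = {cond:.3e}")
```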
Therefore, by computing the condition number of H transpose H, we can infer how ill posed or how well posed the problem is; that is the measure by which we can quantify the degree of ill-posedness. So exercise 7.3 is an important one, and there are a few other problems which are routine and follow the directions of the development. This module follows from chapter 5 and the following report: we wrote a report in 2014 on the convergence of a class of weak solutions to the strong solution of an equality-constrained minimization problem, a direct proof using matrix identities; it is a technical report from the School of Computer Science, University of Oklahoma. With this, I think we have provided a broad overview of the richness of the linear least squares problem, the static version thereof. Thank you.