Okay, so last time we were looking at errors in inverses and in solutions to linear systems, and a related concept we saw was that of compatible norms. Compatible norms satisfy a submultiplicativity-type property, but one involving a combination of a vector norm and a matrix norm. Specifically, if a vector norm and a matrix norm are such that the norm of Ax is less than or equal to the norm of A times the norm of x for every x, then this vector norm and this matrix norm are said to be compatible. Based on this, we were bounding the errors in computed solutions to linear systems of equations. We saw one formulation, and towards the end of the previous class we saw another formulation where we look at a perturbed system: the matrix A and the right hand side B are both perturbed, by a matrix A_delta and a vector B_delta, each multiplied by some small number epsilon. We are interested in the first order behavior of how the perturbed solution x_epsilon is related to the solution x for very small values of epsilon. This is what we call the perturbed system. And we showed that the relative error in x due to the perturbations, that is, the norm of x_epsilon minus x divided by the norm of x, can be upper bounded by the condition number of A (with respect to a norm compatible with the one used to evaluate the relative error in x) times the sum of the relative error in B, which is epsilon times the norm of B_delta over the norm of B, and the relative error in A, which is epsilon times the norm of A_delta over the norm of A, plus a term which is of order epsilon squared. So if epsilon is small enough, the first order term dominates the epsilon squared term, and you can drop the latter.
So, as I mentioned, the relative error in B is epsilon times the norm of B_delta divided by the norm of B; we will call this rho_B, and similarly rho_A is epsilon times the norm of A_delta over the norm of A. Then we can upper bound the relative error in x by k(A) times (rho_A + rho_B), plus a term of order epsilon squared. This is the punchline: to a first order approximation, the relative error in the computed solution x_epsilon is bounded by the condition number of A times the sum of the two relative errors. That is where we stopped in the previous class. Again, it brings out the fact that if you have a well conditioned matrix, k(A) will be close to 1, and so the error in the solution is of the same order as the error in A or B; whereas if A is poorly conditioned, k(A) will be a large number, and the error in the solution can be much larger than the error in either A or B. Note that this is only an upper bound: for specific A's and specific right hand sides B, the relative error in the solution may not be as big as this. Okay, so that concludes this chapter on norms. Question: Sir, we derived this with the same perturbation parameter epsilon for both A and B. If the epsilons for A and B are different, will the formula, condition number times the relative error in B plus the relative error in A, still hold? Answer: These are all just upper bounds, so if you want to perturb A and B by different amounts, a simple fix is to take epsilon to be the max of the two epsilons, and everything we are saying here remains valid. Here we are just looking at the first order behavior, the sensitivity to small perturbations in A or B.
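As a quick numerical sanity check of this bound, here is a sketch using NumPy. The matrix, right hand side, and perturbations are randomly generated (they are my own values, not from the lecture), and the 2-norm is used throughout, since the spectral matrix norm is compatible with the Euclidean vector norm:

```python
import numpy as np

# Check the first-order bound
#   ||x_eps - x|| / ||x||  <=  k(A) * (rho_A + rho_B) + O(eps^2)
# on a randomly generated system (values are illustrative, not from the lecture).
rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))
B = rng.standard_normal(4)
A_delta = rng.standard_normal((4, 4))
B_delta = rng.standard_normal(4)
eps = 1e-6

x = np.linalg.solve(A, B)
x_eps = np.linalg.solve(A + eps * A_delta, B + eps * B_delta)

rel_err = np.linalg.norm(x_eps - x) / np.linalg.norm(x)
kappa = np.linalg.cond(A, 2)                              # condition number in the 2-norm
rho_A = eps * np.linalg.norm(A_delta, 2) / np.linalg.norm(A, 2)
rho_B = eps * np.linalg.norm(B_delta) / np.linalg.norm(B)
bound = kappa * (rho_A + rho_B)
```

For small enough epsilon the O(epsilon squared) remainder is negligible, so `rel_err` should come out below `bound`, typically with a lot of slack, since the bound is a worst case over all perturbation directions.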
And the punchline is again that the relative error in B plus the relative error in A gets amplified by this coefficient k(A). Okay, so we'll continue. So now I come to chapter one of the Horn and Johnson textbook, which is on eigenvalues and eigenvectors. Question: Sir, can you scroll up a little, to the first page? In what you have written there, should that be the norm of the matrix A? Answer: Good question, this should have been the matrix norm; thanks, that was a typo. It is the same matrix norm that is used here to compute the condition number of A. Okay. So now you have to rewire your brain a little bit, because this is a different topic: eigenvalues and eigenvectors, a super important topic from the point of view of matrix theory. The fundamental equation of eigenvalues and eigenvectors is very simple, and it is stated right here: Ax = lambda x. Here A is an n x n matrix, x is an n x 1 vector, and lambda is a scalar, lambda belongs to C. And x can be any vector as long as it is not the all zero vector. Of course, the all zero vector satisfies this equation, but we don't care about that; it is a trivial solution. So for nonzero x, if you can find a vector x and a scalar lambda such that Ax = lambda x, we call x and lambda an eigenvector-eigenvalue pair. The reason I underlined this here is that they always occur in pairs: you cannot define an eigenvalue without saying that there is an x not equal to 0 such that Ax = lambda x, and similarly you cannot say that x is an eigenvector without saying that Ax = lambda x holds for some complex valued lambda. So they are always in pairs.
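The defining equation can be checked directly in NumPy; this minimal sketch (the matrix values are my own, for illustration) confirms that each eigenvalue returned comes with a nonzero eigenvector satisfying Ax = lambda x:

```python
import numpy as np

# An arbitrary 2x2 symmetric matrix for illustration.
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

eigvals, eigvecs = np.linalg.eig(A)      # columns of eigvecs are the eigenvectors
for lam, x in zip(eigvals, eigvecs.T):
    assert np.linalg.norm(x) > 0         # eigenvectors are nonzero by definition
    assert np.allclose(A @ x, lam * x)   # the pair (lambda, x) satisfies A x = lambda x
```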
Okay, so just to motivate, here are two quick, simple examples where eigenvalues and eigenvectors matter. First, suppose you want to find a solution to the differential equation du/dt = Au. Here A is a constant matrix, independent of t, but u is a function of t: u(t) is a vector evolving with time, and it evolves so as to satisfy du/dt = Au. In the scalar case you have certainly seen this in your undergraduate program: given du/dt = au, you take u to the denominator on the left and dt to the right, integrate both sides to get log u = at plus a constant, and from that u = C e^{at}, where the constant C depends on the initial condition, the value of u at t = 0. So if somebody tells you u(0), you can find the constant and you know the solution. In the matrix case, I could potentially think about putting the matrix A in the exponent, but for now just consider u = e^{lambda t} x, where lambda is an eigenvalue of A and x is a corresponding eigenvector. If I make this substitution, then Au = A e^{lambda t} x = e^{lambda t} Ax, since e^{lambda t} is a scalar and can be taken out of the multiplication, and Ax is the same as lambda x because lambda and x are an eigenvalue-eigenvector pair. So Au = lambda e^{lambda t} x, which is to say Au = lambda u.
Next, if I differentiate u with respect to t, the only factor that depends on t is e^{lambda t}, whose derivative is lambda e^{lambda t}. So du/dt = lambda e^{lambda t} x, which is again lambda u. Thus Au = lambda u and du/dt = lambda u, so u satisfies the differential equation. More generally, u can be written as a linear combination of solutions of this form corresponding to different eigenvalues and eigenvectors. The second example: suppose you want to solve a constrained optimization problem, such as maximizing x^T A x subject to the constraint x^T x = 1, where A is a real valued matrix which is also symmetric, A = A^T. The conventional approach is the method of Lagrange multipliers, where you define the Lagrangian function L = x^T A x - lambda (x^T x - 1) and differentiate it with respect to x. This is a vector derivative, so you may have to take it on faith, but the simple explanation is that the derivative of a scalar with respect to a vector is a vector of the same dimension, whose entries are the partial derivatives of the scalar with respect to each component, stacked one above the other. If you do that for this particular Lagrangian, it is not difficult to show that the derivative is 2(Ax - lambda x). Setting the derivative equal to 0 gives 2(Ax - lambda x) = 0, or in other words Ax = lambda x, which is the eigenvalue-eigenvector equation.
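The optimization example can be checked numerically. For a symmetric matrix (the values below are assumed for illustration), `np.linalg.eigh` returns eigenvalues in ascending order, so the eigenvector for the largest eigenvalue should maximize x^T A x over unit vectors, and no random unit vector should beat it:

```python
import numpy as np

# Illustrative symmetric matrix; the Lagrange-multiplier argument assumed A = A^T.
A = np.array([[4.0, 1.0],
              [1.0, 4.0]])
assert np.allclose(A, A.T)

eigvals, eigvecs = np.linalg.eigh(A)    # eigenvalues in ascending order
x_star = eigvecs[:, -1]                 # unit eigenvector for the largest eigenvalue
value = x_star @ A @ x_star             # attains the largest eigenvalue

# Compare against many random unit vectors: none exceeds the eigenvector's value.
rng = np.random.default_rng(1)
for _ in range(1000):
    x = rng.standard_normal(2)
    x /= np.linalg.norm(x)              # enforce the constraint x^T x = 1
    assert x @ A @ x <= value + 1e-12
```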
So these are two simple examples where eigenvalues and eigenvectors arise naturally when you are trying to solve some problem. And here is an example to visualize eigenvalues and eigenvectors. Suppose A is the simple 2 x 2 matrix with entries 4, 1, 1, 4. If I take x1 to be the vector (1, 0), then A x1 is the first column of the matrix, (4, 1); that is shown in red, with x component 4 and y component 1. Similarly, if I take x2 = (0, 1), then A x2 is the second column, (1, 4), shown in green. So x1 and A x1 point in different directions, and likewise x2 and A x2. Whereas if I choose x3 = (1, 1), then A x3 = (5, 5), which is 5 times x3, so it points in the same direction as x3; that is shown in black. Similarly, if I take x4 = (1, -1), then A x4 = (3, -3), which is 3 times (1, -1), again pointing in the same direction. So eigenvectors are very special vectors: when you multiply the matrix by an eigenvector, you get a vector pointing in the same direction as the original vector. So how do we find these eigenvalues? I guess you know this already, but let us quickly discuss it for the sake of completeness. Consider A to be an n x n matrix; it could be complex as well, everything I am going to say holds in the complex case too. The equation Ax = lambda x implies that (A - lambda I) x = 0, where I is the identity matrix. This kind of equation, a matrix times a vector equal to 0, is called a homogeneous equation; it is the zero right hand side that makes it homogeneous. And remember, we want these eigenvectors to be non-zero vectors.
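The 2 x 2 visualization example above can be verified directly; this quick sketch checks that x1 is not parallel to A x1, while x3 and x4 are mapped onto multiples of themselves:

```python
import numpy as np

# The lecture's visualization example: A = [[4, 1], [1, 4]].
A = np.array([[4.0, 1.0],
              [1.0, 4.0]])

x1 = np.array([1.0, 0.0])
y1 = A @ x1                              # first column of A: (4, 1)
# 2-D cross product (a determinant) is nonzero iff the two vectors are not parallel.
assert abs(x1[0] * y1[1] - x1[1] * y1[0]) > 0

x3 = np.array([1.0, 1.0])
x4 = np.array([1.0, -1.0])
assert np.allclose(A @ x3, 5 * x3)       # A x3 = (5, 5) = 5 * x3: same direction
assert np.allclose(A @ x4, 3 * x4)       # A x4 = (3, -3) = 3 * x4: same direction
```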
Of course, x = 0 satisfies this equation, but we don't want that solution. If x is to be non-zero, then A - lambda I must have a non-trivial null space; in other words, lambda is such that A - lambda I is singular. At this point it should look a little magical: lambda I is a highly structured matrix, just a scaled version of the identity, and yet, no matter what A is, if I take the right scaling lambda and form A - lambda I, the columns of A - lambda I become linearly dependent and the matrix becomes singular. Those are the kinds of lambdas I am looking for. One way to test whether the matrix is singular is to compute its determinant: whenever the determinant of A - lambda I is 0, we know that A - lambda I is rank deficient and hence singular. So the determinant gives us a test: lambda is an eigenvalue if and only if det(A - lambda I) = 0. This is a crucial observation. In one direction, if det(A - lambda I) = 0, then A - lambda I is singular, and therefore you can find a nonzero vector x such that (A - lambda I) x = 0. Conversely, if lambda is indeed an eigenvalue of A, then by definition there is a nonzero x such that Ax = lambda x, that is, (A - lambda I) x = 0 for some x not equal to 0, which means that the matrix A - lambda I must be singular, and therefore its determinant must equal 0. So the two statements, lambda is an eigenvalue of A and det(A - lambda I) = 0, are equivalent: it is an if and only if.
And this equation det(A - lambda I) = 0 is called what? Yes, the characteristic equation. It is a polynomial equation of degree n in lambda; that comes out if you simply expand the determinant from its definition. Also, corresponding to any eigenvalue lambda, there will always exist at least one nonzero eigenvector, by definition; they always occur in pairs, I am repeating my point. So this is how we find the eigenvalues: we form the characteristic equation and find its solutions, its roots, and those are all the eigenvalues of the matrix. For example, consider the matrix A with rows (4, -5) and (2, -3). Then det(A - lambda I) is the determinant of the matrix with rows (4 - lambda, -5) and (2, -3 - lambda), which is (4 - lambda)(-3 - lambda) + 10. Expanding, (4 - lambda)(-3 - lambda) gives lambda squared - lambda - 12, and adding 10 leaves lambda squared - lambda - 2 = 0. The solutions are lambda = -1 or lambda = +2; these are the two eigenvalues of this matrix, so call them lambda_1 = -1 and lambda_2 = 2. Now compute A - lambda_1 I, which is the matrix with rows (5, -5) and (2, -2), and let x_1 be a corresponding eigenvector; with a slight abuse of notation, write its entries as (x1, x2). Setting (A - lambda_1 I) times (x1, x2)^T equal to 0 and solving, it is easy to verify that the vector (1, 1) works: it makes both rows evaluate to 0.
Similarly, A - lambda_2 I is the matrix with rows (4 - 2, -5) = (2, -5) and (2, -3 - 2) = (2, -5). Setting this times (x1', x2')^T equal to 0, I can take the eigenvector x_2 to be (5, 2). Now notice something interesting: the first column of A - lambda_1 I, the matrix with rows (5, -5) and (2, -2), is (5, 2), which is exactly x_2; and similarly the first column of A - lambda_2 I is (2, 2), which is proportional to x_1 = (1, 1). So the columns of A - lambda_2 I give you x_1, the eigenvector corresponding to the first eigenvalue, and vice versa. This only works for 2 x 2 matrices, it does not work for larger matrices, but it is an interesting observation nonetheless. So basically, when I multiply A with a vector x, most vectors will not satisfy Ax = lambda x; only special numbers are eigenvalues and special vectors are eigenvectors. Normally, Ax scales the different components of x by different amounts and rotates the vector x, so it does not point in the same direction. The ones that do point in the same direction are called eigenvectors, and the corresponding scaling factor is defined to be the eigenvalue. So this is the basic notion of eigenvalues and eigenvectors and how to find the eigenvalues; once you have found them, you compute A - lambda I for each eigenvalue and find a basis for its non-trivial null space, and that gives you the eigenvectors corresponding to that eigenvalue. So now there are a couple more definitions. Yes? Question: There is no restriction on A, right?
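The worked 2 x 2 example can be reproduced in a few lines; this sketch recovers the characteristic polynomial, its roots, the eigenvectors, and the column curiosity noted above:

```python
import numpy as np

# The lecture's example: A = [[4, -5], [2, -3]],
# characteristic polynomial lambda^2 - lambda - 2, roots -1 and 2.
A = np.array([[4.0, -5.0],
              [2.0, -3.0]])

coeffs = np.poly(A)                  # coefficients of det(lambda*I - A): [1, -1, -2]
roots = np.sort(np.roots(coeffs))    # eigenvalues, sorted: -1 and 2

# Eigenvectors: nonzero vectors in the null spaces of A - lambda*I.
x1 = np.array([1.0, 1.0])            # for lambda_1 = -1
x2 = np.array([5.0, 2.0])            # for lambda_2 = 2
assert np.allclose((A - (-1) * np.eye(2)) @ x1, 0)
assert np.allclose((A - 2 * np.eye(2)) @ x2, 0)

# The 2x2 curiosity: the first column of A - lambda_1*I equals x2.
assert np.allclose((A - (-1) * np.eye(2))[:, 0], x2)
```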
I mean, A only needs to be a square matrix; it can be singular as well, and eigenvalues and eigenvectors will still exist? Answer: Yes. If A is singular, then there is a non-zero x such that Ax = 0. But I can write 0 as 0 times x, so x satisfies Ax = lambda x with lambda = 0. So if A is singular, you can certainly say that lambda = 0 is one of its eigenvalues. Yes.
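This closing remark is easy to verify; a short sketch (the rank-1 matrix below is my own illustrative choice) shows that a singular matrix has lambda = 0 as an eigenvalue, with a null-space vector as the eigenvector:

```python
import numpy as np

# A rank-1 (hence singular) matrix chosen for illustration.
A = np.array([[1.0, 2.0],
              [2.0, 4.0]])
assert np.isclose(np.linalg.det(A), 0)   # A is singular

eigvals = np.linalg.eigvals(A)
assert np.any(np.isclose(eigvals, 0))    # so 0 is among its eigenvalues

x = np.array([2.0, -1.0])                # nonzero vector in the null space of A
assert np.allclose(A @ x, 0 * x)         # A x = 0 = 0 * x, i.e. lambda = 0
```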