Hi, we are learning methods for computing eigenvalues and eigenvectors. In the last class we introduced the power method, and in this lecture we will learn the convergence theorem for the power method. Let us quickly recall the iterative procedure. We first choose an initial vector x^(0) arbitrarily, and then, for k = 0, 1, 2, and so on, the power method generates two sequences: a sequence of real numbers mu_k and a sequence of vectors x^(k). They are defined by mu_{k+1} = y_j^(k+1) and x^(k+1) = y^(k+1) / mu_{k+1}, where y^(k+1) = A x^(k). That is, the vector y in the present iteration is obtained by multiplying A with the vector x computed in the previous iteration; therefore it is an iterative procedure. Once you get y, you find the coordinates at which the maximum norm is attained, take the minimum of those indices, call it j, and take the value y_j^(k+1) as mu_{k+1}. Once you have mu_{k+1}, you compute x^(k+1) = y^(k+1) / mu_{k+1}. This goes on for all k = 0, 1, 2, and so on. So this is the iterative procedure for the power method. We also saw in the last class an example where the power method was successfully applied: we computed 10 iterations and saw that the power method sequences seem to converge to the dominant eigenvalue lambda_1, which happens to be 3 in that example, and to a corresponding eigenvector of lambda_1. Let us now take another example where the power method is not successful. We take a matrix B, given like this, whose eigenvalues are 1, minus 2, and 2. You can see that 2 is a dominant eigenvalue of B, but B does not have a unique dominant eigenvalue, because 2 and minus 2 both have the largest modulus.
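The procedure recalled above can be sketched in a few lines of Python. The matrix and starting vector below are illustrative stand-ins, not the lecture's example (which is on the slides): the assumed test matrix [[2, 1], [1, 2]] has eigenvalues 3 and 1, so its dominant eigenvalue is 3.

```python
def power_method(A, x0, num_iters):
    """Power method with maximum-norm scaling.

    mu_{k+1} is the entry of y^(k+1) = A x^(k) at the smallest index
    where the maximum norm of y is attained, and
    x^(k+1) = y^(k+1) / mu_{k+1}.
    """
    x = list(x0)
    mu = None
    for _ in range(num_iters):
        # y^(k+1) = A x^(k)
        y = [sum(a * xi for a, xi in zip(row, x)) for row in A]
        # smallest index attaining the maximum norm of y
        m = max(abs(v) for v in y)
        j = min(i for i, v in enumerate(y) if abs(v) == m)
        mu = y[j]                 # signed entry, not just the norm
        x = [v / mu for v in y]   # x^(k+1) is a unit vector in the max norm
    return mu, x

# Hypothetical test matrix: eigenvalues 3 (dominant) and 1.
A = [[2.0, 1.0], [1.0, 2.0]]
mu, x = power_method(A, [1.0, 0.0], 60)
print(mu)  # approaches the dominant eigenvalue 3
print(x)   # approaches the eigenvector (1, 1)
```

The scaling by the signed entry mu_{k+1} (rather than by the norm) is what makes mu_k itself converge to the dominant eigenvalue, sign included.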
Let us use the power method to compute the iterates mu_k and x^(k) and see how these sequences come out. For that we have to choose an initial guess; again let us take it arbitrarily as (1, 1, 1). In the first iteration the vector y happens to be (8, 3, -2), and you can see that the maximum norm of y is attained at the first coordinate; therefore that value is taken as mu_1. Once you have mu_1, x^(1) is computed as before. Recall that we are supposed to converge to either 2 or minus 2; let us see what is happening. The second iteration comes out as shown. Let us just keep observing how the sequence mu behaves. In the third iteration mu_3 comes out to be 4.111 and so on, and in the fourth iteration it is 1.11. You can observe the pattern: in the second iteration it is something like 1.1, in the third iteration 4.1, and again in the fourth iteration 1.1. Like that it keeps jumping between something around 1 and something around 4. I kept going up to the 997th iteration, and I still saw mu jumping between around 3.6 and around 1.1; it never seems to converge to the dominant eigenvalue, neither 2 nor minus 2. You can see that in the 999th iteration it again jumped from 1.103 back to 3.6, and at the thousandth iteration it did the same. I then went for quite some more iterations and observed that the sequence mu kept jumping between 3.6-something and 1.1-something. That gives us a feeling that, for the matrix B in this example, the power method is not converging to the dominant eigenvalue, and therefore not to a corresponding eigenvector either. This raises an interesting question: when can we expect convergence of these sequences? For that, let us state the following theorem. We need the following hypotheses. You are given an n x n matrix A; suppose A satisfies the following hypotheses.
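The lecture's matrix B is shown on the slide and not reproduced here, so the sketch below uses an assumed stand-in matrix with the same eigenvalues 2, -2, 1 (constructed as P diag(2, -2, 1) P^{-1} for a convenient invertible P). Running the same iteration reproduces the phenomenon described above: mu oscillates between two values (here approaching 1 and 4) instead of settling at 2 or minus 2.

```python
# Hypothetical stand-in for the lecture's matrix B:
# eigenvalues 2, -2, 1, so the dominant eigenvalue is not unique.
B = [[0.0, -2.0, 2.0],
     [-1.5, -0.5, 1.5],
     [0.5, -0.5, 1.5]]

x = [1.0, 1.0, 1.0]          # arbitrary initial guess
mus = []
for _ in range(40):
    # one power-method step: y = B x, scale by the max-norm entry
    y = [sum(b * xi for b, xi in zip(row, x)) for row in B]
    m = max(abs(v) for v in y)
    j = min(i for i, v in enumerate(y) if abs(v) == m)
    mus.append(y[j])
    x = [v / y[j] for v in y]

print(mus[-4:])   # alternates between two values; neither is 2 or -2
```

Because 2 and -2 have equal modulus, the contributions of their eigenvectors never separate, and the scaled iterates flip between two directions with period two, just as in the lecture's experiment.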
Hypothesis 1 is that A has a unique dominant eigenvalue lambda_1. These are conditions we have already spoken about in the last class, so we now know why we impose them; it is clear from the way we constructed the sequences mu_k and x^(k), and therefore you can understand why this hypothesis is part of the convergence theorem. The next hypothesis is that A has n linearly independent eigenvectors, that is, a set of eigenvectors v_1, v_2, ..., v_n that forms a basis for R^n. Again, recall from the last class why this hypothesis is needed: when we choose an arbitrary vector as the initial guess, we start with its representation in terms of the eigenbasis, and from there we recursively pre-multiply by A; that is how we constructed the sequences. Therefore this condition is also important in the construction of the power method sequences. The third condition concerns the initial guess: x^(0) should lie outside the kernel of A^k for every k = 1, 2, ..., and moreover, when you write x^(0) as a linear combination of the eigenvectors, the first scalar c_1 should not be equal to 0. Recall that the term converging to a scalar multiple of the eigenvector is the one multiplied by c_1; if c_1 is 0, the first term in our expansion vanishes, and the sequence will never converge to the dominant eigenvalue and its corresponding eigenvector. Therefore c_1 must be nonzero for the initial guess that we choose.
If all these hypotheses are satisfied, then we can say that the sequence mu_k converges to the dominant eigenvalue lambda_1 of the matrix A; as for the sequence x^(k), it may not converge as a full sequence. What we can conclude, and this is very important, is that we can find a subsequence of x^(k) that converges to an eigenvector of lambda_1. Let us try to prove this theorem. Recall the definition of the sequence: x^(k+1) = y^(k+1) / mu_{k+1}. What is y^(k+1)? It is nothing but A x^(k), the matrix A applied to the x from the previous iteration, so x^(k+1) = A x^(k) / mu_{k+1}. Now, x^(k) is nothing but y^(k) / mu_k, so in place of x^(k) I write A y^(k) / (mu_{k+1} mu_k). And y^(k) is nothing but A x^(k-1), so replacing y^(k) by A x^(k-1) gives x^(k+1) = A^2 x^(k-1) / (mu_{k+1} mu_k). Once you understand this step, you can apply the same idea again to get A^3 x^(k-2), which leaves one more mu in the denominator: mu_{k+1} mu_k mu_{k-1}. How long can you keep going? Until you reach x^(0), because that was the starting point of the iteration. When you reach x^(0) you will have A^{k+1} in the numerator and the product mu_{k+1} mu_k ... mu_1 in the denominator. Writing m_{k+1} for the reciprocal of this product, we obtain x^(k+1) = m_{k+1} A^{k+1} x^(0). So the sequence x^(k+1) is finally written like this in terms of x^(0).
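The unrolling described above can be written compactly as:

```latex
x^{(k+1)} = \frac{A\,x^{(k)}}{\mu_{k+1}}
          = \frac{A^{2}\,x^{(k-1)}}{\mu_{k+1}\,\mu_{k}}
          = \cdots
          = \frac{A^{k+1}\,x^{(0)}}{\mu_{k+1}\,\mu_{k}\cdots\mu_{1}}
          = m_{k+1}\,A^{k+1}\,x^{(0)},
\qquad m_{k+1} := \frac{1}{\mu_{k+1}\,\mu_{k}\cdots\mu_{1}} .
```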
Well, recall that x^(0) is something we chose arbitrarily, and since our set of eigenvectors forms a basis for R^n and x^(0) is a vector in R^n, you can write x^(0) as a linear combination of the eigenvectors, x^(0) = c_1 v_1 + c_2 v_2 + ... + c_n v_n, and our hypothesis says that x^(0) is such that c_1 is not equal to 0. Now apply A^{k+1} to this sum. Taking A^{k+1} inside, you can see that A^{k+1} v_i is nothing but lambda_i^{k+1} v_i. Calculations of this kind were already done in our last class, so it should not be difficult to see how we get this expression. So each term involves lambda_i^{k+1} v_i; then you pull lambda_1^{k+1} out of the first term and divide it into all the other terms. This is exactly how we constructed the power method sequences in the last class; I am just repeating the same idea one more time as part of the proof of this theorem, nothing new. Now, if you observe the way x^(k+1) is defined, you can see that it is a unit vector with respect to the maximum norm, because x^(k+1) is nothing but y^(k+1) divided by mu_{k+1}, and mu_{k+1} is the value of the coordinate of y^(k+1) at which the maximum norm is attained. Therefore, if you take the maximum norm on both sides, the numerator gives the norm of y^(k+1), the denominator |mu_{k+1}| is precisely that same norm, they cancel, and you get 1. So x^(k), as we defined it, is a unit vector with respect to the maximum norm, the L-infinity norm. Therefore, taking the maximum norm on both sides of our expression for x^(k+1), everywhere with the maximum norm, the left-hand side equals 1; that is what we have written here.
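In symbols, the eigenbasis expansion referred to above is:

```latex
A^{k+1}x^{(0)} \;=\; \sum_{i=1}^{n} c_i\,\lambda_i^{k+1} v_i
\;=\; \lambda_1^{k+1}\!\left( c_1 v_1
      + \sum_{i=2}^{n} c_i \left(\frac{\lambda_i}{\lambda_1}\right)^{\!k+1} v_i \right),
\qquad \left|\frac{\lambda_i}{\lambda_1}\right| < 1 \ \text{ for } i \ge 2 ,
```

where the strict inequality is exactly the unique-dominance hypothesis, so every term after the first dies out as k grows.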
Therefore you can see that |m_{k+1} lambda_1^{k+1}| is equal to 1 divided by the norm of the bracketed vector; it comes to the denominator on the left-hand side. Now take the limit as k tends to infinity. All the ratios lambda_i / lambda_1 for i >= 2 are less than 1 in absolute value, meaning they lie strictly between minus 1 and 1; therefore, raising them to the power k+1 and letting k tend to infinity sends those terms to 0, and you are left with |c_1| times the infinity norm of v_1. Taking that to the other side, the limit of |m_{k+1} lambda_1^{k+1}| is 1 / (|c_1| ||v_1||_inf), which is a finite number. Now consider the limit of x^(k+1) itself. Recall how x^(k+1) was written: x^(k+1) = m_{k+1} lambda_1^{k+1} (c_1 v_1 + terms that go to 0), so the limit of x^(k+1) is the limit of m_{k+1} lambda_1^{k+1} c_1 v_1. We know the modulus of m_{k+1} lambda_1^{k+1} converges, and its limit 1/(|c_1| ||v_1||_inf) is independent of k; but if we remove the modulus, the quantity m_{k+1} lambda_1^{k+1} may tend to +1/(|c_1| ||v_1||_inf), or to minus that value, or it may oscillate between the two. Therefore the limit of x^(k+1), when it exists, will be plus or minus v_1 / ||v_1||_inf: remember c_1 appears in the numerator and |c_1| in the denominator, so they cancel, leaving only a plus or minus sign.
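Putting the two steps above together:

```latex
\|x^{(k+1)}\|_\infty = 1
\;\Longrightarrow\;
\bigl| m_{k+1}\lambda_1^{k+1} \bigr|
  \xrightarrow[k\to\infty]{} \frac{1}{|c_1|\,\|v_1\|_\infty},
\qquad
x^{(k+1)} = m_{k+1}\lambda_1^{k+1}\bigl(c_1 v_1 + o(1)\bigr)
  \;\longrightarrow\; \pm\,\frac{v_1}{\|v_1\|_\infty}
\ \text{(along a subsequence),}
```

since only the modulus of m_{k+1} lambda_1^{k+1} is known to converge, not its sign.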
So this entire limit will be either plus v_1 / ||v_1||_inf or minus v_1 / ||v_1||_inf, or the sequence may simply oscillate between these two, because it is only in modulus that the sequence converges; the sequence itself may oscillate in sign. We cannot ignore that case, and that is why, if you recall, in our statement we only guarantee that a subsequence converges: the full sequence may be an oscillating sequence, as we have seen here, because x^(k+1) comes from m_{k+1} lambda_1^{k+1}, of which we only know absolute convergence; we cannot conclude that it converges, it may oscillate. So that completes the proof of conclusion 2, which says that we can always find a convergent subsequence. If the limit is the plus case, the subsequence is the full sequence itself; similarly for the minus case. If the sequence oscillates between the two, then you can choose the terms close to one value, so that this subsequence converges to it, or you may choose the terms close to the other value, and that subsequence converges to the other one. That is why we can only guarantee convergence up to a subsequence. So we have proved conclusion 2. Now let us try to prove conclusion 1, which is about the convergence of the sequence of real numbers mu_k. Let us see how to prove that mu_k converges to lambda_1. First note that y^(k+1) = A x^(k); this is how we defined our sequence. Therefore, taking limits on both sides, the limit as k tends to infinity of A x^(k) equals A applied to the limit of x^(k).
Now, we know what this limit looks like; we have already derived it. Whether we write x^(k+1) or x^(k) does not matter, because we are letting k tend to infinity. The limit of x^(k), along the chosen subsequence, is one of the cases above: plus or minus v_1 / ||v_1||_inf. Write K for this constant, K = +1/||v_1||_inf or K = -1/||v_1||_inf, so that the limit of x^(k) is K v_1. Then the limit of A x^(k) can be written as K lambda_1 v_1: you apply A to K v_1, the constant K comes out, and A v_1 is nothing but lambda_1 v_1. That is what we are writing here. Remember, this sequence converges to this limit only up to a subsequence, because it may simply be oscillating between the plus and minus cases; that case should not be ignored. Now take one typical coordinate of the vector v_1, say the j-th coordinate, chosen to be nonzero; that is the important point. Choose the same coordinate in each of the terms of the sequence y^(k), and you can see that this coordinate sequence converges to K lambda_1 times the j-th component of v_1. That is, y_j^(k+1) converges to this real number as k tends to infinity, up to a subsequence; I may loosely say "converges", but it is always up to a subsequence. Therefore there exists an integer N > 0 such that y_j^(k+1) is nonzero for all sufficiently large k.
Why? Because we chose a coordinate of v_1 that is nonzero. How can we do that? Because v_1 is surely a nonzero vector: it is an eigenvector of lambda_1, and eigenvectors are nonzero by definition, so at least one coordinate of v_1 is nonzero. Now, this coordinate sequence converges to that nonzero value, so for sufficiently large k the sequence takes nonzero values, and consequently x_j^(k+1) is also nonzero for sufficiently large k, because x_j^(k+1) is nothing but y_j^(k+1) / mu_{k+1}, where mu_{k+1) is the max-norm entry of y and y_j^(k+1) is nonzero. Therefore the j-th coordinate of the x vector is also nonzero, at least after sufficiently many terms. So we have obtained this much; now let us see what we can do with it. Recall once more that x^(k+1) = y^(k+1) / mu_{k+1}; by this time we understand this very clearly. Now bring mu_{k+1} to the other side: mu_{k+1} x^(k+1) = y^(k+1). These are vectors, so pick the j-th coordinate on both sides; that coordinate satisfies the same equation, and since x_j^(k+1) is nonzero for large k, I can write mu_{k+1} = y_j^(k+1) / x_j^(k+1). The numerator is just the j-th coordinate of A x^(k), and the denominator is retained as it is. So we have this expression for mu_{k+1} in terms of one typical coordinate of the vectors y and x. With this, take the limit as k tends to infinity, both for the numerator and for the denominator.
For the numerator we already know the limit: the vector A x^(k) tends to K lambda_1 v_1, so its j-th coordinate tends to K lambda_1 times the j-th coordinate of v_1. Similarly, without the multiplication by A, the denominator x_j^(k+1) tends to K times the j-th coordinate of v_1; we proved that in the previous step. Combining these two pieces of information, the limit of mu_{k+1} is K lambda_1 (v_1)_j divided by K (v_1)_j. Now K cancels, and the j-th coordinate of v_1 in the numerator cancels with the same nonzero coordinate in the denominator, leaving only lambda_1. Therefore the sequence mu_k converges to the dominant eigenvalue of the matrix A, and that completes the proof of the convergence theorem for the power method. Well, there are some serious disadvantages of the power method. What are they? The power method requires at the outset that the matrix has only one dominant eigenvalue, that is, the dominant eigenvalue should be unique; but when we are handed a matrix we do not know any information about its eigenvalues and eigenvectors. In fact, that is exactly why we are going for a numerical method, so knowing this information in advance is impossible. The second disadvantage is that the initial guess must be such that, in its eigenbasis representation, the first scalar c_1 is nonzero. This is also something we cannot check when choosing the initial guess, simply because we do not know the eigenvectors. Therefore we have essentially no way to verify the hypotheses of the power method, and applying it to an arbitrary matrix is, computationally, more or less blind work.
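The final cancellation described above reads:

```latex
\mu_{k+1} \;=\; \frac{y_j^{(k+1)}}{x_j^{(k+1)}}
          \;=\; \frac{\bigl(A\,x^{(k)}\bigr)_j}{x_j^{(k+1)}}
\;\xrightarrow[k\to\infty]{}\;
\frac{K\,\lambda_1\,(v_1)_j}{K\,(v_1)_j} \;=\; \lambda_1,
\qquad (v_1)_j \neq 0 ,
```

valid along the convergent subsequence, with K = ±1/||v_1||_inf as above.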
So this is something we have to keep in mind, and we have also seen examples where the power method may not converge. At least we can now see when it converges and when it does not, especially when we have more information about the matrix. Let us examine how the power method behaves when one or another hypothesis of the theorem is violated. Take the first hypothesis: it says the matrix should have a unique dominant eigenvalue. What happens if the dominant eigenvalue is not unique? We have already illustrated this at the beginning of this class: we took a matrix B whose eigenvalues are 1, minus 2, and 2. Although B has distinct eigenvalues, its dominant eigenvalue is not unique, and we saw, at least up to 1000 iterations, that the power method sequence was not approaching the dominant eigenvalue or a corresponding eigenvector; it kept oscillating between two numbers. That gives us a feeling that the power method was not converging in that particular example. From this we can say that when the dominant eigenvalue of a matrix is not unique, the power method may not converge. Now take hypothesis 2, which says that a set of eigenvectors should form a basis for R^n. What happens if the dominant eigenvalue is unique but repeated, that is, not simple? Say the algebraic multiplicity of the dominant eigenvalue is r > 1. If the geometric multiplicity is also r, then the power method still converges. However, if the geometric multiplicity is not equal to the algebraic multiplicity, then the power method may not work. This is just a remark. Now let us go to hypothesis 3.
Hypothesis 3 says, importantly, that when you represent your initial guess as a linear combination of the eigenvectors, the scalar in the first term, c_1, should not be equal to 0. That is an important condition, and we have no way to check it when applying the power method to an arbitrary matrix unless we know all the eigenvectors, which is too much to expect; if we knew all that information, why would we go for the power method at all? So practically this is not something we can check; purely for the purpose of understanding, we now discuss what happens if c_1 = 0 for the chosen initial guess x^(0). Let us take this nice matrix A; it is nice because it has a unique dominant eigenvalue. Now take the initial guess x^(0) = (0, 0.5, 0.25), and see what happens. At iteration 1 we get the values shown; just keep watching mu, which gives a clear idea of where we are converging. In the second iteration mu immediately jumps to 1, which is exactly the second dominant eigenvalue of the matrix A; iteration 3 gives the same, and iteration 4, and so on. Once it gets stuck at this value it is not going to move anywhere. Therefore you can conclude that the power method, in this particular case, that is, for this matrix A with initial guess (0, 0.5, 0.25), converges to the second dominant eigenvalue. Why? Look at the representation c_1 v_1 + c_2 v_2 + c_3 v_3 = x^(0); writing it out with the eigenvectors, it is c_1 (1, 0, 2) + c_2 (0, 2, -5) + c_3 (0, 1, -3) = (0, 0.5, 0.25).
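As a quick check, the coefficients can be computed by solving that 3-by-3 linear system whose columns are the eigenvectors read out in the lecture. The small Gaussian-elimination helper below is only an illustrative sketch (any linear solver would do).

```python
def solve3(M, b):
    """Solve a 3x3 system M c = b by Gaussian elimination with partial pivoting."""
    n = 3
    # build the augmented matrix [M | b]
    aug = [row[:] + [bi] for row, bi in zip(M, b)]
    for col in range(n):
        # pivot: bring the largest entry in this column to the diagonal
        piv = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[piv] = aug[piv], aug[col]
        for r in range(col + 1, n):
            f = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= f * aug[col][c]
    # back substitution
    c = [0.0] * n
    for r in range(n - 1, -1, -1):
        c[r] = (aug[r][n] - sum(aug[r][j] * c[j] for j in range(r + 1, n))) / aug[r][r]
    return c

# Columns are the eigenvectors v1 = (1,0,2), v2 = (0,2,-5), v3 = (0,1,-3).
M = [[1.0, 0.0, 0.0],
     [0.0, 2.0, 1.0],
     [2.0, -5.0, -3.0]]
x0 = [0.0, 0.5, 0.25]
c1, c2, c3 = solve3(M, x0)
print(c1, c2, c3)   # c1 comes out to 0, so the v1 term is absent
```

Solving gives (c_1, c_2, c_3) = (0, 1.75, -3): the coefficient of the dominant eigenvector vanishes, which is exactly why the iteration locks onto the second dominant eigenvalue.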
From here you can clearly see that c_1 is 0. That means, in all our derivation, if you go back to the previous lecture and recall what happens when we keep multiplying A into x^(0), the first term in the expansion vanishes. So who dominates now? What remains is the expansion from the second term onwards, so the second dominant eigenvalue takes over the role of the dominant eigenvalue: when c_1 = 0 you can write A^k x^(0) = lambda_2^k times (c_2 v_2 plus terms involving (lambda_i / lambda_2)^k for i >= 3); the role of c_1 v_1 is now played by c_2 v_2. When k tends to infinity, all the remaining terms go to 0, and you are left with the term involving lambda_2; that is why the sequence converges to the second dominant eigenvalue. Note carefully: it is not because the first coordinate of x^(0) is 0. The coordinates of the initial guess do not by themselves decide where the sequence mu converges; it is because c_1 = 0 in the eigenbasis representation that the sequence converges to the second dominant eigenvalue. Students often make this mistake: they see that the first coordinate of the initial guess is 0 and immediately conclude that the sequence will converge to the second dominant eigenvalue. No; you have to write the linear system, solve it for c_1, c_2, c_3, and check whether c_1 is 0. If c_1 = 0 and the second dominant eigenvalue is unique, then the sequence converges to the second dominant eigenvalue; similarly, if c_1 and c_2 are both 0 and the third dominant eigenvalue is unique, then the sequence converges to the third dominant eigenvalue, and so on. This is very important to remember. In our case c_1 happened to be 0, and that is why the sequence converged to the second dominant eigenvalue lambda_2. With this we will stop our discussion of the power method for this class; we will continue the discussion in the next class. Thanks for your attention.