In the last class, we discussed the difference between classification and clustering in feature space. To implement most of these algorithms and computational methods, we need many mathematical tools, which are based on concepts of probability, statistics, linear algebra and vector spaces. Of course, there are other ways to do clustering and classification, based on neural networks, graph-based methods or syntactic analysis. Throughout this course we will concentrate mainly on statistical methods, and we will also touch upon a few neural methods for classification. To understand these methods well, we expect you to have some background in linear algebra, vector spaces, probability and statistics. Probability and statistics will be covered in the next class. Today we will look at the basics of linear algebra and vector spaces.

For those of you from mathematics, computer science or electrical engineering, this may be a revision of the basics. If you are conversant with the topic, you can skip this lecture and go forward to the next one. For those who have not used these methods in the recent past, we thought we would provide one lecture on the very basics of linear algebra and vector spaces.

So, we start with the concept of a matrix. A matrix is an array of numbers. It has applications in many branches of mathematics, computer science and electrical engineering. Using a matrix, you can represent a graph, you can solve a system of linear homogeneous equations, and you can represent any set of numbers which have some order between them. And of course, we are going to use matrices in our field of pattern recognition as well. What symbol is used to represent a matrix?
You usually represent a matrix by a capital letter, say $A$ or $B$. A matrix is a rectangular array of numbers. A typical notation is $a_{ij}$ or $b_{ij}$, where these are simply the elements of the matrix $A$ or $B$.

The next question is: what is the size, or dimension, of this matrix? We are talking of a two-dimensional array of numbers. Let us say the size of $A$ is $m \times n$, where $m$ and $n$ are two integers; for another matrix it can be something like $p \times q$. Essentially, a matrix consists of a set of rows and columns, and the number of rows multiplied by the number of columns gives the number of elements in the matrix. There are two subscripts, $i$ and $j$, which we will try to use consistently: one of them is the index for the row and the other for the column.

If I write out the elements $a_{ij}$ of the matrix $A$, the first row is $a_{11}, a_{12}, a_{13}$ and so on up to the last element $a_{1n}$. Similarly, the second row is $a_{21}, a_{22}, a_{23}$ and so on up to $a_{2n}$, and so on for the third and fourth rows, until you reach the $m$th row, which is $a_{m1}, a_{m2}$ and so on up to $a_{mn}$. The same thing applies to the elements $b_{ij}$ of the matrix $B$, where you simply replace the character $a$ by $b$. But you have to be careful, because the size of $B$ is $p \times q$. So the first row is $b_{11}, b_{12}$ and so on up to $b_{1q}$, the second row is $b_{21}, b_{22}$ up to $b_{2q}$, and the last row is $b_{p1}, b_{p2}$ up to $b_{pq}$.
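For reference, the layout just described can be written out as below. This is a reconstruction from the spoken description (the board itself is not part of the transcript), with $A$ of size $m \times n$ and $B$ of size $p \times q$; the bracket notation $[a_{ij}]_{m \times n}$ is one common way of attaching the size as a subscript.

```latex
A = [a_{ij}]_{m \times n} =
\begin{pmatrix}
a_{11} & a_{12} & \cdots & a_{1n} \\
a_{21} & a_{22} & \cdots & a_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
a_{m1} & a_{m2} & \cdots & a_{mn}
\end{pmatrix},
\qquad
B = [b_{ij}]_{p \times q} =
\begin{pmatrix}
b_{11} & b_{12} & \cdots & b_{1q} \\
b_{21} & b_{22} & \cdots & b_{2q} \\
\vdots & \vdots & \ddots & \vdots \\
b_{p1} & b_{p2} & \cdots & b_{pq}
\end{pmatrix}
```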
Notation-wise, the size of the array is sometimes written as a subscript, where you will find a cross or a star indicating that there are so many rows multiplied by so many columns. The index $i$ indicates the row, like the y axis in an x-y plot, and $j$ indicates the column under consideration. So an arbitrary element $a_{ij}$ is the element at the $i$th row and the $j$th column. I repeat: there are first, second and so on up to $m$ rows in the matrix $A$, and first, second up to $n$ columns. Similarly, $b_{ij}$ is the element at the $i$th row and the $j$th column of the matrix $B$. In that case you must ensure that $i$ and $j$ do not exceed $p$ and $q$ respectively, because $B$ is of size $p \times q$. Some notations start from $(0, 0)$ as the first element and stop at $(m-1, n-1)$ or $(p-1, q-1)$; it is only a matter of convention. Lowercase letters with subscripts denote an element of a matrix and its position, $i$th row and $j$th column; capital letters are the convention we will use to represent the entire matrix.

Let us take a few very simple examples of matrices. Let us talk of a matrix $A$ of size $4 \times 3$. What does that indicate? Basically there are 4 rows and 3 columns. Write down some elements, and that is a typical example of a matrix which consists of 4 rows and just 3 columns. A matrix can also be of a special type. Let us say I write a matrix $B$ like this. How many columns? Just one column. How many rows? 3 rows.
This is also possible: the value of $m$ is 3 here, the 3 rows, and the value of $n$ is 1, just one column. This special case of a matrix can also be considered to be a vector. We will introduce the word vector a little later, but for the time being note that these can form the elements of a vector as well. Of course, you can also have an array with many columns but just one row. A typical example is another matrix with the elements 4, 11, 17 and 5. It contains 4 columns and 1 row, so it is sometimes called a row vector. Look at the difference: the earlier one has one column, this one has one row. So we have a column vector and a row vector, both special cases of matrices.

So we have introduced what a matrix is and what its elements could be. We are talking about real numbers here, but remember that in some mathematical analysis the values can be complex numbers as well. Complex matrices exist, but we will keep them out of the scope of our discussion today, mainly because we may not need them much in the field of pattern recognition. We restrict the elements $a_{ij}$ and $b_{ij}$ to be real numbers.

There are a lot of applications of matrices. Our focus will be to use them in pattern recognition for classification, clustering and the analysis required for both. For that we need to manipulate the elements of a matrix, and there are various operations possible with matrices. We will first look at a few properties associated with matrices and at some special types of matrices, and then go into the various manipulations and operations on matrices.
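As a minimal sketch of the sizes and indices discussed above, here are the three kinds of example in plain Python, storing a matrix as a list of rows. The helper names `shape` and `element` are made up for illustration, and the entries of the 4 × 3 matrix and the column vector are hypothetical, since the lecture does not give them; only the row vector 4, 11, 17, 5 comes from the lecture.

```python
def shape(matrix):
    """Return (number of rows, number of columns) of a rectangular matrix."""
    rows = len(matrix)
    cols = len(matrix[0]) if rows else 0
    # A matrix is rectangular: every row must have the same length.
    if any(len(row) != cols for row in matrix):
        raise ValueError("rows differ in length")
    return rows, cols


def element(matrix, i, j):
    """Return a_ij with the lecture's 1-based convention (row i, column j)."""
    return matrix[i - 1][j - 1]  # Python lists themselves count from 0


A = [[1, 2, 3],
     [4, 5, 6],
     [7, 8, 9],
     [10, 11, 12]]             # 4 rows, 3 columns (hypothetical entries)
col_vector = [[2], [7], [5]]   # 3 rows, 1 column (hypothetical entries)
row_vector = [[4, 11, 17, 5]]  # 1 row, 4 columns (values from the lecture)

print(shape(A))           # (4, 3)
print(shape(col_vector))  # (3, 1)
print(shape(row_vector))  # (1, 4)
print(element(A, 2, 3))   # 6, the element at row 2, column 3
```

The `element` helper makes the point about conventions concrete: the mathematics counts rows and columns from 1, while the underlying list indices start at 0.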
Before we start looking into different properties and operations on matrices, let us get used to a few simple terms associated with matrices. So we had a matrix $A$ with elements $a_{ij}$: the first row $a_{11}, a_{12}$ up to $a_{1n}$, then the second row, the third row, and so on to the last row.

The diagonal of a matrix, often called the main diagonal, is the set of elements $a_{11}, a_{22}, a_{33}$ and so on, falling on the diagonal that goes from top left to bottom right; not the other diagonal, mind you. And the diagonal is not itself a matrix; it is a set of elements of the matrix. It runs up to $a_{nn}$, where we assume that $m$ is equal to $n$. That means the matrix is a square matrix. So that is another term: a matrix can be either square or non-square. If $m$ is not equal to $n$, the matrix is rectangular, or non-square. If $m$ is equal to $n$, the matrix is a square matrix.

Then there are some special matrices. The first is a very simple one called the zero matrix, whose elements are all zero. It can be a single row, a single column, or a set of rows and columns, as long as all the elements are zero. Similarly, you can have a matrix whose elements are all one, which is sometimes casually called a matrix of ones. The next is the identity matrix, which is given by the symbol $I$. This is a special case of what is called a diagonal matrix: the diagonal elements are all equal to one and the off-diagonal elements are all equal to zero.
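The terms above (square, main diagonal, zero matrix, matrix of ones, identity) can be sketched in a few lines of plain Python. The function names are illustrative assumptions, not a standard API, and the 3 × 3 example values are made up.

```python
def zeros(m, n):
    """m x n zero matrix: every element is 0."""
    return [[0] * n for _ in range(m)]


def ones(m, n):
    """m x n matrix of ones: every element is 1."""
    return [[1] * n for _ in range(m)]


def identity(n):
    """n x n identity: 1 on the main diagonal, 0 everywhere else."""
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]


def is_square(matrix):
    """True when the number of rows equals the number of columns."""
    return all(len(row) == len(matrix) for row in matrix)


def main_diagonal(matrix):
    """Elements a_11, a_22, ..., a_nn, from top left to bottom right."""
    if not is_square(matrix):
        raise ValueError("main diagonal is taken here for square matrices only")
    return [matrix[i][i] for i in range(len(matrix))]


A = [[1, 2, 3],
     [4, 5, 6],
     [7, 8, 9]]  # hypothetical square matrix

print(main_diagonal(A))        # [1, 5, 9]
print(identity(3))             # [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
print(is_square(zeros(2, 3)))  # False: 2 rows but 3 columns
```

Note that `main_diagonal` returns a list of elements, not a matrix, which mirrors the remark that the diagonal of a matrix is not itself a matrix.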
A general diagonal matrix, which is often referred to by the symbol $D$, is written with the elements $d_{11}, d_{22}$ and so on up to $d_{nn}$ on the diagonal, the matrix of course being square, and all the other elements are zero. Instead of writing all those zeros, they are often represented by a big zero. The big zero means that all the off-diagonal elements, the ones not sitting on the diagonal, are zero. This is a very important matrix, which we will use. And a special case of the diagonal matrix is the identity matrix, where the diagonal elements $d_{ii}$ are all equal to one.

Let us look at a few very simple preliminary operations on matrices. The first is the equality of matrices. When do you consider two matrices to be equal? Two matrices $A$ and $B$ are equal only if, first of all, they are of equal size, which means that if one is $m \times n$ the other is also $m \times n$, and second, the corresponding elements are equal: $a_{ij} = b_{ij}$ for all values of $i$ and $j$.

As with any elements in mathematics, you can also have addition and subtraction. Given two matrices $A$ and $B$, you can add them to get a resultant matrix $C = A + B$, or subtract them to get $D = A - B$. What does it involve? The elements of $C$ are $c_{ij} = a_{ij} + b_{ij}$ for all $i$ and $j$. Again, $A$ and $B$ have to be of the same size, or order, and the result has that same size. If you replace addition by subtraction, you get the matrix $D$ with the corresponding elements $d_{ij} = a_{ij} - b_{ij}$.

Let us look at a few more operations. Next is the scalar multiple of a matrix $A$.
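As a small sketch of the element-wise rules described above (equality, addition and subtraction), together with the scalar multiple, which simply multiplies every element by a constant $k$, the following plain-Python helpers may help. All names and example values are assumptions made for illustration.

```python
def shape(matrix):
    """(rows, columns) of a non-empty rectangular matrix."""
    return len(matrix), len(matrix[0])


def diagonal(values):
    """Square matrix with the given values on the diagonal, 0 elsewhere."""
    n = len(values)
    return [[values[i] if i == j else 0 for j in range(n)] for i in range(n)]


def equal(a, b):
    """A == B only if the sizes match and a_ij == b_ij for all i, j."""
    if shape(a) != shape(b):
        return False
    rows, cols = shape(a)
    return all(a[i][j] == b[i][j] for i in range(rows) for j in range(cols))


def _combine(a, b, op):
    """Apply op element by element; both matrices must have the same size."""
    if shape(a) != shape(b):
        raise ValueError("matrices must be of the same size")
    rows, cols = shape(a)
    return [[op(a[i][j], b[i][j]) for j in range(cols)] for i in range(rows)]


def add(a, b):
    return _combine(a, b, lambda x, y: x + y)   # c_ij = a_ij + b_ij


def subtract(a, b):
    return _combine(a, b, lambda x, y: x - y)   # d_ij = a_ij - b_ij


def scale(k, a):
    return [[k * x for x in row] for row in a]  # each element becomes k * a_ij


A = [[1, 2], [3, 4]]  # hypothetical values
B = [[5, 6], [7, 8]]

print(add(A, B))        # [[6, 8], [10, 12]]
print(subtract(A, B))   # [[-4, -4], [-4, -4]]
print(scale(3, A))      # [[3, 6], [9, 12]]
print(equal(diagonal([1, 1]), [[1, 0], [0, 1]]))  # True: diagonal of ones is I
```

Raising an error on mismatched sizes is a design choice here; it reflects the requirement that both matrices be of the same order before they can be added or subtracted.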
If you have a matrix $A$ and you multiply it by a constant $k$, which is called a scalar value, you get another matrix $B = kA$. What does it involve? The elements of the resultant matrix $B$ are very simply $b_{ij} = k \, a_{ij}$. What is this $k$? Any number; it can be real or complex, but of course in this case we are looking only at real numbers.

The fourth operation, and the most important among those we have discussed so far, is matrix multiplication. Given two matrices $A$ and $B$, I want to multiply them to get another matrix $C = AB$, and the order is very important here. You cannot always multiply. The sizes of the matrices have to be such that if $A$ is of size $m \times n$, then $B$ is of size $n \times p$; that is the reason I have used a different symbol for the second dimension of $B$. That means the number of columns in $A$ has to be equal to the number of rows in $B$. I repeat: it is possible to multiply two matrices only when the number of columns in $A$ is equal to the number of rows in $B$. The resultant matrix $C$ then has size $m \times p$: $m \times n$ is the size of $A$, $n \times p$ is the size of $B$, and the resultant size is $m \times p$.

The other thing you must know, of course, is: what are the elements of $C$? Before I get to the elements $c_{ij}$ as a function of the elements of $A$ and $B$, I would also like to ask: are $AB$ and $BA$ the same? That is, can I reverse the order and multiply $B$ by $A$? In general I will not even be able to do that if the sizes are the way they are given here.
I can reverse the two provided $p$ is equal to $m$, because for the product $BA$ the number of columns in $B$ must be the same as the number of rows in $A$. So let us say for the time being that $m$ is equal to $p$, so that I am multiplying an $n \times p$ matrix by a $p \times n$ matrix. Or let us take the situation where the sizes of both are the same, $m \times m$, so that both are square matrices of the same size. It is not necessary for both to be the same size to get a matrix multiplication done; what we said is that the number of columns in $A$ must be the same as the number of rows in $B$, and then I should be able to multiply. I have changed the sizes to make $A$ and $B$ square matrices of the same size only because I want to reverse the order. Both products are then of size $m \times m$, but in general $AB$ is not equal to $BA$. Under special cases it is possible, but we will not discuss that now.

This operation of matrix multiplication is of extreme importance and use in many branches of science and engineering, not only in the field of pattern recognition. We need to multiply a lot of matrices; it is a very common operation. And the property that the number of columns of the first equals the number of rows of the second is very, very important.

What are the elements of the matrix $C$? Let us go back and restore the sizes the way we had them: an $m \times n$ matrix $A$ is multiplied with an $n \times p$ matrix $B$. So $C$ has $m$ times $p$ elements $c_{ij}$.
One such element $c_{ij}$ can be represented as a function of some of the elements of $A$ and $B$, namely the $i$th row of $A$ and the $j$th column of $B$, and it can be written as a summation: $c_{ij} = \sum_{k=1}^{n} a_{ik} b_{kj}$. It is basically a sum of products. Look at how $k$ is varying. In $a_{ik}$ it is the second subscript, which means that by varying $k$ you obtain all the elements of the $i$th row of $A$; in $b_{kj}$ it is the first subscript, so varying $k$ gives all the elements of the $j$th column of $B$. So if I write $C = AB$, then for an arbitrary element $c_{ij}$ I am talking about the $i$th row of $A$ and the $j$th column of $B$, and I did say earlier that the number of columns in $A$ is equal to the number of rows in $B$, so the two have the same number of elements. The formula says: take the first element of the $i$th row of $A$ and the first element of the $j$th column of $B$, multiply them, and you get the first term in the summation; the second elements give the second term, and so on. How many terms? $n$ terms, because the row of $A$ has $n$ elements and the column of $B$ has $n$ elements.

Carrying on, the next operation is the transpose of a matrix. If $A$ is a matrix with elements $a_{ij}$ and size $m \times n$, the symbol typically used for its transpose is $A^{T}$, and some books use $A'$. In a very short notation its elements are $a_{ji}$, and its size is $n \times m$.
Basically, the rows have become the columns and the columns have become the rows of the matrix $A^{T}$.

Next is the trace of a matrix $A$. The trace is defined as the sum of the diagonal elements, $\mathrm{tr}(A) = \sum_{i=1}^{n} a_{ii}$, assuming that $A$ is a square matrix of size $n \times n$.

Having learnt a few operations on matrices, we will now see a few properties of matrix operations. There are a lot to be listed, so I am just writing a few and we will discuss them. The notation is self-explanatory; most of them concern addition and subtraction of matrices, or matrix multiplication in certain cases. The thing you must keep in mind is that a lowercase letter represents a scalar, typically a real number, which can be a fraction also, and a capital letter represents a matrix. So in a rule where $a$ is a scalar and $B$ and $C$ are matrices, such as $a(B \pm C) = aB \pm aC$, the plus-and-minus notation means you are either adding or subtracting; it does not matter, the same result holds good. These are some of the laws of arithmetic operations on matrices, and these properties hold good.

Continuing, I will write a few more. These are again elementary operations, mostly to do with subtraction and multiplication, and they involve a big 0. This big 0 is a matrix of zeros; it can be thought of as a zero matrix. Note also that in some of these rules the two matrices are the same matrix $A$; it is not that $A$ and $B$ are different matrices.
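The operations defined so far, matrix multiplication as a sum of products, the transpose and the trace, can be sketched in plain Python as follows. This is only an illustration under assumed names (`matmul`, `transpose`, `trace`) and made-up example values; it is not an efficient implementation.

```python
def matmul(a, b):
    """C = AB with c_ij = sum over k of a_ik * b_kj.

    A is m x n and B is n x p, so the result is m x p.
    """
    m, n = len(a), len(a[0])
    n_rows_b, p = len(b), len(b[0])
    if n != n_rows_b:
        raise ValueError("columns of A must equal rows of B")
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(p)]
            for i in range(m)]


def transpose(a):
    """Rows become columns: an m x n matrix turns into an n x m matrix."""
    return [list(column) for column in zip(*a)]


def trace(a):
    """Sum of the main-diagonal elements of a square matrix."""
    if len(a) != len(a[0]):
        raise ValueError("trace is defined for square matrices")
    return sum(a[i][i] for i in range(len(a)))


A = [[1, 2, 3],
     [4, 5, 6]]          # 2 x 3 (hypothetical values)
B = [[7, 8],
     [9, 10],
     [11, 12]]           # 3 x 2

print(matmul(A, B))      # [[58, 64], [139, 154]], a 2 x 2 result
print(transpose(A))      # [[1, 4], [2, 5], [3, 6]]

# Two square matrices of the same size: both orders are defined,
# but the products differ in general.
P = [[1, 2], [3, 4]]
Q = [[0, 1], [1, 0]]
print(matmul(P, Q))      # [[2, 1], [4, 3]]
print(matmul(Q, P))      # [[3, 4], [1, 2]]
print(trace(P))          # 5
```

For example, the entry 58 is the first row of `A` against the first column of `B`: 1·7 + 2·9 + 3·11, which has n = 3 terms.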
So both matrices must be the same; only then can you write these rules, and the same holds for the rules to do with powers. The next one is very simple for you to prove: take a small matrix $A$ of any size, which could even be non-square, and transpose it twice. Each transpose operation interchanges the rows and the columns, so when you do it twice the rows come back to rows and the columns come back to columns, and you get back the matrix $A$: $(A^{T})^{T} = A$. In the rule involving $c$, $c$ is a scalar. The last one is the most important: the transpose of a product is equal to the product of the respective transposes in the reverse order, $(AB)^{T} = B^{T} A^{T}$. I repeat: remember that in general $AB$ is not equal to $BA$, so when you take the transpose of a product you have to interchange the order of the multiplication.

One of the most important operations on matrices is the inverse of a matrix. Given a matrix $A$, we are again talking in general of a square matrix; inversion is also possible for a non-square matrix, but we will take that up later on. The inverse of a square matrix $A$ is usually given by the symbol $A^{-1}$. The elements of $A^{-1}$ can be expressed in terms of the elements of $A$ itself, but we will not go into such an expression right now. So suppose a matrix $B$ is the inverse of a matrix $A$; that is, you invert $A$ and you get $B$.
Then these two matrices $A$ and $B$ satisfy the following property: the product of a matrix and its inverse, in either order, results in the identity matrix, $AB = BA = I$. $B$ can be thought of as the inverse of $A$, or $A$ as the inverse of $B$. The identity matrix $I$ is the diagonal matrix in which the diagonal elements are all equal to one and the off-diagonal elements are equal to zero.

For inverting a matrix, there are many different algorithms in the field of mathematics and numerical analysis. I will leave it to you for self-study to find out the numerical methods which can invert a matrix; I am not getting into that. What I am saying is that, first of all, the matrix must be square, and then the matrix $A$ must satisfy another property for its inverse $B$ to exist. It is possible to construct $B$ only if the determinant of $A$ is not equal to zero, $\det(A) \neq 0$. In such a case $A$ is called non-singular, and only when $A$ is non-singular can you obtain its inverse. If $A$ is singular, the determinant is equal to zero and the inverse does not exist. And all of this is possible only in the case of a square matrix.

We will now look at some properties related to the inverse of a matrix. Suppose the product of two matrices $A$ and $B$ is invertible; you know the condition of invertibility, namely that the matrix is non-singular, so the determinant of the product $AB$ is not equal to zero. Then its inverse is very simple to write: $(AB)^{-1} = B^{-1} A^{-1}$.
This is the same thing we had some time back with the transpose of a product: $(AB)^{T}$ is $B^{T}$ multiplied by $A^{T}$, and the same thing holds good with the inverse. And, as with a sequence of two transposes, if you invert the matrix $A$ and then invert the result again, you get back the matrix $A$: $(A^{-1})^{-1} = A$. For any integer $n$, where $A^{n}$ basically means the product of $A$ with itself $n$ times, if $A^{n}$ is invertible then its inverse can be written as $A^{-n}$, or equivalently you take the inverse of $A$ and raise it to the power $n$: $(A^{n})^{-1} = A^{-n} = (A^{-1})^{n}$. The last of this list of properties is that if you take the transpose of a matrix and then its inverse, this is equal to taking the inverse of $A$ followed by its transpose: $(A^{T})^{-1} = (A^{-1})^{T}$.

There are many algorithms to take the inverse of a matrix, but we will take a simple example to understand the process of inversion. Take a very simple $2 \times 2$ matrix $A$ with elements $a, b$ in the first row and $c, d$ in the second. Then the inverse can be written as $A^{-1} = \frac{1}{ad - bc} \begin{pmatrix} d & -b \\ -c & a \end{pmatrix}$. It is very interesting to note what the term in the denominator is: it is the determinant of $A$, $ad - bc$. As for the rest of the terms, the diagonal elements $a$ and $d$ are interchanged and the off-diagonal elements $b$ and $c$ change sign; this arrangement relates to another property of matrices which we have not defined yet. Next we come to a few more special matrices, including triangular matrices.
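The 2 × 2 inverse and the properties of the inverse listed above can be checked numerically with a short sketch. It uses Python's standard `fractions.Fraction` so that the results are exact; the function names and the example matrices are assumptions made for illustration, and the 2 × 2 closed form is only one simple way to invert, not a general algorithm.

```python
from fractions import Fraction


def matmul(a, b):
    """Matrix product, c_ij = sum over k of a_ik * b_kj."""
    n = len(b)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(len(b[0]))]
            for i in range(len(a))]


def transpose(a):
    return [list(column) for column in zip(*a)]


def inverse2(m):
    """Inverse of a 2 x 2 matrix [[a, b], [c, d]] via (1/det) [[d, -b], [-c, a]]."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is singular (determinant is 0), no inverse")
    det = Fraction(det)  # exact arithmetic, avoids floating-point rounding
    return [[d / det, -b / det], [-c / det, a / det]]


I = [[1, 0], [0, 1]]
A = [[2, 1], [5, 3]]   # hypothetical, determinant 1
B = [[1, 2], [3, 4]]   # hypothetical, determinant -2

A_inv, B_inv = inverse2(A), inverse2(B)

print(matmul(A, A_inv) == I and matmul(A_inv, A) == I)        # True
print(inverse2(matmul(A, B)) == matmul(B_inv, A_inv))         # True: reverse order
print(inverse2(A_inv) == A)                                   # True
print(inverse2(matmul(A, A)) == matmul(A_inv, A_inv))         # True: (A^2)^-1
print(inverse2(transpose(A)) == transpose(A_inv))             # True
print(transpose(matmul(A, B)) == matmul(transpose(B), transpose(A)))  # True
```

A singular matrix such as `[[1, 2], [2, 4]]` has determinant 0, so `inverse2` raises `ValueError` for it, which matches the condition that only a non-singular matrix can be inverted.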
Take a diagonal matrix $D$. I hope you remember the notation of the big zeros, which indicate that the off-diagonal terms are zero. For the diagonal terms I have used just one subscript, $d_1, d_2$ up to $d_n$, because they are basically $d_{11}, d_{22}$ and so on up to $d_{nn}$ for a matrix of size $n \times n$; for the sake of simplicity one subscript is enough to indicate $d_{ii}$. Then the corresponding matrix $D^{-1}$ is again diagonal, with diagonal elements $1/d_1, 1/d_2$ up to $1/d_n$. That means you very simply take the diagonal elements and put their reciprocals.

Triangular matrices: there are two special types, which are not the same. You can have the non-zero elements only on the diagonal and to the upper right, which is an upper triangular matrix, or only on the diagonal and to the lower left, with rows such as $l_{11}$; then $l_{21}, l_{22}$; and so on down to $l_{n1}, l_{n2}$ up to $l_{nn}$, which is a lower triangular matrix. In the upper triangular case you can replace everything to the lower left by a big zero, as we did for the diagonal matrix; in the lower triangular case you have the diagonal terms and anything to the top right is a big zero.

Symmetric matrices: $A$ is a symmetric matrix if $a_{ij} = a_{ji}$, which means the element at the $i$th row and $j$th column is the same as the element at the $j$th row and $i$th column. In an example, it does not matter what is on the diagonal; look at the symmetric part of the matrix, where each element above the diagonal should be the same as the corresponding element below it. That gives an example of a symmetric matrix, say $A_s$.

Some properties: for an arbitrary matrix $A$, both $A A^{T}$ and $A^{T} A$ are symmetric. If $A$ is invertible and symmetric, then $A^{-1}$ is also symmetric. However, if $A$ is only invertible, then the products $A A^{T}$ and $A^{T} A$, taken in either order, are both invertible. I think this topic.