So we are waiting for people to come back from the breakout rooms and then we can start with the next tutorial. OK, great. So I think that everybody is back in the main room. So it's my pleasure to introduce Zach Miller, who is a PhD student in Ecology and Evolution at the University of Chicago. So it's very early for him, it's about 7 AM, so I want to thank him for being with us so early. And he will give a tutorial on linear algebra. So please, Zach, if you want to share the presentation and unmute. Great, can you see that? Thanks. OK, well, thank you, Yacuba, for the introduction and the invitation.

So the aim of this hour-long tutorial is to give a very brief survey of linear algebra. It's a field that has a lot of terminology and foundational baggage, and so the idea here is that we're not going to be able to go through too many proofs today, and we're not going to be able to go through too many computations, but hopefully we can go through the important concepts well enough to be able to use some of these ideas in a biological context. I'm going to start with the real foundations, and we're going to work our way up to some more complex calculations and ideas. So the beginning might be review for many people, but bear with me, and hopefully we'll get to interesting stuff for everyone. And linear algebra, I'll make my little pitch, is definitely the kind of material that shows up everywhere, and it shows up in so many different guises that it's really worth seeing it repeatedly and thinking about it in different ways. So hopefully this will also be an opportunity to get a new perspective if it's content you've seen before.

So any introduction to linear algebra starts with the idea of a vector. For our purposes, a vector is an ordered list of numbers. I've shown a few here. And just a note on notation: often vectors are denoted either with lowercase boldface letters or sometimes with lowercase letters with a little arrow on top. I'm going to stick to boldface letters, but I'm showing both here just because you might see them written in different ways. When we think about vectors as an ordered list of numbers, our vectors will have real or complex numbers as their components. These constituent numbers, the entries in the vectors, are what we call the components of the vector. For the most part, we'll be talking about vectors with real components, but sometimes complex components.

And we can also think about vectors graphically. So already we're seeing that there are two ways to look at them: we can think about them as these ordered lists, or we can think about them as a kind of arrow in space with a direction and a magnitude. So I'll go back one second. This u here corresponds to this vector 3, 2. You can see that it's 3 units over in the first coordinate and 2 units up in the second coordinate. Similarly, the second vector v is this negative 1, 1. OK, so this is the geometric picture that we've probably all seen before.

And then we can do operations with our vectors. Vectors of the same size can be added component-wise. So for example, if we want to take those vectors u and v and take their sum, we just add the components: 3 plus negative 1 becomes 2, 2 plus 1 becomes 3. So the sum u plus v becomes a new vector 2, 3 that can also live in the same geometric space.
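To make these operations concrete, here is a tiny sketch in Python with NumPy using the u and v from the slide; the tutorial itself shows no code, and R or any other numerical environment would work just as well.

    import numpy as np

    u = np.array([3, 2])    # the vector u = (3, 2)
    v = np.array([-1, 1])   # the vector v = (-1, 1)

    # component-wise addition of two vectors of the same size
    print(u + v)            # [2 3], matching the sum u + v = (2, 3)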
Graphically, we think about this as adding the vectors tip to tail, as people like to say. So we take the first vector u, and then at its tip we place the tail of the second vector, and we get the sum u plus v. This vector addition, because it basically comes from the addition of the scalar components, the real or complex values in here, follows all the normal rules of the addition of those numbers: namely, it's commutative and associative, and scalar multiplication distributes over it. So we can reverse the order of the sum here and we get the same vector out.

Vectors can also be multiplied by scalars. So we can take just a regular old real number here and multiply it by the vector u, and what that does is it just multiplies each component of u by that scalar. So here, our u goes from being 3, 2 to 3 halves, 1. And in the picture here, you can see that we're basically scaling its length. So when we multiply by a scalar, we're stretching the vector, but not changing its direction in space. The only exception, when I say not changing the direction, is that we can flip the vector by multiplying by a negative number. So now we're taking the vector and it's remaining on the same line in space, but the orientation is pointing the other way. This is just the case of negative 1 half times u.

Combining these two operations, the component-wise addition of vectors of the same size and multiplication by scalars, gives us a slightly more general notion of a linear combination. And this is a really foundational concept in linear algebra. So over here on the right, I'm showing that if we take any two vectors u and v from some set of vectors that we're going to call a vector space, and some scalars c1 and c2, then we can write very generally the linear combination c1 times u plus c2 times v. And you can see here the component-wise definition of this. The outcome will always be a new vector w that's also going to be in our vector space. So we can say that the vector space is closed under linear combination. It's the set of all vectors that we can get by making linear combinations of the vectors we start with.

OK, so let's talk a little bit more about linear combinations. Again, linear combinations generate new vectors from old. Given a collection of vectors, so you give me some vectors, the set of all other vectors that I can express as linear combinations of those is called the span. So this is another important idea that we're going to rely on a lot throughout this tutorial. Just to get a feel for using this term, here we say that w is in the span of u and v. If I start with these two vectors u and v, I can produce a new vector w by taking a linear combination of u and v, a sort of weighted sum of u and v. What that looks like is, again, I start with my u and v. Now I multiply them both by scalars, so I stretch v a little bit, I shrink u a little bit, and then I take these new scaled vectors and I sum them, and I can produce w. So because w is expressible as a linear combination of u and v, it's said to be in the span of u and v. And t is another example of a vector in the span of u and v, just to make clear that we can use negative numbers to make these vectors point in the other direction. So this t is, again, just produced by scaling and then summing u and v. And we can also, sort of strangely, use span as a verb as well as a noun. So we sometimes say things like u and v span the entire plane.
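Continuing the same sketch, here is what scalar multiplication and a general linear combination look like; the coefficients are arbitrary choices just for illustration.

    import numpy as np

    u = np.array([3, 2])
    v = np.array([-1, 1])

    print(0.5 * u)       # [1.5 1. ]   scaling stretches or shrinks u
    print(-0.5 * u)      # [-1.5 -1. ] a negative scalar also flips its orientation

    # a linear combination c1*u + c2*v with some illustrative scalars
    c1, c2 = 0.5, 2.0
    w = c1 * u + c2 * v
    print(w)             # the resulting vector w lies in the span of u and v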
So I'll skip to the punchline here and say that I've shown you w and t, but really we can produce any vector we want that can be drawn in this plane by taking combinations of u and v. And this leads us to another really important idea, which is the idea of linear independence. If we take a set of vectors s, and none of the vectors in that set can be written as a linear combination of the others, then we say that s is linearly independent. If that's not the case, then s is linearly dependent. So if there is a vector, or in that case there might be many vectors, in s that can be written as a linear combination of other vectors in s, then the set is linearly dependent.

So a few more examples. In this picture, if we consider the set of just u and v, the set of those two vectors is linearly independent. And that's because if we take just v on its own, the only way to take linear combinations of v by itself is to scale it. So all we could do is shrink or stretch this v; we can never produce u by doing that, and vice versa. The set u, w is linearly independent for the same reason. But the set u, v, w is linearly dependent, and that's because, as we just saw, we can produce w as a linear combination of v and u. So we're going to see this idea of linear independence again and again.

And linear independence leads right away to another concept, the concept of a basis. A basis is a linearly independent set of vectors that spans a vector space. So we specify some vector space, for example the plane here. If we have a set of vectors that spans it, so that through their linear combinations they can generate any other vector in the plane, and they're linearly independent, then they form a basis for that space. And the idea of a basis is the idea of a kind of minimally sufficient set of vectors. The spanning part tells us that these vectors are enough to generate the entire space under consideration. The linear independence part tells us that none of them are redundant. So if we add more vectors and the set becomes linearly dependent, that means basically we have more vectors than we need to span that space.

As some examples here, the set u, v is a basis for the plane, for R2. So I use this kind of fancy R to denote the real numbers and the 2 to say, basically, the plane, the set of pairs of real numbers. The set u, w is also a basis for R2. The set u, v, w is not a basis for R2, and the reason is because u, v, w, well, it does span R2, so we can use that set to generate any vector in R2, but it's a linearly dependent set, right? So it's not minimal; it has redundant vectors in it. And u on its own is clearly not a basis for R2 because, of course, we can only use u to generate vectors that lie on the line in the direction of u, so we can't generate any vector in the plane.

So the idea of a basis is actually one that is, I think, in many ways familiar. Even if you haven't really looked at linear algebra in this formal way before, any time we work in a coordinate system, we are usually thinking about a basis called the standard basis. Even the axes that we often draw for a 2D or a 3D space, we can think of as basis vectors. So this is the standard basis. It's often convenient to work in reference to the standard basis. Here we often use e i to denote the i-th standard basis vector.
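One way to check these independence claims numerically is to stack the vectors as the columns of a matrix and ask how many of them are linearly independent, which is what NumPy's matrix_rank reports (rank is introduced properly later in the tutorial). This is just a sketch, with a w built from u and v by construction.

    import numpy as np

    u = np.array([3, 2])
    v = np.array([-1, 1])
    w = 0.5 * u + 2.0 * v    # w is a linear combination of u and v by construction

    # rank 2 means two linearly independent columns
    print(np.linalg.matrix_rank(np.column_stack([u, v])))     # 2 -> {u, v} is independent
    print(np.linalg.matrix_rank(np.column_stack([u, v, w])))  # 2 -> {u, v, w} is dependent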
So e1 is the vector with just a 1 in the first component and 0s everywhere else, e2 is the vector with a 1 in the second component and 0s everywhere else, and so on for whatever kind of space we might want to look at. And in this picture, these are just vectors of length 1 in the direction of the usual coordinate axes. And then whenever we do Euclidean plotting, we think about ordered pairs of numbers like (a, b), and we can think about these just as linear combinations of the standard basis vectors: a scalar a times e1 plus a scalar b times the vector e2. In component form, that looks like (a, 0) plus (0, b). So all this is to say that these concepts are ones that we actually work with all the time.

OK, so far we've relied a lot on the 2D picture. The real beauty of linear algebra, and the primary reason it was developed, is that it gives us a common notation and a common set of tools to deal with very high dimensional spaces, spaces of arbitrary dimension. So in physics, we might think about the space of physical coordinates, like 3D space or something. In biology, we might think about our vectors as being maybe the abundances of different species in a community. And so if we go to a diverse community, we might have hundreds or thousands of species, and so we might be dealing with vectors that represent the state of the community that have hundreds or thousands of components. We might think about the concentrations of molecules in a cell. Whatever the application, the nice thing about linear algebra is that the tools we're going to develop can accommodate anything from one to two to three to 10 to 100 to 1,000 components at a time. So while it's useful to draw the 2D pictures, and I'm really not much good at drawing anything beyond the 2D picture, it's useful to right away start thinking about dimensions beyond two.

So let's take a moment and think about the three-dimensional picture. We have the vectors s and t. Do these form a basis for the space R3, the three-dimensional space? It turns out no. As a kind of counterexample, here is a vector: you can try as hard as you'd like, but you can never write this vector as a linear combination of the other two. Now, I just pulled that vector out of the air, and we'll see as we go along how we can address this question more systematically. But this is to say that in the space R3, two vectors are not going to cut it to span that space. But we shouldn't say that s and t are not a basis generally. They're not a basis for R3, but they are a basis for some space: they're a basis for what we can call a subspace of R3. Geometrically, they form a basis for a plane, a two-dimensional space that we can think of as embedded in R3, that goes through the origin of the three-dimensional space but is a kind of lower-dimensional space within it. That's a perfectly good vector space in its own right. It's closed under linear combinations: if I take linear combinations of s and t, I'm going to produce new vectors that are in that subspace. But these two vectors don't span the entire R3 space. And I may have already slipped and used this word, because it's really hard to avoid it, but this leads us to the important concept of dimension. We're used to thinking about dimension kind of heuristically.
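Here is a sketch of how you might check that kind of claim numerically; the vectors s and t and the candidate vector are made up here, since the slide's actual numbers aren't in the transcript. If appending the candidate to the set increases the rank, the candidate is not in the span of s and t.

    import numpy as np

    s = np.array([1.0, 0.0, 1.0])         # hypothetical s and t spanning a plane in R3
    t = np.array([0.0, 1.0, 1.0])
    candidate = np.array([0.0, 0.0, 1.0])  # a vector we suspect lies outside that plane

    rank_st = np.linalg.matrix_rank(np.column_stack([s, t]))
    rank_all = np.linalg.matrix_rank(np.column_stack([s, t, candidate]))
    print(rank_st, rank_all)   # 2 then 3: the candidate is not in the span of s and t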
And linear algebra gives us a very nice and precise way to think about dimension. So if we have a vector space, it's defined by a basis. There are in general many choices of basis that define the same vector space, but given a basis, we've specified a vector space, and the size of that basis gives the dimension of the vector space. And the way we've defined a basis, the fact that the basis has to both span the space and be linearly independent, these two kind of opposing requirements, one that the set of vectors is sufficient to span the space and the other that it's minimal, mean that the dimension is a unique number. So any basis for the same vector space will have the same number of elements, and so the same dimension.

So I've shown you that we can take two vectors and combine them to produce a new vector as a linear combination. But how do we go backwards? If I have some arbitrary vector and I have some basis, how do I represent that vector as a linear combination of the basis vectors? How do I write it in the space defined by that basis? This problem, while it seems like a sort of intellectual exercise, is equivalent to solving a system of linear equations, which is really one of the most important tasks that we have in science. Systems of linear equations show up just about everywhere. They are historically extremely important and really motivated the development of linear algebra. So let's see quickly that these two exercises are really the same.

Here the graphical picture is that we have some u and v, we have some w, and I want to find the two scalar coefficients c1 and c2 such that I can write w as a linear combination of u and v. Now let's look at something that at first glance is totally different. Here's a system of linear equations. This kind of problem arises all over the place. We have here three unknowns, x, y, and z. Then we have coefficients for each of those unknowns, and on the right-hand side of the equations we have just these scalars. What we can see, if we write down the system in this nicely formatted way, is that these coefficients 2, 4, and 7 all get multiplied by x, these coefficients 1, 4, and negative 2 all get multiplied by y, and these coefficients negative 1, 2, and negative 3 all get multiplied by z. What that suggests is that the ratios among 2, 4, and 7 are fixed: by changing the value of x, we can only change these values in lockstep with each other. And we can see that we can actually rewrite this equation with this 2, 4, 7 now as a vector and this x as a scalar coefficient out front. So we can really recognize the system of linear equations as exactly the sort of linear combination problem we were talking about. Now here we have some vectors, (2, 4, 7), (1, 4, negative 2), and (negative 1, 2, negative 3), and we want to find a linear combination of them such that we produce this kind of target vector, this arbitrary vector (negative 1, 4, 3). And so we want to know what these coefficients x, y, and z are. And to do that, we need to introduce the idea of a matrix and develop a little bit of machinery here. But first, maybe, if anybody has any questions at this point before we keep going, feel free to raise your hand. Yes, so I think it's a good moment to ask a question.
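As a quick computational aside, the system just described can be handed to a numerical solver; the coefficients below are arranged from the values read out above, with each unknown multiplying one column, and the exact numbers should be treated as illustrative.

    import numpy as np

    # columns are the vectors (2, 4, 7), (1, 4, -2), (-1, 2, -3)
    A = np.array([[2.0,  1.0, -1.0],
                  [4.0,  4.0,  2.0],
                  [7.0, -2.0, -3.0]])
    b = np.array([-1.0, 4.0, 3.0])    # the target vector (-1, 4, 3)

    xyz = np.linalg.solve(A, b)       # the unknown coefficients x, y, z
    print(xyz)
    print(np.allclose(A @ xyz, b))    # True: the linear combination reproduces b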
If anyone wants to ask one, either in the chat or raise your hand, I think we can take a couple of minutes to answer any question. No, I think it's a sign that everything is extremely clear. All right, great. Yeah, then we'll keep moving.

OK, so we're going to pause on that question for just a second to introduce this new machinery. So now we're looking at matrices. A matrix is just a rectangular array of numbers. Before, we were looking at vectors, which were just a list of numbers; now we're looking at a structured array of numbers. The notation we're going to use consistently is that a matrix is denoted by a capital letter, like A, B, or C. You can see here that matrices can have different sizes. So that's the first important thing to talk about: the size of a matrix is given by the number of its rows, each of these things going across here we call a row, and the number of its columns, each of these stacks of numbers here is a column. And the usual format we give for the size of a matrix is rows by columns. So this matrix down here, A, for example, is 2 by 2. This one in the middle is 3 by 3. And this one on the right is 3 by 2. OK, so the numbers of rows and columns don't necessarily need to agree with each other.

And when we talk about matrices, we usually talk about elements. Why we use elements for matrices and components for vectors, don't ask me. But the elements of a matrix are the numbers that make it up. Here we can talk in general about an n by m matrix, a matrix with n rows and m columns. Its entries are denoted by a little a i, j, with i telling us which row, so a number between 1 and n, and j telling us which column, a number between 1 and m. So for example, this 1 here is in the first row of B and the second column, so we would call that b 1, 2. This 10.5 here is in the second row and second column of C, so we might call that c 2, 2.

OK, so as with vectors, we can operate on matrices. We can add matrices of the same size element-wise. This is much like the addition of vectors: we add these two matrices, we take 2 plus negative 1, we get 1 in the 1, 1 position; in the 1, 2 position, we add 3 and negative 3, we get 0, and so on. Matrices can also be multiplied by scalars. As before, we multiply by a scalar, and that means every element of the matrix is multiplied by that scalar. So this negative 2 just goes inside the matrix to each entry, and you can follow the multiplication there. Again, I think the lectures are going to be available later, so if I fly through any of these examples, you can always go back and take another look. So again, we have the notion of a linear combination here. Linear combinations of matrices, as with vectors, produce new matrices. And what that actually tells us is that matrices form their own vector space: we can take any two matrices of the same size, produce linear combinations, and we get a new matrix of the same size out.

OK, matrices come with two new and slightly more interesting operations, transposition and multiplication. The transpose operation, which we write with this little T here, which is sometimes confusing because it looks like a power, is really a special operation that takes an n by m matrix and basically flips it on its axis. The outcome of this transposition is an m by n matrix where the i, j entry of the new resulting matrix is the j, i entry of the original matrix A.
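In the running NumPy sketch, these element-wise operations and the transpose look like this; the matrices are arbitrary stand-ins rather than the ones on the slides.

    import numpy as np

    A = np.array([[2, 3],
                  [-1, 2]])
    B = np.array([[-1, -3],
                  [2, 0]])

    print(A + B)     # element-wise addition of two matrices of the same size
    print(-2 * A)    # a scalar multiplies every element
    print(A.T)       # transpose: the (i, j) entry becomes the (j, i) entry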
OK, so here's an example. If we take this matrix A and we take its transpose, now we just flip it, and what was the 2, 1 entry, the a 2, 1, becomes the 1, 2 entry. So that's where this 2.2 goes from here to there. So that's fairly straightforward.

Matrix multiplication is worth talking about for a minute longer. We can multiply matrices, but the multiplication operation is somewhat different from how we extended addition. Whereas addition of matrices works element-wise, multiplication of matrices is a new thing altogether. If we take two matrices, A, which is n by m, and B, which is p by q, we can multiply these two matrices only if the number of columns of A, this m, is equal to the number of rows of B, this p here. And we'll see why in a second. In this definition of matrix multiplication, each entry of the product AB, the i, j-th entry, for example, down here, is defined by taking a sum of many products over the entries of A and B. So to get the i, j-th element of AB, we take a i1 times b 1j, plus a i2 times b 2j, and so on. This is it in compact summation form: we sum basically over the columns of A and the rows of B. So that's this k, the index of summation, for whatever i and j we're interested in.

OK, so here's that in action very briefly. To get the product of these two matrices, let's first think about the first element of the product, the 1, 1 element, which here is a 4. We get that by multiplying basically the first row of the first matrix by the first column of the second matrix. So we have this 2 times negative 1 plus 3 times 2. That gives us negative 2 plus 6, which is 4. We can do this for any of the elements of the resulting matrix. I'll just go through one more. This 5 down here, which is in the second row and the first column, we get by multiplying the second row of the first factor matrix and the first column of the second factor matrix. So we get a negative 1 times negative 1 plus a 2 times 2, and that gives us 5. So it's worth spending a little time practicing this if it's not something you're familiar with. I'll leave the formula up here for a moment longer.

OK, so matrix multiplication, again, works a little bit differently than normal multiplication. And one thing right away that's a bit different is the multiplicative identity, the element such that if we take B times the multiplicative identity, we get B back. In normal multiplication, that role is played by the number 1: if we take 10 times 1, we get 10 back. In matrix multiplication, that identity element is what's called the identity matrix, and it's a special matrix that looks like this. It's an n by n matrix that has 1s on the diagonal and 0s everywhere else. It's worth trying it out for yourself, but if you multiply any matrix by this I n, you'll get just the original matrix back. It's worth noting that this identity matrix is a very special example of a more general type of matrix called a diagonal matrix: a matrix whose only non-zero elements are on its diagonal, elements like a i, i, so a 1, 1, a 2, 2, where the row and column indices equal each other. And we'll see some more diagonal matrices soon. Another important fact about matrix multiplication is that it's associative. So if we have A times B times C, we can multiply A and B first, or we can multiply B and C first; it doesn't matter.
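Here is the multiplication rule written out as an explicit loop over the summation formula, checked against NumPy's built-in product; the first matrix and the first column of the second come from the worked example above, while the rest of the second matrix is filled in arbitrarily.

    import numpy as np

    A = np.array([[2, 3],
                  [-1, 2]])
    B = np.array([[-1, 0],     # the first column (-1, 2) is from the worked example;
                  [2, 1]])     # the second column is an arbitrary stand-in

    # (AB)_ij = sum over k of A_ik * B_kj, written out as loops
    n, m = A.shape
    _, q = B.shape
    C = np.zeros((n, q))
    for i in range(n):
        for j in range(q):
            for k in range(m):
                C[i, j] += A[i, k] * B[k, j]

    print(C)                              # first column is (4, 5), as in the example
    print(np.allclose(C, A @ B))          # True: matches the built-in matrix product
    print(np.allclose(A @ np.eye(2), A))  # True: multiplying by the identity returns A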
And it's distributive: if we have A times the sum of two matrices, we can distribute that multiplication across the sum. But it is generally not commutative. So A times B is not equal to B times A. And maybe you've already seen that this kind of has to be true because of the way we define matrix multiplication: it depends on the interior dimensions, this m and p, being the same. So if we take B times A instead, then we have to worry about whether q and n are the same, those other sizes. That's just to say that A times B will not generally be the same as B times A, even when the sizes work out, actually.

OK, so with matrix multiplication in hand, we can now go back to our vectors for one moment, because matrix multiplication gives us some interesting concepts when we look at vectors. First, if we take the transpose operation, it turns our usual vectors around. I've been writing vectors as columns, and that's just conventionally how we think of them: vectors are almost like matrices that have n rows and one column. But if we take the transpose of a column vector like that, we get a row vector, which is now like a 1 by n matrix. And if we take two vectors and we apply the transpose to one of them, we can actually multiply them using matrix multiplication. For example, if we take a vector v and a vector u that have the same number of components and we transpose the v, then we have a 1 by n vector times an n by 1 vector, or essentially a 1 by n matrix times an n by 1 matrix. And so we can just multiply them using the usual matrix multiplication rule. This is often called the dot product, or it's an example of an inner product. You can see down here, we just end up with v1 times u1 plus v2 times u2 and so on. This is a really important operation that will come up all the time.

One very interesting connection worth seeing is that the inner product or dot product of a vector with itself, so v transpose v, is equal to the Euclidean length, or the norm, of the vector squared. So if we take v transpose v, we have v1 squared plus v2 squared plus v3 squared, and so on. And if we were to take the square root of this, we would get the normal Euclidean length of the vector. And if we take the inner product of two different vectors, or even the same one, the product is actually going to be a function of their lengths and the angle between them. So we can explicitly write it as a function of their lengths and angle. So v transpose u is going to be equal to u transpose v, so this operation is symmetric in this case, and this will be equal to basically the length of u times the length of v, so the square root of the inner product u transpose u times the square root of the inner product v transpose v, times the cosine of their angle. So in this picture here, the inner product that comes out is just a function of the length of u, the length of v, and this angle here formed by the two vectors in space, and that holds even in the higher-dimensional picture. And importantly, if v transpose u is equal to 0, so if this inner product is equal to 0, that can only happen for non-zero-length vectors when this cosine term is equal to 0. And what that means is that the angle between these two vectors is a 90 degree angle, a right angle. So this idea of orthogonality generalizes the idea of being perpendicular, or at right angles, that you've probably seen in a geometry class.
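The same relations in the running sketch; the vectors are again arbitrary illustrations.

    import numpy as np

    u = np.array([3.0, 2.0])
    v = np.array([-1.0, 1.0])

    dot = v @ u                   # the inner (dot) product v^T u
    print(dot)                    # -1.0
    print(np.isclose(u @ u, np.linalg.norm(u) ** 2))    # True: u^T u is the squared length

    # recover the angle from v^T u = |u| |v| cos(theta)
    cos_theta = dot / (np.linalg.norm(u) * np.linalg.norm(v))
    print(np.degrees(np.arccos(cos_theta)))             # the angle between u and v

    # orthogonality: a zero inner product means a right angle
    print(np.array([1.0, 0.0]) @ np.array([0.0, 5.0]))  # 0.0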
OK, so that's just a little aside. Now, back to matrix multiplication: it's often useful to view matrices as bundles of column vectors. So let's just think about multiplying a matrix by a column vector. We have a matrix of some arbitrary size, and we're going to multiply it by a column vector, an m by 1 matrix essentially. If we think through the matrix multiplication, what we see is that this v1 is going to multiply all of the elements in the first column of A, the v2 is going to multiply all the elements in the second column, the v3 is going to multiply all the elements in the third column. And so we can actually rewrite this matrix multiplication as v1 times the vector formed from the first column of A, plus v2 times the column vector formed from the second column of A, and so on. So really, taking this product A times v is giving us a linear combination of the columns of A.

OK, so this is bringing it all back to the question we had maybe 20 minutes ago about representing a vector in a basis or, equivalently, solving a general linear system. These two things are both equivalent to the matrix equation A v equals b. We have a matrix A, the columns of which are our basis vectors; we have a v, which is a vector of unknown coefficients that we'd like to figure out; and we have a b, which is a kind of arbitrary target vector. If this were a normal equation in scalars, if this were just 4 times v equals 2, the way we would solve it is we would multiply each side by 1 over 4, by 1 over whatever this coefficient is. Because we're dealing with matrices, we can't quite do that, but we want to find a new matrix, an inverse matrix for A, such that when we multiply this inverse of A times A, we just get v on the left. So it's very much in analogy to what we would do solving a normal linear equation. Here what I'm writing is that this A inverse times A gives us the identity matrix. If we could find such a matrix, then solving this system of equations would just basically turn into a matrix multiplication: we could take that matrix A inverse, multiply it times b using the formula we just saw, and we would get v equals A inverse b.

So that's a nice idea. And then the question is, when can we find such an inverse matrix, and how do we find it? Well, if A is an n by m matrix, there will be a unique solution of this form whenever the columns of A comprise a basis for Rn. And that comes from the fact we were seeing earlier: we need the columns of A to span the whole space in order to be guaranteed that there is a linear combination that works, and we need them to be a basis, we need them to be linearly independent, otherwise there's a sort of degeneracy here. So to find a unique solution, we need those columns to comprise a basis for Rn, and then the geometric picture that we saw guarantees to us that there'll be a unique solution here. But we'll look a little more at that in a minute. So this is just a restatement of what I just said; it's a really important fact to keep in mind. In principle, we can write down formulas for the inverse of A. They're very cumbersome, and they're not really even worth looking at here. In practice, we compute inverses, and generally solve matrix equations, using numerical algorithms. So maybe you've used these algorithms in R any time you've solved a matrix equation.
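Two points from this passage in the running sketch: A v really is a linear combination of the columns of A, and when the columns form a basis we can solve A v = b either with an explicit inverse or, better in practice, with a solver. The matrix and vectors are the illustrative ones from the earlier example.

    import numpy as np

    A = np.array([[2.0,  1.0, -1.0],
                  [4.0,  4.0,  2.0],
                  [7.0, -2.0, -3.0]])
    v = np.array([1.0, 2.0, 3.0])

    # A v equals v1 * (first column) + v2 * (second column) + v3 * (third column)
    by_columns = v[0] * A[:, 0] + v[1] * A[:, 1] + v[2] * A[:, 2]
    print(np.allclose(A @ v, by_columns))                  # True

    b = np.array([-1.0, 4.0, 3.0])
    A_inv = np.linalg.inv(A)                               # exists: det(A) is non-zero
    print(np.allclose(A_inv @ b, np.linalg.solve(A, b)))   # True; solve() is preferred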
And it's the kind of thing best left to numerical algorithms. But it is worth thinking a lot about when we expect there to be an A inverse, and being able to work with the inverse of A symbolically.

OK, so one other way of looking at matrices is as representing linear transformations from one vector space to another. So what's a linear transformation? A linear transformation is an operation that maps a vector space into another vector space. It encompasses a lot of the mappings that you might dream up, like rotating the vectors around the origin, scaling them, that is, stretching them, reflecting them, and compositions of these operations. The formal definition for some function to be a linear transformation is that, whatever this function is, it has to distribute over sums, and when we take the function of some scalar times a vector, the scalar has to come out of the transformation. All of the operations that I mentioned up here satisfy this property. And it turns out that if A is an n by m matrix, then the multiplication A times x actually describes a linear transformation from the space Rm to the space Rn. Any linear transformation can be expressed this way, and any matrix can be thought of as corresponding to a linear transformation.

So here's just a little picture. If we take some vectors and we map them to a new space by multiplying by A, so we take the vectors that I've shown here and multiply them by this matrix I show above, we get some new vectors. This 1, 1 vector here gets rotated and stretched; again, this negative 1, negative 1 gets rotated and stretched. So we can see there's this composition of these different operations. I just drew four vectors; of course, there are infinitely many vectors in this space, and they all get rotated and stretched and scaled by this operation.

So this gives us another meaning for the matrix inverse. If we take one matrix times another, what we're doing is basically applying two different linear transformations sequentially: we're first rotating and stretching one way, and then we're rotating and stretching some other way. And in particular, if we multiply by the matrix inverse, that matrix inverse is a new transformation that undoes the old transformation. So if we first took our original vectors and multiplied them by A to be mapped into this new space, now if we multiply by A inverse, we actually just map back to the original vectors. OK, so here's the inverse matrix. So again, this is two rounds of matrix multiplication that takes us from our original space to a new space, and then back to the old vectors we started with.

So we start with some vector space, we apply a linear transformation, and we're going to generate some new vectors. The set of all possible vectors, all the vectors that we can generate by applying this linear transformation, is called the range of the transformation. In this picture, we start in this Rm space, and we're mapping to this Rn space. And there's going to be some subset of vectors, potentially the whole Rn, but maybe just a subset, that our initial vector space maps into. This is called the range of the transformation.
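A small sketch of the "transform, then undo" idea; the matrix here is an arbitrary invertible 2 by 2, not the one on the slide.

    import numpy as np

    A = np.array([[1.0, 1.0],
                  [0.0, 2.0]])           # an arbitrary invertible transformation
    vectors = np.array([[1.0, 1.0],
                        [-1.0, -1.0],
                        [1.0, 0.0]]).T   # each column is one vector being transformed

    transformed = A @ vectors                     # stretch/shear every column at once
    recovered = np.linalg.inv(A) @ transformed    # applying A inverse undoes A
    print(np.allclose(recovered, vectors))        # True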
And the dimension of this range, so this range may be the full space Rn or it may be a subspace, its dimension is called the rank of the matrix A, or of the associated transformation. Equivalently, that's going to be the dimension of the span of the columns of A. So if we take a look at the columns of A, it's the number of linearly independent columns of A, essentially. A very closely related concept is that if we take the set of all vectors in our original space such that A times x is equal to 0, this is called the null space of A. These are all the vectors that get mapped to 0 by this linear transformation. And the dimension of this space, which is a space that lives in our domain, in Rm, is the nullity of A. And a very important result in linear algebra, that unfortunately we don't have the time to really prove or do justice to, is that the rank of A plus the nullity of A is equal to m, the dimension of the space that we're mapping from.

And closely related to both of these concepts is a characterization of when it's possible to write down an inverse for a matrix A. So A is invertible, that is, it has an inverse, if and only if the linear transformation associated with it is one to one, so if every vector in the range is the image of exactly one vector in our initial space. And basically, because of the rank plus nullity statement above, we can equivalently say that a matrix A is invertible if the rank of A is equal to m, or if the nullity of A is equal to 0. And one immediate thing we should see here is that for any of this to really be possible, we need the n and m to be the same. If we're mapping from a vector space of one size to a vector space of a different size, we're never going to have this one-to-one mapping. So what that tells us is that only square matrices, only matrices that are n by n, are going to be invertible.

OK, so that's several characterizations of when we expect there to be an inverse. And again, we're not really getting into the computation, but that's OK. But none of these are really practical criteria that are easy to check. They're nice ideas, but how do we actually check if a matrix is invertible, what its rank is, or what its nullity is? Well, one way to do that is to use a kind of fundamental summary statistic for matrices called the determinant. The determinant of A is a number that shows up everywhere. It describes, in a hand-wavy way, how volumes are scaled under the linear transformation associated with the matrix A. But it has one really important property that relates to our question about inverting matrices. And so here I'll just show you a little example. There's an analytical formula for the inverse of a 2 by 2 matrix that's actually not too bad to write down. I wouldn't recommend trying to write it down for any bigger matrices, but a 2 by 2 we can do, and it's just on the right here. You'll see that in this formula, 1 over the determinant of A appears. OK, so this number, the determinant of A, is basically a polynomial in the entries of A. In general, it's a complicated polynomial, so again, I recommend we usually just use numerical algorithms to calculate this number. But in any case, it appears in these formulas. And so this might lead us to suspect that the inverse of A is not defined when the determinant of A is 0, because then this becomes 1 over 0.
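These quantities are all one-liners numerically; here is a sketch with one invertible and one singular matrix, both made up for illustration, plus the standard 2 by 2 inverse formula, which for A = [[a, b], [c, d]] reads A inverse = (1 / det A) * [[d, -b], [-c, a]].

    import numpy as np

    A = np.array([[2.0, 1.0],
                  [1.0, 2.0]])      # invertible example
    S = np.array([[1.0, 2.0],
                  [2.0, 4.0]])      # singular example: the second column is 2x the first

    print(np.linalg.det(A), np.linalg.det(S))                  # 3.0 and 0.0
    print(np.linalg.matrix_rank(A), np.linalg.matrix_rank(S))  # 2 and 1
    print(S.shape[1] - np.linalg.matrix_rank(S))               # nullity of S = columns - rank = 1

    # the 2x2 inverse formula, checked against NumPy's inverse
    a, b, c, d = A.ravel()
    A_inv = (1.0 / np.linalg.det(A)) * np.array([[d, -b], [-c, a]])
    print(np.allclose(A_inv, np.linalg.inv(A)))                # True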
And this is just kind of a hunch that we might develop, but it turns out to be a generally true fact: a matrix A is invertible if and only if the determinant of A is not equal to 0. And again, there's a nice formula for the determinant of a 2 by 2, but the formula quickly becomes quite complicated as we go to higher dimensions.

OK, and so now that we have the determinant in hand, we get to our final topic, which we're going to have to treat a little quickly, my apologies, and which is really one of the most important in linear algebra: the idea of eigenvalues and eigenvectors, the eigenvalue problem. The question we're trying to answer in the eigenvalue problem is: which non-zero vectors have their orientation unchanged under a linear transformation? So I take some vectors on the right here, and I apply some linear transformation; they get mapped into new vectors. A few of them, very special ones, don't change their direction; they are only scaled. So this vector that was originally pointing up gets rotated and stretched, but this vector that's following the 1, 1 line just got stretched and not rotated. And so the eigenvalue problem asks: which are these special vectors that just get stretched?

More symbolically, we have this eigenvalue equation, which is A times x equals lambda x. This lambda here is a scalar that we call an eigenvalue, and these x's are unknown vectors. So we want to figure out what solutions there are, what x's and lambdas make this equation true, that make it so that multiplication by A just returns the original vector we had, possibly scaled, stretched, but not rotated or reflected or anything like that. These unknown vectors are called eigenvectors, and the scalar here, lambda, is called the eigenvalue.

OK, so we can try to solve this problem. What we might naively do is say, OK, let's get all this stuff on one side of the equation, so we take our A x and subtract lambda x, and we get this equation on the second line, A x minus lambda x equals 0. And then what we might do is say, OK, let's factor out that vector x. And to do that, because we're doing matrix multiplication, we have to be a little bit careful: when we factor out the x, we need this scalar lambda to still be compatible with addition with the matrix A. So we get lambda times the identity matrix, so this A minus lambda times the identity matrix, times x. If we distribute this x, we get back the equation above. OK, so now this is really a matrix equation, sort of like the ones we looked at before: we have a matrix, A minus lambda identity, times a vector x, equals 0, which is a vector of 0s here, so it's just some target vector like we talked about before. But it's not quite as simple as before, because we notice that the vector x equals 0 is always a kind of trivial solution that solves this problem. And we're not interested in that trivial solution; we're interested in non-zero vectors. But if this matrix A minus lambda identity is full rank, then the discussion we just had tells us that, because 0 is a solution, it must be the only solution, because the mapping would be one to one.
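Here is the eigenvalue equation checked numerically for a small made-up matrix; np.linalg.eig returns the eigenvalues together with the corresponding eigenvectors as columns.

    import numpy as np

    A = np.array([[2.0, 1.0],
                  [1.0, 2.0]])

    vals, vecs = np.linalg.eig(A)
    print(vals)                              # the eigenvalues, here 3 and 1

    # for each eigenpair, A x equals lambda x
    for lam, x in zip(vals, vecs.T):
        print(np.allclose(A @ x, lam * x))   # True, True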
And so what that tells us is that if we want to find non-trivial solutions, non-zero x's, then we need this matrix A minus lambda identity to be singular, to be non-invertible, because we need the transformation to not be one to one. So that's actually kind of nice, because it tells us that what we need is for the determinant of A minus lambda I to be equal to 0. That's the criterion we just mentioned for a matrix being non-invertible, or singular. And this expression here, determinant of A minus lambda I equals 0, is called the characteristic equation for A. The left-hand side is a polynomial of degree n, which we call the characteristic polynomial. So again, for this determinant here there's a formula, and it's going to be a big degree n polynomial in the entries of the matrix. But now we can just apply some results from algebra: if we have a degree n polynomial, that equation is going to admit n roots if we count them with multiplicity, counting repeated roots potentially. So there are up to n distinct solutions, n distinct values of lambda. And each value of lambda comes with an associated x, an associated eigenvector, and we call these together an eigenpair.

But one important caveat is that we have to admit complex eigenvalues and eigenvectors, potentially. As soon as we have a degree n polynomial, even if it's just a quadratic or something, it might have complex roots, and we have to allow that. Geometrically, we can think about that as saying there's no guarantee that a linear transformation has real eigenvectors. One example: I mentioned that rotation is an example of a linear transformation. So a rotation would be, take a vector space and rotate every vector by 10 degrees. If we did that, it's fairly obvious that there's going to be no vector that doesn't have its direction changed. So in that case, there will only be complex eigenvalues and eigenvectors.

If our matrix A is already non-invertible, so matrix A on its own has a rank k less than n, then there are going to be zero eigenvalues. So basically there are going to be choices of vectors v, eigenvectors, such that A times v equals 0. If we go back to our characteristic equation, we don't even need the lambda times identity term to make the determinant equal to 0; there are just choices of eigenvectors that already satisfy that equation. So those are going to be eigenvectors with associated eigenvalues of 0, and the rank of the matrix will be equal to n minus the number of zero eigenvalues.

So this gets us to our final big laundry list of statements here, which is worth spending a little bit of time thinking about. But hopefully, at least very quickly, we've discussed the ways that all of these statements are equivalent: A being invertible, the columns of A forming a basis for Rn, the system of linear equations A x equals b having a unique solution for any b, the columns of A being linearly independent, the determinant of A being non-zero, the rank of A being n, the nullity of A being equal to 0, and A having no zero eigenvalues.

So I think we're just about out of time here. Yes, so we have five or 10 minutes more. OK, so one last thing: that's the eigenvalue problem in a nutshell. And I mentioned that the eigenvalue problem has potentially n distinct solutions, basically n distinct pairs of eigenvalues and eigenvectors that satisfy it.
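Two quick illustrations of these caveats, with made-up matrices: a rotation by 10 degrees has only complex eigenvalues, and a rank-deficient matrix has a zero eigenvalue whose eigenvector lies in its null space. The characteristic-polynomial route is shown as well (np.poly, given a square matrix, returns the coefficients of its characteristic polynomial).

    import numpy as np

    A = np.array([[2.0, 1.0],
                  [1.0, 2.0]])
    print(np.roots(np.poly(A)))       # roots of det(A - lambda I): [3. 1.]
    print(np.linalg.eigvals(A))       # the same eigenvalues, computed directly

    theta = np.radians(10)
    R = np.array([[np.cos(theta), -np.sin(theta)],
                  [np.sin(theta),  np.cos(theta)]])   # rotate every vector by 10 degrees
    print(np.linalg.eigvals(R))       # a complex-conjugate pair; no real eigenvectors

    S = np.array([[1.0, 2.0],
                  [2.0, 4.0]])        # rank 1, so one eigenvalue is 0
    vals, vecs = np.linalg.eig(S)
    v0 = vecs[:, np.argmin(np.abs(vals))]   # the eigenvector paired with the 0 eigenvalue
    print(np.allclose(S @ v0, 0))     # True: it sits in the null space of S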
And so we can imagine collecting all of those eigenvectors as the columns of a matrix Q, shown down here, and all of their associated eigenvalues in a diagonal matrix, lambda. So again, this diagonal matrix is just a matrix that has lambda 1, 1 equal to the first eigenvalue, lambda 2, 2 equal to the second eigenvalue, and so on, with all the off-diagonal elements equal to 0. So we can form these two matrices, and if we form these two matrices, we now have this bigger matrix equation, A times Q equals Q times lambda. Unfortunately, I don't think we have time to really walk through the multiplication of these things, but this is a way to simultaneously write all of the solutions to the eigenvalue problem. So that formulation we had before, A x equals lambda x, this is a way to write all n solutions to that at once. And we can see that if this matrix Q is invertible, if it has an inverse, then we can multiply by that inverse to get, down below, A is equal to Q times this matrix big lambda times Q inverse. Again, this is only possible when Q is invertible, and that's only possible when one of those many characterizations holds true. What it really means, or one thing it means, is that we must have n linearly independent eigenvectors. When that holds true, we can write A in this way.

And what this tells us is that the matrix A is completely specified by its eigenvalues and eigenvectors; it can be written solely in terms of its eigenvalues and eigenvectors. So this question that we started with, which looks like kind of a funny question, ends up giving a complete characterization of a matrix. If we know all of the eigenvalues and eigenvectors, we know the matrix, we know everything about it. But the eigenvalues alone are often highly informative about a matrix. The set of eigenvalues is called the spectrum of A. And in many cases, for example in biology, in looking at different models, dynamical systems, the eigenvalues of A alone contain a lot of rich information about A.

So an important example that I'm sure you will see in these talks, if you haven't seen it before, and this is just to give you a little hint of the value of thinking about these eigenvalues and eigenvectors, is that if we have a matrix difference equation, like x t equals A times x t minus 1 plus b, so basically a system where at each new time we get the state of the system by multiplying the old state by a matrix, this kind of system converges to an equilibrium if and only if the maximum absolute value of the eigenvalues of the matrix, which is called the spectral radius, is less than 1. Very closely related to this, the matrix differential equation dx dt equals A x plus b, which is again a very general, very useful model, converges to a steady state if and only if all the eigenvalues of A have negative real parts. So these are cases where just knowing these n numbers is totally sufficient to tell us about the behavior of a system that really has n squared parameters, that's governed by this whole matrix. But these eigenvalues are actually an incredibly rich source of information about it.

All right, so I'll stop there. Apologies. No, no, no. I don't know if there's time for any questions. Yes, I think we have like two or three minutes for questions. And then we can take a short break before the next lecture. So please, if you have any question, you know how to ask it: either post it in the chat or raise your hand.
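A last sketch tying the diagonalization and the stability claims together; the matrix A and vector b are made up, chosen so that the spectral radius is below 1.

    import numpy as np

    A = np.array([[0.5, 0.2],
                  [0.1, 0.4]])
    b = np.array([1.0, 1.0])

    # diagonalization: A = Q diag(lambda) Q^{-1}
    vals, Q = np.linalg.eig(A)
    print(np.allclose(A, Q @ np.diag(vals) @ np.linalg.inv(Q)))   # True

    # spectral radius < 1, so x_t = A x_{t-1} + b converges to a fixed point
    print(np.max(np.abs(vals)))          # 0.6, the spectral radius
    x = np.zeros(2)
    for _ in range(200):
        x = A @ x + b
    print(np.allclose(x, np.linalg.solve(np.eye(2) - A, b)))      # True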
So there was a question a little bit back on the eigenvalues, asking if the scaling factor that appeared in the equation was the eigenvalue. Yeah, you can ask the question. Yes, hi. In one of your slides, you showed that the direction remains the same and only the scaling of the vector changes, and that's for the eigenvalue, you showed that. Yeah, so like in the slide that I have here, these vectors. Lambda is the scaling factor here? Yes, exactly. Yeah, so the lambda is the eigenvalue; the lambda is the scaling factor. So just for example, in this picture here, this vector along the one, one line starts as two, two and becomes three, three. So one of the eigenvalues of this matrix is three halves: these vectors become scaled by three halves when we multiply by A.

There is another question by Augusto saying, how do we determine the eigenvalues of a rectangular system, if it is even doable? That's an excellent question. So yeah, I actually kind of omitted that. Right, for this interpretation that I gave, where this vector has its direction unchanged under multiplication by A, that really implies that the matrix must be square. If this vector on the left has a different size than the one on the right, then it doesn't even really make sense to talk about the orientation being unchanged. But that said, there is an analog to the idea of an eigenvalue for a rectangular matrix. And that is, if we take now the matrix A transpose A, this matrix, which we can think of as kind of like a covariance matrix, a matrix related to A, will now be square. And being square, it will have eigenvalues. So we can look at the eigenvalues of this matrix, which is related to A, and the square roots of these eigenvalues are called the singular values of A. The theory of the singular values of A is a little bit more complex, and we don't really have time to get into it, but they tell us much of the same information as the eigenvalues would. So yeah, if you're interested in learning more about that problem, the thing to look into is the singular values of A, or the singular value decomposition, which in many ways is analogous to the eigenvalue problem. Thank you.

Well, I think, let's answer the last question, by Basmita, and then I think we can take a short break, stretch legs, get a cup of coffee before the next lecture. So Basmita asks, what happens if we get an eigenvalue equal to zero? What will be its geometric representation? Yeah, so an eigenvalue equal to zero corresponds to a direction where multiplication by A just maps us to zero. So we can think in this picture that if the eigenvalue for, say, this direction here were zero, then when we multiply by A, the vector here would just become zero; we would just map to the origin. So these are directions where multiplication by A just kills the vector that's there. And what this tells us is that x is in the null space of A. So hopefully that gives some sense of the geometry. Okay, great. So I think it is a good time to stop. Thanks a lot, Zach, very much for the very nice tutorial, which, again, will be available on YouTube for the next generations to come.
So thanks again also for doing that very early in the morning. And of course, what we're going to do now is to take a four-minute break. We're going to be again randomly assigned to breakout rooms, so feel free to chat with whoever you are assigned to, or to take a break and stretch your legs, and we'll be back in four minutes. Thank you very much. Thank you. Zach, good morning. Hi, Stefano. Hey, Stefano, good job. I caught only the last few bits, but thank you for doing that. Yeah, my favorite part. That's great. Just to mention that we are live streaming on YouTube.