Welcome back, everyone. Today we're going to continue our lecture series based upon the textbook Linear Algebra Done Openly, and we're in Section 4.4, where we'll talk about affine transformations. As always, I will be your professor, Dr. Andrew Misseldine, at Southern Utah University. Now, we've talked in the past about linear transformations; those transformations preserve vector addition and scalar multiplication. Today we're going to talk about a generalization of linear transformations known as affine transformations. Before we do that, I want to make a connection between the inner products we've been talking about in the previous three lectures and angles and geometry, namely the so-called law of cosines in the context of vector geometry. Whenever you have two vectors, let's say u and v, there's always a triangle associated to them, and we've seen this before. Take the vector u and the vector v, placed with their tails together. Once you have these two vectors, you can form a triangle by connecting their heads; the side pointing from v to u is the vector u - v. So we always get this triangle, and I'm interested in the angle formed between the two vectors; we'll call the measure of that angle theta. Now, in a standard trigonometry class one often talks about the law of cosines, which can be thought of as a generalization of the Pythagorean theorem. If we have an oblique triangle, one where the angles are not necessarily right angles, we get something like the Pythagorean theorem: the length of the side u - v squared, the side opposite the angle, equals the length of u squared plus the length of v squared, minus a correcting factor of two times the length of u times the length of v times the cosine of theta. Of course, if our angle is a right angle, notice that theta equal to 90 degrees makes cosine of 90 equal to zero, so for a right triangle the correcting factor cancels off and the law of cosines degenerates to the usual Pythagorean equation. Now, if you remember the proof we saw earlier of the Pythagorean theorem in the context of vector spaces, a term like this crept up inside that proof. If you carry through a similar argument, you can actually show that these two things are equal to each other: u dot v equals the length of u times the length of v times cosine of theta. That dot product is exactly the correcting factor that falls out when you expand the norm of u - v. And if we make the connection to right angles: if cosine of theta equals zero, then the right-hand side equals zero, which forces the inner product u dot v to equal zero as well; that is to say, u is perpendicular to v. So this idea of orthogonality of vectors is equivalent to the perpendicularity you might know from a previous geometry class, and this really is a generalization of the law of cosines one might have seen previously.
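In symbols, the comparison goes like this; here's a short derivation sketch, assuming a real inner product space and just expanding the squared norm with bilinearity:

```latex
\|u - v\|^{2} \;=\; \langle u - v,\, u - v \rangle \;=\; \|u\|^{2} - 2\,(u \cdot v) + \|v\|^{2},
\qquad
\|u - v\|^{2} \;=\; \|u\|^{2} + \|v\|^{2} - 2\,\|u\|\,\|v\|\cos\theta
\;\;\Longrightarrow\;\;
u \cdot v \;=\; \|u\|\,\|v\|\cos\theta .
```

The first line is the expansion from the inner product axioms, the second is the trigonometric law of cosines, and matching the two gives the identity that u dot v equals the length of u times the length of v times cosine of theta.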
This equation can also help us compute angles between vectors, because if we solve for cosine theta, you get that cosine theta equals u dot v over the length of u times the length of v, and if you take the arc cosine, you can actually calculate the angle between the vectors. I want to do that in a two-dimensional example. Take the vector u to be (6, -1) and take the other vector v to be (1, 4); these live inside of R2. By the previous formula, theta, the angle between the two vectors, will be the arc cosine of u dot v over the length of u times the length of v. If we calculate these things, the dot product of the vectors gives 6 minus 4 on top; for the lengths you get the square root of 36 plus 1 for u and the square root of 1 plus 16 for v. Simplifying a little, you get the arc cosine of 2 over the square root of 37 times the square root of 17, and since those aren't perfect squares we can't simplify them any more. We'll probably want a calculator to help evaluate this expression. Using a usual scientific calculator, with mine in degree mode, this comes out to approximately 85.43 degrees; your answer will look a little different if it's in radians. Inverse cosine always outputs an angle, and the ratio you feed it will always land between negative 1 and 1. So these two vectors are almost at a right angle, about 5 degrees off, and that simple calculation can give us the angle between any two vectors.

All right, I want to introduce another definition. Suppose we have a square matrix, and let's make this first one a real matrix, just real numbers. A real square matrix U is called orthogonal if it satisfies the relationship that U transpose times U equals the identity; the transpose of the matrix times the original matrix gives you the identity. If you look at that statement a little differently, U transpose times U equals the identity says that U transpose is the inverse of the matrix, and since left and right inverses are the same, this also tells us that U times U transpose equals the identity. That is what we call an orthogonal matrix. You can kind of see why someone might be interested in an orthogonal matrix. Remember how we did matrix inversion in the past: if you wanted to find the inverse of a non-singular matrix U, you would augment that matrix with the identity and then row reduce so that U becomes the identity, at which point the identity becomes U inverse. Because a lot of row operations take place in the middle, that can be an expensive procedure. In comparison, taking a transpose is essentially free when it comes to complexity, so finding the inverse of an orthogonal matrix is a cinch, assuming you have an orthogonal matrix. The complex counterpart of an orthogonal matrix is what we call a unitary matrix. As usual, whenever we talk about complex matrices, we never use the plain transpose; that's bad news for complex matrices. Instead, we want the conjugate transpose. So a unitary matrix is a complex matrix whose conjugate transpose is equal to its inverse.
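For the curious, here's a minimal numpy sketch of the angle calculation from the worked example earlier in this section, the one with u = (6, -1) and v = (1, 4); the helper name angle_between is mine, not something from the text:

```python
import numpy as np

def angle_between(u, v):
    """Angle between two vectors, in degrees, via arccos(u.v / (|u| |v|))."""
    ratio = np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))
    ratio = np.clip(ratio, -1.0, 1.0)  # guard against rounding pushing the ratio just outside [-1, 1]
    return np.degrees(np.arccos(ratio))

u = np.array([6.0, -1.0])
v = np.array([1.0, 4.0])
print(angle_between(u, v))  # roughly 85.43 degrees
```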
Now, why the name orthogonal matrix? Well, it comes from this theorem right here: a real square matrix is orthogonal, or a complex square matrix is unitary, if and only if the column vectors of the matrix form an orthonormal set. Remember, an orthonormal set means that every vector is a unit vector and that any two distinct vectors from the set are orthogonal, that is, their dot products are zero. So a matrix is orthogonal if and only if its column vectors form an orthonormal set, and that's why we call these things orthogonal matrices. Now, because of this, people ask: why do we call it orthogonal? Sure, the set of column vectors is orthogonal, but it actually has to be orthonormal for the matrix to be orthogonal, so why aren't these called orthonormal matrices? That objection makes a lot of sense, and some authors of linear algebra textbooks insist on calling these orthonormal matrices for exactly that reason, but the orthogonal label is used more commonly, so I think it would do students a disservice to use a different term here. So we have orthogonal matrices and unitary matrices; there's no confusion there. Now, whenever a matrix is orthogonal, since its inverse is its transpose, the transpose of every orthogonal matrix is itself an orthogonal matrix, and likewise for the conjugate transpose when we're talking about unitary complex matrices. Because of that, it's also true that a matrix is orthogonal if and only if its row vectors likewise form an orthonormal set. So you can look at columns or rows, and these will be orthonormal sets.

I want to give you an example of this type of thing. Here's a three by three orthogonal matrix; this is a real matrix. If you look at this thing, you might wonder: what the heck, why are all these square roots showing up in this matrix? Square root of 11, square root of 6, square root of 66, what's going on here? Well, remember, to be an orthogonal matrix we need the columns to form an orthonormal set, so each of them has to be a unit vector. If you were to take the vector (3, 1, 1), the one without the square root of 11 in it, and take its norm, you're going to get the square root of 3 squared, which is 9, plus 1 squared, which is 1, plus 1 squared, which is 1. Okay, I get it: 9 plus 1 plus 1 is 11, so the norm is the square root of 11. So 3 over root 11, 1 over root 11, 1 over root 11 is just the normalization of the vector (3, 1, 1). It's a similar thing for (-1, 2, 1): if you take that vector, its length is the square root of 6. And if you take (-1, -4, 7), the length of that vector is the square root of 66. So these columns are normalizations. Are they orthogonal? Well, take the dot products, ignoring the normalizations for now. If you take the dot product of (3, 1, 1) with (-1, 2, 1), you end up with negative 3 plus 2 plus 1, which is 0, so that is in fact an orthogonal pair. And be aware that taking a scalar multiple of orthogonal vectors is not going to change the orthogonality, so the normalized vectors will be orthogonal as well. If you dot vector 1 with vector 3, that's 0, and if you dot vector 2 with vector 3, that's also 0. So you can check that this set right here is an orthonormal set, and that's one way of showing the matrix is orthogonal.
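If you'd like to check that orthonormality claim numerically, here's a quick numpy sketch; the three columns are my reconstruction of the example as described, namely (3, 1, 1) over root 11, (-1, 2, 1) over root 6, and (-1, -4, 7) over root 66:

```python
import numpy as np

# Columns reconstructed from the example, each one normalized to unit length.
c1 = np.array([3.0, 1.0, 1.0]) / np.sqrt(11)
c2 = np.array([-1.0, 2.0, 1.0]) / np.sqrt(6)
c3 = np.array([-1.0, -4.0, 7.0]) / np.sqrt(66)

# Orthonormal set: every vector has unit length, and distinct vectors have dot product zero.
for c in (c1, c2, c3):
    print(np.isclose(np.linalg.norm(c), 1.0))   # True, True, True
for a, b in ((c1, c2), (c1, c3), (c2, c3)):
    print(np.isclose(np.dot(a, b), 0.0))        # True, True, True
```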
The other way to show that it's orthogonal is simply to use the definition: take U transpose times U and show that this equals the identity. I'm not going to do that calculation here; I'm actually going to encourage the viewer to do it. So pause the video right now and check that U transpose times U gives you the 3 by 3 identity, 1s on the diagonal and 0s everywhere else. Check that for us right now. I next want to give us an example of a unitary matrix. A unitary matrix is the case where U star times U equals the identity; the entries are complex numbers, and we take the conjugate transpose here. So again, pause the video right now and double check that this matrix is unitary by the definition. Another way to show that it's unitary is to check that its columns form an orthonormal set. Now, with complex numbers you have to be a little bit more careful: remember, when you take the Hermitian product, you have to take conjugates. So take the first vector; let's show that it's a unit vector. For the length, you can factor out the one half, because it's a real number and the conjugate doesn't do anything to it. So if you take (1 + i, 1 - i) and compute its length, you're going to get one half times the square root of, well, 1 + i times its conjugate is a 2, and 1 - i times its conjugate is also a 2. So you end up with one half times the square root of 4, which is 2, and 2 over 2 equals 1. So the first vector is a unit vector, and a similar calculation shows the second one is a unit vector as well. Going back here, if you want to take the inner product of the two columns, the one half times one half gives you one fourth, and then you're dotting (1 + i, 1 - i) with (1 + i, -1 + i). Now, as these are complex numbers, don't forget to take the conjugate of the first factor. So we're going to get one fourth times the quantity: 1 minus i times 1 plus i, plus the conjugate of 1 minus i, which is 1 plus i, times negative 1 plus i. If we work out the details, 1 minus i times 1 plus i foils out to 2, while 1 plus i times negative 1 plus i foils out to negative 2, so this thing adds up to zero when you're done. So you can see that this is an orthonormal set of vectors, and the matrix is what we call a unitary matrix.
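Here's a similar hedged sketch for the unitary example, with the matrix reconstructed from the two columns discussed above, (1 + i, 1 - i) over 2 and (1 + i, -1 + i) over 2:

```python
import numpy as np

# The 2 x 2 complex matrix as reconstructed from the example: U = (1/2) [[1+i, 1+i], [1-i, -1+i]].
U = 0.5 * np.array([[1 + 1j,  1 + 1j],
                    [1 - 1j, -1 + 1j]])

# Unitary means U* U = I, where U* is the conjugate transpose.
print(np.allclose(U.conj().T @ U, np.eye(2)))  # True
```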
All right, why do people care about orthogonal or unitary matrices whatsoever? I want to talk about what multiplication by an orthogonal or unitary matrix does geometrically. Suppose we have a unitary or orthogonal matrix U; basically, I can interchange the words orthogonal and unitary here with no consequence, orthogonal for real matrices and unitary for complex ones. If we take any two vectors x and y that live inside of our vector space Fn, it turns out that the inner product of x and y is identical to the inner product of Ux and Uy. So if you take the vectors x and y and multiply them by a unitary matrix, Ux dot Uy will equal x dot y. Matrix multiplication corresponds to linear transformations, so multiplying by a unitary or orthogonal matrix, since it's linear, preserves vector addition and scalar multiplication. But orthogonal and unitary matrices have the extra property that they preserve inner products: inner products before the map are the same as the inner products afterwards. The reason this is significant is that because we preserve inner products, we also preserve everything defined in terms of inner products. Like we started off this lecture with, angles between vectors will be preserved, because we can compute the angle from an inner product. Distances will also be preserved, because the distance between two vectors, say x and y, equals the length of x minus y, which is, remember, the square root of the dot product of x minus y with itself. So anything we define with a dot product will be preserved by orthogonal and unitary matrices. Multiplication by an orthogonal matrix doesn't affect distances, it doesn't affect angles, and it doesn't affect norms. We also have the property that the norm of Ux equals the norm of x; the length of a vector doesn't change when you multiply it by an orthogonal or unitary matrix. This is actually a pretty useful fact, and the proof is pretty easy to see. The idea is the following: start off with the left-hand side, Ux dot Uy. This equals the transpose of Ux times Uy; I'm going to assume these are real, and if they were complex you'd switch the transpose to a star. By the reverse-order property of the transpose, the socks-and-shoes rule, you get x transpose, U transpose, U, y. Since U is orthogonal, U transpose times U is the identity, so this just becomes x transpose times y, which is the same thing as the inner product x dot y. And like I said, for complex matrices we change the appropriate parts and get the same argument. Orthogonal matrices don't change inner products. A consequence of this: if we have two bases for our space Fn, call them B and C, and these are two orthonormal bases, then the associated change of basis matrix, where you change from C to B, will also be an orthogonal or unitary matrix.

So, continuing with what we were talking about earlier, that multiplication by an orthogonal matrix doesn't change angles or distances, can we generalize that principle a little bit? Let's define what's known as an isometry, sometimes called a rigid motion. An isometry is a function on our vector space, from Fn to Fn, such that the distance between Tx and Ty is the same as the distance between the original vectors x and y. So isometries don't change distances between vectors; they preserve distances entirely. If you look at the word isometry itself, it comes from the Greek and would translate as same measure, same distance, and that's exactly what an isometry does: it preserves the distances between vectors. Now, from what we just talked about a moment ago, multiplication by an orthogonal matrix preserves distances, so multiplication by an orthogonal matrix is an example of an isometry. Well, what else can preserve distances? These isometries are motions, transformations of the plane or of your vector space, that don't distort anything: the shape, the angles, the distances are all nicely preserved. All right, so orthogonal matrices preserve distances. What else does?
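To see those preservation claims in action, here's a small numpy sketch using a rotation matrix, which is orthogonal, and two test vectors of my own choosing:

```python
import numpy as np

theta = 0.7  # any angle works; every rotation matrix is orthogonal
Q = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

x = np.array([2.0, -1.0])
y = np.array([0.5,  3.0])

print(np.isclose(np.dot(Q @ x, Q @ y), np.dot(x, y)))                   # inner products preserved
print(np.isclose(np.linalg.norm(Q @ x), np.linalg.norm(x)))             # norms preserved
print(np.isclose(np.linalg.norm(Q @ (x - y)), np.linalg.norm(x - y)))   # distances preserved
```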
Another example of an isometry would be that of a translation. Take a vector b inside your vector space; the translation associated to the vector b is the map from Fn to Fn such that x maps to x plus b. Geometrically, what's happening is the following: you have your vector x right here, and you have some translation vector b, which you just add to it, so x gets translated to x plus b. It has the effect that if you take any point in the plane, it translates that point to a new point by this constant vector b. That's what we mean by a translation. Now, I should mention that translations are not linear transformations; let's make a comment about that for a second. A translation is not linear, at least in general. I think I was going to talk about that later, but I'll just make the comment here: translations are not linear when the translation vector b is not zero. The issue is that if you apply the translation to the zero vector, it maps to 0 plus b, which of course equals b, which is not zero in this case. Every linear transformation maps zero to zero; translations don't do that, so translations are different from linear transformations. But they are going to be isometries. How do we know that? The idea is basically the following: the distance between two vectors x and y is the length of x minus y. On the other hand, if you take the distance between Tx and Ty, that's the length of the quantity x plus b minus the quantity y plus b, and you can see that the b's cancel out; the translation cancels and you get the same thing. So translations are isometries, and multiplication by an orthogonal matrix is an isometry, and we're going to see at the end of this lecture that essentially every isometry is just a combination of these two.

Before we do that, let's talk about the titular topic for today, affine transformations. We can generalize the notion of a linear transformation the following way. We have a map T from a vector space Fn to a vector space Fm, and we call it an affine map, an affine transformation, if there exists a matrix A, an m by n matrix corresponding to these dimensions right here, and a vector b, which lives in Fm, the codomain of the map, such that the function T maps x to the vector Ax plus b. So x is a vector in Fn, multiplication by A takes it from Fn to Fm, and then you add a vector from Fm to that, and this gives you the transformation Ax plus b. Now, we've seen that linear maps send x to just Ax, where A is the standard matrix of the linear transformation. So these affine transformations are essentially doing the same thing; to get something affine, you just add on this extra translation vector. In general, your affine map won't be linear if the translation is non-zero; an affine map is linear if and only if the translation is zero.
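As a quick sketch of both points, that translations are isometries but not linear, here's a numpy version with a translation vector b of my own choosing; notice it's just the affine map whose matrix part is the identity:

```python
import numpy as np

b = np.array([1.0, -2.0, 3.0])  # an arbitrary translation vector

def translate(x):
    # The affine map x -> A x + b with A equal to the identity matrix.
    return x + b

x = np.array([2.0, 0.0, -1.0])
y = np.array([5.0, 4.0,  1.0])

# Not linear: a linear map must send 0 to 0, but the translation sends 0 to b.
print(translate(np.zeros(3)))  # [ 1. -2.  3.]

# Still an isometry: the b's cancel in the difference, so distances are unchanged.
print(np.isclose(np.linalg.norm(translate(x) - translate(y)), np.linalg.norm(x - y)))  # True
```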
So I want to take a look at an example of this. Let's consider the affine map from R3 to R3 associated to this matrix and this translation vector right here, and let's take the image of (2, 0, -1). What this means is we take this matrix right here, multiply it by the vector (2, 0, -1), and add to it the translation vector we have right here, like so. So what happens? If we go forward with the multiplication: the first row, (3, -1, 1), times (2, 0, -1) gives 6 plus 0 minus 1; the second row gives negative 6 plus 0 plus 0; and the third row gives 12 plus 0 minus 2. We add to this the vector (1, 2, 3). If we simplify the matrix product, we get (5, -6, 10); add to that the vector (1, 2, 3), and the final arithmetic, combining the two vectors, gives (6, -4, 13). So the image of (2, 0, -1) under this affine map is (6, -4, 13). Just a basic calculation one can go through here; computing images just adds an extra step of translation.

Another question: take the vector (2, -2, 4). Is it inside the image of T? Even though we're talking about an affine transformation now, the notion of image makes sense for any function. Is it in the image? What that means is we have to solve the equation Ax plus b equals the vector (2, -2, 4). Can we solve this? Well, if you subtract the translation vector from both sides, this comes down to the matrix equation Ax equals (2, -2, 4) minus b. More specifically, we have to reduce an augmented matrix whose coefficient part is just A: the first row is 3, -1, 1; the second row is -3, 2, 0; and the third row is 6, -3, 2. Then in the final column you take the target vector and subtract b from it: 2 minus 1 is a 1, negative 2 minus 2 is a negative 4, and 4 minus 3 is a 1. So it really just comes down to row reducing this augmented matrix, and if we do that, you'll find that the matrix A is actually non-singular, it row reduces to the identity, and what you end up with is (2, 1, -4). So the answer to the original question is yes: our vector y is in the image of T, and we can see this because T of (2, 1, -4) equals y. I'll let you verify that fact; pause the video if you need to, to double check the multiplication. So we can use a system of linear equations to help us answer questions about whether something is inside the image. Kernels don't really make much sense for affine transformations, because the kernel would be the set of all vectors that map to 0, and because of the translation that set isn't a subspace anymore; in this example, since A is non-singular, there's really just one vector that maps to 0. One could talk about it, but as with the situation right here, it all comes down to the matrix A: you're solving the equation Ax equals the original vector y minus b. So if you're asking whether the map is one-to-one, since you have to solve the system with augmented matrix A and y minus b, it really depends on the matrix A. And if you're curious whether it's onto, again, it depends entirely on this matrix A, that is, on that system of equations.
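Here's a hedged numpy version of that whole worked example; the entries of A are my reconstruction from the row-by-row arithmetic above, and b is the translation vector (1, 2, 3):

```python
import numpy as np

# A and b as reconstructed from the worked arithmetic in the example.
A = np.array([[ 3.0, -1.0, 1.0],
              [-3.0,  2.0, 0.0],
              [ 6.0, -3.0, 2.0]])
b = np.array([1.0, 2.0, 3.0])

def T(x):
    return A @ x + b

print(T(np.array([2.0, 0.0, -1.0])))   # [ 6. -4. 13.]

# Is y = (2, -2, 4) in the image?  Solve A x = y - b.
y = np.array([2.0, -2.0, 4.0])
x = np.linalg.solve(A, y - b)          # works here because A is non-singular
print(x)                               # [ 2.  1. -4.]
print(np.allclose(T(x), y))            # True
```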
So, much like how we did it with linear transformations earlier in this course: if we want to show it's onto, let y be a generic vector, subtract b from it, and solve the system of equations. And really, that depends entirely on A. The map will be onto if we have a pivot in every single row of A; that makes the affine transformation onto. What about one-to-one? Will there be multiple solutions? It will have multiple solutions only if there are non-pivot columns in the matrix, because non-pivot columns give us free variables. So the affine transformation will be one-to-one if A has a pivot in every column. And you can see that with this matrix A, which is non-singular, the associated affine transformation is both one-to-one and onto, because there's a pivot in every row and every column. So in some respects, affine transformations are very much like linear transformations, right? You take x and you map it to Ax plus b, and it will be linear if and only if the translation vector b is zero.

Now, it turns out you can extend your space Fn into the larger space Fn+1, and with this perspective every affine transformation can be viewed as a linear transformation in a larger vector space. The idea is the following. Take the augmented matrix A augment b, where A is your coefficient matrix for the affine transformation and b is your translation vector, and then add an extra row: on the coefficient side you add a bunch of 0s, and on the augmented side you add a 1. So if A is an m by n matrix, this enlarged matrix is an m plus 1 by n plus 1 matrix. And if you take this matrix and multiply it by the vector (x, 1), so you just add an extra 1 at the bottom, then by matrix multiplication you'll end up with Ax plus b times 1, which is Ax plus b, and the second block row just gives you a 1. So there's this extra 1 that sticks on the bottom the whole time, but you can actually do affine transformations as matrix multiplication; you just need this extra row that acts as a placeholder. Because of that, this augmented matrix right here is what you refer to as the standard matrix of the affine transformation. It's a little bigger than the dimensions of the vector spaces because you need this extra space. So if we look at the example we did before, you'll notice here is the exact same matrix A we had in the previous example, and here's the same translation vector b; we just added this extra row, all zeros and then a 1. Don't panic and think, oh no, this system's inconsistent: the line there is for organizational purposes, the matrix A is over here and the translation b is right here, and we're not trying to solve a system of equations.
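Here's a short numpy sketch of that standard-matrix construction for the same example, again with A and b reconstructed as above:

```python
import numpy as np

A = np.array([[ 3.0, -1.0, 1.0],
              [-3.0,  2.0, 0.0],
              [ 6.0, -3.0, 2.0]])
b = np.array([1.0, 2.0, 3.0])

# Standard matrix of the affine map: the block matrix [[A, b], [0, 1]], of size (m+1) x (n+1).
M = np.block([[A, b.reshape(-1, 1)],
              [np.zeros((1, 3)), np.ones((1, 1))]])

x_hat = np.array([2.0, 0.0, -1.0, 1.0])  # the input (2, 0, -1) with an extra 1 appended
print(M @ x_hat)  # [ 6. -4. 13.  1.]  (first three coordinates are A x + b)
```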
Now, if you take this standard matrix and multiply it by the vector (2, 0, -1, 1), that is, the input (2, 0, -1) from before with an extra 1 appended, and do the matrix multiplication, you end up with the vector (6, -4, 13, 1). I'll let you double check that; pause if you need to. If you look at just the first three coordinates, you end up with (6, -4, 13), exactly what we computed before. So affine transformations can be turned into linear transformations if we enlarge the vector space, and affine transformations are linear transformations plus a translation. In some respects affine maps generalize linear maps, and through this enlargement the linear theory captures the affine one, so the two notions are very closely related to each other.

All right, to summarize what we've been talking about today, we get to our last theorem for this section, the Mazur-Ulam theorem, which connects the notion of isometries with the orthogonal matrices we saw before. Remember, an isometry is a function on a vector space for which distance is preserved. If a map T is an isometry, then it turns out that it can be written as an affine transformation, T of x equals Ux plus b, where b is some translation vector and U is an orthogonal or unitary matrix. So every isometry is an affine transformation whose matrix part is orthogonal or unitary. One can then use this idea to characterize the isometries of the plane, that is, of R2, and it's actually pretty impressive: one can show that in the plane there are only four types of isometries. One is translation, for which U would just be the identity. There's rotation around a point in the plane and reflection across a line in the plane; we've talked about rotations and reflections in the past. If you want to rotate around the origin, you just take U to be a rotation matrix, which is orthogonal, and you take b to be zero. And if you want to reflect across, say, the x-axis, you take U to be a reflection matrix like we did in the past, and then you set b to be zero. If you want to rotate around a point other than the origin, or reflect across a line that doesn't go through the origin, you do have to factor a translation into that. And then the last possibility is something called a glide reflection. A glide reflection is kind of like footprints in the sand, so you get this alternating picture. A glide reflection is a type of isometry of the plane formed by taking a translation and combining it with a reflection; if the axis of the reflection is parallel to the translation, you get these glide reflections. And it turns out there are only four types of rigid motions of the plane. It gets a little bit more complicated the more dimensions you have, because in three dimensions you have glide reflections, rotations, regular reflections, translations and such, but you also get things like screw motions, where you rotate around a line while translating along it, kind of like driving a screw into wood, and there are some other interesting things there. I encourage you to look it up online if you want to; it's kind of a fun little topic.
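As a small illustration of that Ux plus b form, here's a numpy sketch of a glide reflection, with U reflecting across the x-axis and a translation vector b of my own choosing, parallel to that axis; since U is orthogonal, the map preserves distances:

```python
import numpy as np

# A glide reflection in the plane written as x -> U x + b:
# U reflects across the x-axis, b translates parallel to that axis.
U = np.array([[1.0,  0.0],
              [0.0, -1.0]])
b = np.array([2.0, 0.0])

def glide(x):
    return U @ x + b

x = np.array([1.0, 3.0])
y = np.array([-2.0, 0.5])

print(np.allclose(U.T @ U, np.eye(2)))  # U is orthogonal
print(np.isclose(np.linalg.norm(glide(x) - glide(y)), np.linalg.norm(x - y)))  # distances preserved
```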
That actually concludes Section 4.4. Thank you for listening. If you have any questions or comments, please post them in the comments below. And if you haven't already done so, feel free to subscribe to this channel so you can get further updates about linear algebra and other mathematical lectures that I have here at Southern Utah. If you actually want to take a look at the book Linear Algebra Done Openly, look at the comments; there's a link in the comments below. I'll see you next time. Bye.