So, today we will go with two topics simultaneously because they are inseparable to some extent. We will continue developing and finishing up our matrix algorithms, but while doing that we will learn how to do modular coding by using functions. We have already snuck in a bit of function use in earlier lectures, but now we start using functions in earnest and learn how to pass parameters, how to return values, all the important details. So, to recap, we had this first approach toward computing determinants using Gaussian elimination. And generally speaking, the elimination process itself is very easy: you just zero out all the entries below the diagonal. The way in which different algorithms differ a little bit is how they do the pivoting in case they get in trouble with the diagonal element. And there are two styles. One is the easy style I started out with, where you start with a matrix like this, and then on the second row you cancel out the first entry, on the third row you cancel out the first two, and you proceed like that. But lower rows have everything potentially non-zero. And the second approach is to start with the first row and zero out the first column. That one is more frequently taught because it is a little more resilient to pivoting problems. In general, pivoting is more complicated, and there are situations where one pivoting algorithm will fail and another won't. And in general, to have the most powerful pivoting mechanism available, you need to be prepared to permute both rows and columns, but we won't look at that in this course. If you're interested, you can go to Wikipedia and read up about what's called LUP decomposition. So ideally, when you fail to pivot in your code, you should be able to say that no one can pivot any better than you: that the matrix inherently has a zero or very small determinant, and it's not a limitation of your code. And to be able to say that, you need to implement the fully generic LUP decomposition.
So that we won't talk about, at least today. So just to recap what we did: you start with a matrix like that, and our current goal is to take this yellow row and cancel out the first column of it. And you do that by looking for a factor of the first row which, when subtracted from the second, will result in A10 going to zero. So you do that: you subtract that multiple of the entire first row from the second row. And then you get a modified A11, and A10 has turned into a zero, which is shown by the white cell. And then you attack row 2, which is now colored in yellow. And now you want to cancel out A20 and A21. The factor required for A20 is very clear. That's a similar computation as before: F0 equals A20 divided by A00. And then you solve this simple equation to decide what the factor F1 should be. And you keep doing that. After the first two steps, this small triangular area has turned into white. And now you continue with the yellow row: zero out the first three columns, and continue like that. So we saw the code sketch. There's nothing particularly complicated about it except for a little care with the indices, fine? And then once the matrix becomes entirely upper triangular, you can just multiply the diagonal entries to find the determinant. Now the only difficulty here is the pivoting part. So we currently handle this by swapping rows. And any time you swap rows, you have to adjust the sign of the determinant. More advanced pivoting methods exist. And sometimes the determinant may be zero, which means all pivoting attempts, no matter how sophisticated, will fail. So the standard algorithm that you are usually taught is slightly different, because it has slightly better pivoting properties. There we start with the first row, A00 and onwards, and our goal is to zero out the first column, now shown in yellow. So you proceed column by column. And so the loop looks like this: for Rx equal to 1 through the number of rows, that is, row 1, the second row, down to the bottom.
The factor is again A[Rx][0] divided by A[0][0]. And then over all the columns running horizontally, we do this: A[Rx][Cx] minus equals factor times A[0][Cx]. So clearly, once you do this, this particular darkened row will have its first element zeroed out. And then we repeat for further rows. So in the second step, once this happens, the first column, except for the diagonal, has become entirely zeroes. Now you attack the second diagonal element and the column below it. So I want to turn the three yellow cells into zeroes. Then I attack the third column and turn those two yellow cells into zeroes, and so on. Now for this I need nested loops. Here the code was written only for zeroing out the first column. So now, in general, I want to zero out the dx-th column. dx runs from zero to size1 or size2; it's the same, the matrix is square, so zero to n minus one. And then our goal is to zero out A[dx+1][dx] through A[n-1][dx]. So look at what's happening here. This is A[dx][dx]. That is A[dx+1][dx]. This is A[n-1][dx]. So for the row to be zeroed, rx equal to dx plus 1, rx less than n, I find the factor as before. And then I run through the column loop, where I subtract factor times A[dx][cx] from A[rx][cx]. So this is rx, that is dx, whichever is the diagonal row. So this loop is clear. And the only problem arises if I try to divide by A[dx][dx], which is too small. So this time I do the pivoting ahead of time. So when I enter the dx loop, I check if the modulus of A[dx][dx] is too small, and then I do a pivot. And the advantage is that at this stage in the code, my matrix is looking like this in general. So here is the, okay. And then everything below this has turned into zeros, okay. And the problem is that I want this element to be non-zero, but it's turning out close to zero. So now, if this is dx, I'll try to do a pivot with some row below it. And this time, when I switch them, zeros will switch with zeros.
So there's no extra work to do. And I'll be looking for a non-zero element here if this turns out to be zero. So the pivoting is a little easier. And beyond being easy, beyond not having to repeat the dx-th row's work, you can actually show, for all matrices, that if all the elements here are exactly zero, then the determinant is zero. So there is no danger of saying, okay, I couldn't do good pivoting, and so on. The problem with this algorithm is that if the determinant is not exactly zero but very close to zero, then this strategy doesn't give you the most accurate determinant. To get the most accurate and stable determinant, you need to also be able to permute columns, but we won't talk about that. But this is good enough for most ordinary users. Okay, so now we'll start writing this code, and we'll hit this. We have already hit this problem that our code is getting larger and larger, the typical example. And I have to blow up the font size so that you can see it clearly from the back. So I can show at most ten lines of useful code per screenful. And it's getting more and more difficult to demonstrate long algorithms to you, okay? And it's not just for that reason. Even when thinking about the code itself, you can think more clearly if you break up your code into modules of action. So you do small pieces of the work, each in a function that roughly fits in one screenful, if not on the projector, at least on your working screen. Maybe 20 or 30 lines of code, if possible. And then the advantage is that you separate the specification from the implementation. You basically say, this function will swap two numbers for me. Or, this function will swap two things of an arbitrary same type for me. And then you can reuse that routine from everywhere.
Not only is reuse improved, but you write much more readable code, because when you're trying to swap rows rx and dx, you write it like swapRows(rx, dx). You don't actually jump into the implementation and distract the code reader from what exactly your intentions are. So you separate intention from how you do it. So we'll see examples of how that kind of code is written. And it's much, much more readable than things we have been writing recently. The other advantage is that, now that you have separated out the action into different modules or functions, it's much easier to debug them separately. So you basically create, think of it like, you know, plumbing. You have taps, you have angle joints, you have pipes, you have valves. Since you manufactured them separately, you can test them for pressure, and whether the washer is correct or not, whether the threading is done correctly, totally separately. When you're testing a pipe, you don't need to be concerned with whether the valve is working correctly. So all you need to ensure is that the threading is compatible, and that the pipe correctly screws into a valve, and the washer correctly goes between two flanges, and so on. That's called interfacing. So when you declare a function, you have to declare its parameter list and its return type and its contract with the outside world very carefully. Just like when good manufacturers prepare taps and valves, they will control the tolerance of the diameters and the thread pitch very, very closely, so that if multiple companies are following the same standard, then their products will fit and work together perfectly well. So that's the culture. That's the reason why we write modular code, and we write code to be reused by many different people. So ordinarily, other people who are just using your code should not need to look at your source code to understand what it's doing. The contract should be expressed much more clearly to the outside world.
In a length of text that's much shorter than the actual code. Otherwise, you have not really abstracted away anything, if it's necessary to read your code. So that's sort of the pitch, the motivation about why we do it this way. And for example, in getting started with all this pivoting and subtracting rows and everything, the very first thing we'll need again and again is the very basic swap function. And the swap function is actually very short; you just need three lines: temp = a, a = b, b = temp. But still, if you keep doing that, you'll allocate lots of temp variables all over the code. If you declare another variable by the same name as the temp variable, they might clash with each other, you never know, right? So it's a pain to write out the full code for the swap function all over the place, wherever you need it. So instead, suppose we want to declare a polymorphic swap function. Polymorphic means that it should work for variables of any type. I shouldn't care whether I'm interchanging two strings or two ints or two floats; it's the same story. So I start out saying template<class T>, which means that the following function is polymorphic: it should work for any type T, or class T. And then I start out the function as an ordinary function. I declare the return type, which is nothing, void; swap doesn't return anything. And then I give it two arguments, a and b, and they're both references to variables of type T. You can't forget the ampersand; otherwise, changes that happen inside the function will stay inside the function, and nothing will be seen outside. We'll see demos of that, and why that happens, in a moment. And then inside, the code is very simple. You say a variable temp of type T is initialized from a, a becomes b, and b becomes temp, the same story. But now the advantage is that you can call this swap routine from anywhere, on any pair of objects which are of the same type. So why do we need all these ampersands?
Let's look at the other set of slides. So, backing up for a moment, suppose I want to write this finding-determinants-by-Gaussian-elimination code. The code was getting too long and tedious, so the solution is to break it into smaller pieces. So we start off main, and we say matrix<double> A; we read A from cin; and then we say cout << determinant(A) << endl. So of course, determinant hasn't been defined yet, in general, unless you can get someone else's library. So what should determinant do? We have to define determinant before we can use it, just like square root and log and power were all defined before we used them. Now it's our job to define determinant. So what should determinant do? Clearly, determinant accepts as input a matrix of doubles; I'm passing it A. So to start with, I can say determinant takes an input which is a matrix of doubles and returns a value which is a double. Now, how is the function actually declared? You have to first specify the return type, double. Then you have to give the name of the function, by which you'll call it from elsewhere. And then there will be this formal parameter, which says that the input is a matrix called mat. As you have seen already in code fragments, it doesn't matter what you call mat. Even if it's the same name as something you invoke it with, it's actually locally defined inside here. For example, when you're writing a math formula, and you define F(a, b), suppose you write F(a, b) equals the integral, a from minus infinity to infinity, of g(a, b) da, say. This is a dangerous-looking equation. Why? Because it defines F as a function of a, but then inside, it redefines a. So when you talk about this a, this is that a; this a has nothing to do with this a. So this is a somewhat meaningless expression, because this outer a is not being used at all. So very similarly, whenever I call something, the name in the parameter list is mat here, and whenever you mention mat in the body of the function, it means this mat.
Even if there is a mat available globally, that's not the mat that you access. So the general rule is that if you have nesting in execution or in scope (every curly bracket opens a scope), and you define a variable mat inside it, the inner definition hides the outer definition. Similarly, when you call a routine from a caller, even if it has a variable called a. So suppose main declares two variables a and b, and then it calls f(a). But the definition of f may say int b. This b has nothing to do with that b. When you call it with a, it's a's value which is passed to it. Even though the name b clashes, it's not really a clash; the innermost definition takes precedence. So, coming back to this slide: you declare the return type, you declare the function name, and then there's a list of formal parameters. Here, there's only one element in the list, but in general, you have to state them one after the other. That ordering is significant. Whether you want to find a minus b is different from whether you want to find b minus a. So the list ordering is vital. So you could start off your implementation with that blue box. You can say double answer, and then you do some computation which fills in answer. Now finally, or even in between, if you hit an early-exit situation, you have to return from the function. Because this function promises to return a double, whenever you say return in your function, you have to make sure that you return with a double value. Now here again, C++ and C are absolutely evil, in the sense that if you declare a function returning an int, and you forget to return a legal int, it will return a garbage int for you. Maybe zero, but maybe not. Java is good enough to catch any place where control is falling through to the end without you realizing it: you have to return something on every exit path from that function. But C and C++ are not that careful. Yes? I don't know. I just fell into this trap, because I have not coded C++ for a long time.
I didn't have a return on some of the paths, and it just returned zero by default on g++, I think. So blocking every exit path from the function with a return is vital; otherwise you might return weird things. This is particularly bad; it shouldn't have been this way. It's easy to prevent this situation. So anyway, as you can see here, at the end, remember, in computing the determinant, I had this sign variable. And your final answer could be answer times sign. So what you return doesn't need to be a variable; it's generally an expression. And that expression will be found by whoever is calling determinant. So that's the contract. That's the piece of plumbing you have to create. It's like the valve or the transformer: it has a contract of input and a contract of output. I want my input in a particular format, maybe I want my inputs to satisfy some properties, and I promise to deliver the following output, which is a function of the input. That's the easiest way to think about functions. And the easiest functions to write and manage are those that do not change their input arguments. They compute a pure function of the input arguments and send it out. Now, of course, you can only do very limited things with that. In general, it's much easier to write code if your functions can modify the state of the world. But as soon as you start using that power, you have to be much more responsible about what you do. A function which changes the outside world has to be much more careful in writing down its contract with the outside world than a function which doesn't change anything outside. So this blue box is the implementation, ending with a return statement. By default, in C and C++, parameters are passed by value, which means that an explicit copy is made. Space inside the function is allocated for every variable declared as a formal parameter.
And when main initializes a to 3 and then calls fun on a, space is allocated inside the space of fun for x. The value 3 is copied into x. And then the function starts with a cell called x, initialized to 3. That's a different cell from a. And so when the function assigns 5 to x, that has no connection with the variable called a. They coexist. When the function ends, the storage for x is destroyed. Again, nothing affects a so far. So when we print a at the end of this, we'll again get the value 3. So the value 3 is copied from a to x, x is locally modified, and that has no impact on the outside world. And even if we rename this variable x to a, remember, from that integration formula, it doesn't matter. I mean, I could have used any old variable here, like gamma, and consistently changed this to gamma; the meaning of the right-hand side would remain exactly the same. Similarly, you can always systematically rename your formal parameters; whether or not they collide with names in the outside world doesn't matter. So even if x were renamed to a, you'd have exactly the same execution story, because main's a and fun's a would occupy different memory cells. They would not be in the same slot of memory. So that's of limited use. That's not very useful if I want to change the state of the outside world, which is a frequent requirement. So that's why C++ also provides the facility of pass by reference. The important difference is that big ampersand sign. So now you have to say that fun is a function which takes as input a reference to an integer, not an integer itself, and then assigns 5 to that. This time, the value is not copied. What is copied, internally, is the address at which a was allocated. So this time, when you initialize a to 3 and then you call fun on a, what happens is that a's address is implicitly passed into fun. And now, whenever you're accessing x, you are always accessing a itself, directly. There's no copying involved. So this is pretty efficient.
Even if you don't have any intention of modifying anything inside fun, if you do this with ints, it's not much of a gain, because passing an address is about as expensive as passing an integer. But if you're passing a string, and you have no intention of changing the string, there's no need to copy. So you can now pass only an address to the string. And that will save you a lot of time if the string is large. [Student question, partly inaudible, about how this relates to pointers, for instance a pointer to a character.] Yeah, we are not discussing pointers at all. Pointers and references are different, and I don't want to get into what a pointer is at all before finishing up references completely, because pointers are much messier. A reference is like: there's this variable a, it occupied address 45,000 in RAM, so implicitly, fun is told you want to operate on an integer whose address is 45,000. Don't think of it as a pointer yet. So give me a week. Next week we'll get into pointers, or the week after. Then we'll see what the difference is. So a reference is just where in memory a has been allocated. That's all. Now that fun knows it, fun can modify its state. So in this code, when x is set to 5, you are effectively changing a to 5. And this time, you will print out 5. If you want to pass things by reference for efficiency reasons, but you have no intention of changing x inside fun, you can say const int& x. Then if you make a mistake, and in the body of fun you accidentally try to change x, the compiler will flag an error and will not let you do that. So declaring something as const is a safety device in this case. So let's now look at a very simple demo. In the first function, I'm passing things by reference. So int& x, and I'm going to increment x.
Remember, because this is passed by reference, I'm actually not just incrementing x, but incrementing whichever cell was bound to x. And then I'm printing out the name of the function, the address at which x is allocated, and the value of x itself, and a newline. Meanwhile, the second function, which is by value, doesn't have that important ampersand. So this has copying semantics: the value will be copied into x, and a separate cell will be allocated for x. No separate cell was allocated for x in the by-reference case. I'll again increment x. After that, I'll output the name of the function, the address at which the cell for x has been allocated, and the value of x itself, and a newline. Now here is main, very simple. I initialize a to 5. In the main routine, I print out at what address a has been allocated, followed by the value of a, which should be 5. Then I call byRef. Then I print out again the value as found inside main. Then I call byVal. Then I again print out the value as found in main, and I finish up. Everyone clear with this code? First by reference, then by value. Each of them increments x by 1. So main says that a was allocated at this address and had value 5. Then control entered byRef, where x was incremented by 1, and then I print the address and value of x. So I find that the value has become 6 inside byRef. And note that these two addresses are exactly the same. So it's the same cell in RAM, and that's why 5 was increased to 6. When control goes back to main, I find that the value of a has indeed been incremented; main says 6 as well. Now I call byVal. And byVal says that x has this address, which is totally different from that one. So that means that inside byVal, a different cell has been created which stores x. Now, initially the value of that cell was set to 6, because I called byVal on a after the increment. So byVal happily reports that it has incremented the integer to 7.
But note that after it returns to main, there is no existence of byVal's x anymore, and therefore main still reports a value of 6. The key is that two different cells are involved here: the cell called a in the main routine, and the cell called x in byVal. Those are two different addresses. However, the cell called x in byRef is exactly the same as the cell called a in main. So that should totally clarify what's going on between by-value and by-reference. And both have their places. If you want a pure, simple function with small arguments passed in, it's much easier to avoid the risks of references and updates. If you explicitly want the function to change the state of the outside world, then you want to use references. Now, another use of references, apart from the efficiency of not copying the inputs, is that, suppose you are inverting a matrix. You could always say matrix<double> x = invert(y), where y is a matrix of doubles. That means that the function called invert has to allocate a new matrix and return that. You can do it, but that's inefficient. It's much better to allocate the space, pass it in, and let the function update it in place. That's another reason why people use pass by reference. So any parameter which is to be modified or passed out is passed by reference; any parameter which is a pure input is passed by value. So now, one important caveat in using references is the so-called aliasing problem. You need to be very careful about aliasing while using references. So take the following function. In the early days of the course, we said that there are these nifty ways to swap two variables without using a temporary. And you could do it with XORs, or you could do it with minuses, in case they were arithmetic or numeric types. So you say swap(int& a, int& b). So clearly, swap is a routine where passing by value is useless: I want to swap the state of two variables outside the function called swap. So I have to use references.
So I say swap(int& a, int& b). And inside I say a = a - b, b = a + b, a = b - a. So, how many people are already familiar with this form of swap, which doesn't use a temporary variable? By now, all of you should have figured out how to do it, because I mentioned in class that you can do it without a temporary variable. So let's trace what happens if I call swap with values 3 and 5. After the first step, a is changed to 3 minus 5, and b is still 5. So it's minus 2 and 5. After the second step, b becomes a plus b. So a remains minus 2, and b becomes 3. And after the third step, a equals b minus a, or a equals 3 minus minus 2, which is 5. So here is a spiffy way to not use a temporary and flip the two values. It's a very common interview question, or it used to be: how can you swap without a temporary? So it's nice. And if you pass parameters 3 and 5 to it, swap will work correctly. But what happens if, for some reason in the code, the two arguments just happen to be the same cell? So I have a single variable called x, which is set to 3. And because of some boundary condition, or because I didn't want to write the dx-dx loop separately from the dx-dx-plus-1 loop, I just invoke the routine swap with the same variable name twice over: swap(x, x). Now, an ordinary understanding of natural language suggests that if you call swap(x, x), nothing should happen. Swapping a variable with itself shouldn't change its value. But let's see what happens in this case. I start out with values 3 and 3, but remember, they are the same cell. So at the very first statement, when you do a = a - b, they both effectively become 0. Both a and b, as seen inside the swap routine, become 0. And thereafter, the remaining two statements just juggle the 0s around, and everything has been set to 0. So the correct operation of swap now depends on whether the two inputs are the same cell or different cells.
So references can introduce such tricky spots, and you need to be careful to avoid this if you need to. So use references with care. They are efficient, they can change the state of the outside world, and they're easy to code with, but they have this one caveat: aliasing can give you wrong results. So let's come back to how functions actually execute a little later. First, let's get back to our matrix examples and finish that off. So we declared the swap routine. We are not being adventurous or smart; we are just doing a dumb swap with a temporary variable, so we have no such danger. It will work in all cases. So the next thing we define, using swap, is this small routine called swapRows. What does swapRows do? Given a matrix of T's (again, it's a generic function; it can take in a matrix of any type T), I give it a mat, and I give it two rows, r1 and r2. And what it does is it goes over the columns of the matrix, cx equal to 0, cx less than size2, and it says swap mat[r1][cx] and mat[r2][cx]. Very simple to read. So here is another reason why looking for aliases and writing alias-proof code is important: in some elimination routine or inversion routine, you never know if r1 is equal to r2. If r1 is equal to r2, you don't want to mangle the two rows, or rather the one row; you want nothing to happen, basically. Now, of course, in this case you could check it. You could first say, if r1 equals r2, then return immediately. But sooner or later, you might make a mistake. So that's swap and swapRows. So now, let's see how to pivot. So, pivotIfNeeded. You pivot only if required. Since I'm going to check anyway, let's push the check inside the routine. So, the input arguments: first, I input a. But remember that a is not a pure input. If I need to pivot, then I have to change its state. Therefore, and also because a is a large matrix, I pass it by reference.
And by the way, I forgot to mention that if you have multiple parameters in a function, you can always pass some of them by reference and some of them by value. So I'm going to pass the first parameter, a, by reference, because it's both an input and an output variable: if I actually do the pivoting, I need to change the matrix. The row and column, or the diagonal at which I pivot, dx, is a pure input, so I can just say int dx. And finally, I have to also pass in this variable called sign. If I actually end up pivoting, I have to negate sign, and therefore sign is going to be changed. So it's also an in-out variable, and so it's also passed by reference: double& sign. What do I do first? I check if the absolute value of a[dx][dx] is larger than some magic epsilon. If that is true, if the value is already large enough, then I just return true and do nothing. So the agreement, the contract with the outside world, is that if it required no pivoting, or I successfully did a pivot, I'll return true. If I fail to find a good pivot, or any pivot, then I'm going to return false. This will alert the outside world about what's going on in the pivoting step, and it will help the caller to take the proper action after the return. So now I start looking for possible candidate rows, rx, for the pivoting. So rx equal to dx plus 1, rx less than the number of rows, plus plus rx. If the absolute value of a[rx][dx] is greater than epsilon, then I have found a pivoting row. So I call swapRows(a, dx, rx). So observe that this already makes the code much easier to read. So if the absolute value of a[rx][dx] is greater than epsilon, I just swap rows dx and rx of a. And then I negate sign: I say sign = -sign. And then I return true, saying I have successfully pivoted. At the end of the loop, if I exit normally out of the loop without an intermediate return true, it means that I have run through the rows dx plus 1 through size1 minus 1.
And I haven't found a single row which I can use. Therefore, this is a failed pivot, so I return false at that point. So, in summary, I return true if and only if pivoting was not needed or it was successful, and in the process I modify a and sign. So that's pivotIfNeeded. So I've defined several functions so far. And it sometimes helps, in large pieces of code, if you write down what's called the potential call graph. So main calls, say, determinant. Determinant calls pivotIfNeeded. PivotIfNeeded calls swapRows. SwapRows calls swap. So when you're reading someone else's code, it often helps to list all the functions it defines and the relationships between them. Now, in this setting, this is called the caller, when I'm talking about this edge, and this is called the callee: who is doing the calling, and who is being called. So before we get to solving a system of equations, let's look at the modular form of the code using all these functions. So first I declare this print function. The convention in C and C++ is that if you want to define a function later, after using it, you have to at least tell the compiler what the signature looks like, without the body, and then later on you fill in the body by declaring it again. Now, this is the pivotIfNeeded function. It's exactly the same as before; all I'm doing is inserting some print statements, nothing serious. This is written in a slightly rough way, with zx and so on, but you could do it that way also. And I'm not using the generic swap here; I'm actually explicitly doing the swap using temporaries. And after that, determinant itself: it initializes sign and then does the pivoting. So we have seen this last time; determinant works. But now let's go on to Gaussian elimination. What changes in the case of Gaussian elimination is as follows. In the case of determinants, I was only given one square matrix, and I had to keep track of the sign. I had to subtract multiples, or I had to swap.
In case of actually solving a system of equations by Gaussian elimination, I'm given an A, but I'm also given a y. And I have to find x which makes Ax equal to y. Now everything I was doing here, I continue to do there. But there are a couple of differences. First is, sign can go away. I don't need to take care of sign. I'm not finding the determinant. Exchanging two equations doesn't change the system. So I can lose track of that. But the new important thing is that as I am subtracting multiples of rows of A from other rows of A, or I'm swapping rows of A, I need to do the corresponding changes to y. Otherwise the equations become unhinged; A and y need to stay ganged together at all times. So I will redefine some of this pivoting and the linear subtraction to take y into account as well. Now obviously, when I do this subtraction, a[r][c] minus equal to factor times stuff, I also have to do it to y: y[r] minus equal to the same factor times y[d]. So that I have to write. So y is also going to be an in-out variable, passed by reference to pivot, just like a was earlier. So again I start out defining swap first, very simple. Swap_rows calls swap. Then pivot_if_needed. So what does pivot_if_needed look like? Sign is gone. Sign is no longer a parameter. The return type is bool, with the same meaning as before: pivoting not needed, or successful. And inside, I now pass both a and y by reference, because they're going to be changed in synchrony. Now if the absolute value of a[dx][dx] is greater than epsilon, I return true without doing anything. No pivoting is required. Otherwise, starting at zx equal to dx plus 1, up to the number of rows, if I found a pivot, I swap rows zx and dx of a, and I swap y[zx] with y[dx]. And I return true. So pivot_if_needed is almost the same. But observe that the call graph has now changed, so that pivot_if_needed also directly calls swap. And we're not talking about determinants now. We're talking about, say, solve. Actually it should be make_upper_triangular.
And later, after A and y have been modified so that A has become upper triangular, the second function that main needs to call is solve_upper_triangular. So that will be the more complete call graph, where main first calls make_upper_triangular, which calls those, and then finally, main calls solve_upper_triangular. So in designing code, it also helps if you first write out these call graphs and give names to your functions. Then you fill in what parameters you need to pass along them, on paper. And then finally, you implement them one by one, as shown in the code. Yes? You could do it that way also. But what you're trying to say is that if I have to do something long and complicated, I'm going to break off this chunk of it, make it a separate routine, and call it g. That's like chaining actions. Chaining is OK, but culturally you should think of breaking down the problem rather than chaining it. That often helps. So pivot is covered. So everyone comfortable with what pivot is doing? There's only a very small change. Sign is not taken into account anymore. And y is passed in and swapped in tandem with a. Now what does make_upper_triangular do? So make_upper_triangular runs down the diagonal index dx. dx goes between 0 and the size; the number of rows and the number of columns are all the same. Now first, I call pivot_if_needed with a, y, and dx. And I save the return value in a Boolean variable called status: whatever happened in the pivoting. If the status was false, then something bad happened. I couldn't pivot. I cannot continue. So I percolate that false status up out of make_upper_triangular. See that make_upper_triangular also returns a Boolean. Clearly, I've not documented it here, and we should if we want to pass on the code to anyone. Make_upper_triangular returns true if the matrix could be successfully made upper triangular, and returns false otherwise. So in case the pivoting failed, I would not succeed. And so I return status if status is false.
I could also just return false itself. And then I zero out the column below dx. And I do that in the usual way: rx equal to dx plus 1 up to the number of rows minus 1. Find the factor, subtract. But remember that after the column loop is done, I have to remember to also subtract the same factor from y. That's the important new line here. That line changes y in tandem with a. So that is the new make_upper_triangular function. So for pivoting y in sync with a, I've highlighted the new statement: swap y[dx] with y[rx]. And the nested loop now has to include a statement for y: y[rx] minus equal to fac into y[dx]. Same code, with highlights. And the last step, this is solve_upper_triangular. That just solves this equation here. So it solves backward from n minus 1 down to 0. And thus the value of x_i involves all x_j with j larger than i. That's the code here. So let's look at the demo. Solve_upper_triangular assumes that it's given an upper triangular matrix u. And it has a y. These are both inputs only. But I'm passing them without copying, by saying const and passing by reference. The solver doesn't change u and y in any way. And x is filled in. So it's a vector of doubles that I want to output into. Therefore, it's not const, and it's passed by reference. And then that small loop is implemented here. I run down from n minus 1 to 0, solving using u and y. And then what does main do? Main reads in a matrix, prints it, reads in y, the right hand side, and prints it. And then it first calls make_upper_triangular and gets a status. So this is part of the reason why you don't want chaining. You want some controller to actually invoke all the pieces. Star-shaped code is easier to read than chain-shaped code. So it makes a upper triangular. If that succeeds, then it prints out the upper triangular matrix. And then it solves the upper triangular system with a and y as inputs and gets x as output. And it prints out x. If it didn't succeed, then it says A is nearly singular. Or A could be nearly singular.
To be really sure, you have to do the best pivoting possible. We're not really doing that. So everyone comfortable with the Gaussian code? See, if I wanted to write this as one monolithic function called main, it would look so messy. And you'd completely lose your head thinking, what's happening? What are the indices? How many nestings are there? Where are all the temporaries? But all that has been, hopefully, rendered a lot easier to understand, because we broke down the problem into swapping single things, into swapping rows of matrices, doing the pivoting itself, turning something into upper triangular, and then solving an upper triangular system. So we declared five functions. Each of them was manageable, I hope. And then we stuck them together in that pattern to do our larger task. So we'll just get better and better at this. And for example, the Linux operating system has tens or hundreds of lakhs of lines of code. Surely if you had to hold all of that in your mind at one time, your brains would explode. So people break it down. People attack the problem by divide and conquer. They divide the problem into pieces and conquer them one by one. So if I run gaussian.cpp now, let me show you the output for gaussian. So here is the simple 3 by 3 matrix. Here is the right-hand side y. That is A. This is y. And I know the solution is 3, 1, 4. I took three variables, gave them values 3, 1, and 4, and made up three linearly independent equations. So I know the answer already. So I compile the Gaussian code. And then I run it. And remember, in main, it seems like I'm reading from cin and outputting to cout. Now there's this device in Unix, done with the greater-than and less-than characters; strictly speaking this is called redirection (the pipe proper is the vertical bar), but the idea is the same. If I do that, it means: run a.out, and whenever a.out reads the keyboard, instead pass it characters from this file. So when I do that, I get this output. A was the same as I created in the file called gaussian.dat. y was the same, as read from that same file.
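Here is the redirection idea in a form you can try anywhere. I'm using sort as a stand-in for a.out so the example runs without the lecture's files; the file names are made up.

```shell
# '<' feeds a program characters from a file instead of the keyboard;
# '>' sends its output to a file instead of the screen.
printf '3\n1\n2\n' > nums.txt        # create an input file
sort -n < nums.txt > sorted.txt      # stdin from nums.txt, stdout to sorted.txt
cat sorted.txt                       # display the captured output
```

With the lecture's program, the same pattern would read `./a.out < gaussian.dat > temp_gaussian.out`.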
u, the upper triangular matrix, reads like 1, 2, 4. That's the same first row. But then the second row has become 0, minus 1, minus 5. And the third row is 0, 0, minus 2. So it is upper triangular. And the solution is, as I designed, 3, 1, 4. So things are working correctly. Now you could also do the following. You could take whatever was being printed to cout and redirect it out, say to a file called temp_gaussian.out. So whatever was thought of as the keyboard becomes this input file. Whatever was thought of as the screen becomes this output file. So if I do this, you'll see nothing on screen. All that output actually went to a file called temp_gaussian.out. How do I see that? I say cat, for concatenate, with the same file name. Now if you cat it, you see that file. So redirecting in and redirecting out: the keyboard can actually be substituted by a file, and the monitor itself can be substituted by another file. So far, clear? So Gaussian elimination should now be fairly doable. And we're doing a decent pivoting. So generally speaking, you won't fall into trouble. So this is a fairly usable linear system solver for you. You can use it for whatever you want. Now we'll talk about inverses. Now clearly, if you can solve one linear system Ax equal to y, inverting a matrix is no big deal. Because we can conceptually just invoke the solver N times. The first time we solve it with the first column of the identity matrix. What we are looking for is a solution to, so suppose I want the inverse of A, let me call it B. I want A times B to be equal to the identity matrix. In other words, I would like A times the first column of B to be equal to 1 followed by all 0s. Let's call this B dot 0: the dot means all rows, and 0 is the index of the first column. Then I want A times the 1th column of B, the second column, to be equal to 0, 1, 0, et cetera. And so on. So I can solve these systems N times over. And I can collect those columns into a single matrix B. That would be the inverse. So first with y equal to 1, 0, 0 transpose.
Then with y equal to 0, 1, 0 transpose. And finally with 0, 0, 1 transpose. Now obviously, we don't want to triangularize A N times over. That's a very inefficient thing to do. The same work would be done multiple times. So we'll just pass in all the y's in one shot. So earlier, in the pivoting step and the subtraction step, we were passing in one y column vector. Now we're going to pass in all the columns in one shot. So let's see how that code works. By the way, there are two matrix inverse programs. matrix_inverse_1.cpp does it the brute force way, by actually solving a system with every column one by one. But today I'll only discuss the better one, matrix_inverse_2. Here's another example of the fact that you can actually use multiple namespaces, provided there is no collision: if different namespaces use different names inside, so that nothing inside the namespace std defines the same thing as inside this namespace, then you can use multiple namespaces. Makes life easier. So again, I write this generic swap. I just named it differently. I don't know why. Same as swap before. A print routine for matrices. Swap_rows, just like before. Those don't need to change at all. Pivot_if_needed, that is now going to change. Instead of having just A and one column y, I'm going to get another matrix of doubles. This is the right-hand sides, all those right-hand sides. That's why I call it R. And the pivoting diagonal dx. Those are the parameters now. dx is input only. A and R are input and output. So what do I do here? Pretty easy. This time I don't do swap on y. I instead have to do a swap_rows on R itself. So here I was doing swap_rows on A and swap on y. Here I'm doing swap_rows on A and swap_rows on R also. And otherwise, fail to pivot. Exactly the same code. How do I make upper triangular? This also changes signature, because instead of y, another matrix R is the input. So let's compare line by line. So this used to be a vector of double, y. This has become a matrix of double, R. Inside, it's very similar.
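To recap the column-by-column idea in symbols:

```latex
AB = I \quad\Longleftrightarrow\quad A\,b_{\cdot j} = e_j \qquad (j = 0, 1, \dots, N-1)
```

where $b_{\cdot j}$ is the $j$-th column of $B$ and $e_j$ is the $j$-th column of the identity matrix. Solving all $N$ systems against one shared triangularization of $A$ is exactly what passing all the columns in one shot achieves.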
a, y, dx used to be passed to pivot_if_needed. Now it's a, R, dx. Nothing else changes. And then making A zero below the diagonal, same logic. Here I'm doing this, there. And I can just reuse the column loop and tuck the R operation inside that same column loop. So whatever I was doing to pairs of y elements, I have to do it across the columns of R. So that's all. So I fold that into the cx loop. And I do the row-multiple subtraction for both A and R together. So complexity-wise, nothing much has changed. I mean, earlier each row operation touched roughly n things; now it touches two n things. So it's just twice as expensive. So inversion is no more difficult than solving one linear system. In the end, the time taken is very comparable. And I'm printing out things after the pivoting, et cetera. And then solve_upper_triangular. See, again, this takes an R; the initial R was the identity matrix, and by this point it has been transformed by all the row operations. So here solve_upper_triangular used to take a y. Here it's R. And how does this work? So this needs a little bit of discussion, perhaps. So if you look at this story, once I turned the matrix A into upper triangular form, I'm looking for this x, and I knew a y. So this was known, this was known. I worked my way backwards. First I found this, using this and that. Then I used this equation to find that. And I worked this way. All that changes is that now this has become a matrix. This is the same U for all of them. This is the inverse that I want to find. And this is I, which has been transformed along the way. So again here, I will first solve the last row. This is the cx. Instead of solving one thing here, I'll solve the entire last row. Then I'll solve the entire second-last row, and work up in this matrix. And this will finally turn out to be the inverse. So all that changes in the code is that there's the cx loop outside. Remember, in the old code, there was just the ix loop, ix walking back from n minus 1 to 0. Now there has to be both a cx loop and, inside that, the ix loop.
Other than that, it's absolutely similar. There's no difference. So you can look at this a little more carefully offline. But once we have all these pieces, it's very easy to write up main. I read A. I save it for verification purposes. save_a is just a copy of A, so I can simply say save_a equal to A. Then I set up an R, which is the same size as A. A is square anyway. And I initialize R to the identity matrix. How do I do that? For every row, for every column, I first set it to 0. And then I set the diagonal to 1. So R starts out being the identity matrix. I print it out for verification. Then I make A upper triangular, while also passing in R to undergo the same row transformations. If making the matrix upper triangular succeeded, then I am happy and I go forward. I print the upper triangular matrix. Now I call it U while printing it. Then I print R to see how it has been mangled on the way, through the row operations. And then I set up a matrix called a_inverse, same size as A. Then I solve upper triangular: I pass it A, R, and a_inverse. Then I print out a_inverse. And finally, I have this saved A. I multiply it with a_inverse to verify whether my inverse calculation was correct or not. And then I print out a_inverse times A as a matrix. We'll see what that prints. And I finish up. Otherwise, if the upper triangularization does not go well, I print that A may be nearly singular, and I don't do anything else. So is this too fast? Or are most people still tracking this code? Show of hands, please. Are you reasonably comfortable with how this coding is going? The summary is that I again started from generic swap, swap_rows, pivoting. I just kept modifying it slightly from step to step. How to make a matrix upper triangular while modifying its right-hand sides properly for the inversion. How to solve the upper triangular system, giving the inverse. So let's run matrix_inverse and see. So I'll pass it various easy things. So there's this file called easy.dat. The matrix is 1, minus 2, 3, 5.
And I cannot overemphasize the importance of small examples and testing your code. To this day, when I write my code, the first version is always buggy. And the only way to make it less buggy, you'll never make your code non-buggy, is to write out pencil and paper examples, run them through the code, and check that you're getting all the correct values at the intermediate steps. There's no bypassing this. You have to design test cases, run the test cases, and see by hand that the code is doing what you want. So the matrix is 1, minus 2, 3, 5. This is A. So working it out longhand, without depending on fancy row operations, A inverse is going to be 1 over ad minus bc, which is 1 over 5 plus 6, that is 1 over 11. And what's inside? d, minus b, minus c, a: that is 5, 2, minus 3, 1. So let's check: suppose we take 1, minus 2, 3, 5 and multiply it by 5, 2, minus 3, 1. We get 5 plus 6, which is 11. This entry is 0. That is 0. And we get 6 plus 5, which is 11 again. So it is fine. Meanwhile, inside, what should happen is that the triangularization actually gets 1, minus 2, 3, 5, and it also gets 1, 0, 0, 1. This is the A part. That is the R part. We can see what happens. So we start out with this combination of A and R. And there's only one row subtraction that will happen. We don't need a pivot. So what happens? The factor will be 3. I have to take away three times the first row from the second row to zero out the 3. And therefore, the 5 will become 11. So this will turn into 0, and that will be 11. And on the R side, the second row 0, 1 becomes minus 3, 1. So A was 1, minus 2, 3, 5, and R started out being the identity matrix. The pivot check on diagonal 0 doesn't need to do anything, and after the elimination step I get 1, minus 2, 0, and 11. The pivot check on diagonal 1 doesn't need to do anything either. So the U is 1, minus 2, 0, 11. And R becomes 1, 0, minus 3, 1. And A inverse will be found to be what we thought: 5 over 11, 2 over 11, minus 3 over 11, and 1 over 11.
And A inverse times A is found to be 1, 1 on the diagonal, and then very tiny things off the diagonal, for numerical reasons. How about some more difficult things? determinant1.dat. Remember, this was a 4 by 4 matrix. We knew it was non-singular. The determinant was coming out to be non-zero. So this should work. So it prints out a lot of stuff while doing the pivoting. We don't need to look at all that. At the end, A inverse times A, that's what's important. It's all 1 on the diagonal. And the off-diagonals are all tiny numbers. How about the second 3 by 3 example we had, with the pivoting problem in it? Remember, this hit a problem during pivoting. So if we feed it this matrix, it's immune to at least this kind of a pivoting problem. So it actually does a swap during the pivot. The summary is that A inverse times A is again essentially the identity. So far, we have these summaries: we started out with matrices. We learned how to do operations on them. And then we started making our code modular as we coded up more and more difficult things: finding determinants, solving systems of equations, and finding inverses. So now, I'm going to spend the remaining time talking about how functions are actually executed on your computer's hardware, because then we'll have to go into recursion and other things. So what actually happens? Say you have a main function: you initialize a to 3, then you call fun on a, and you output a. What actually happens in the machine? So just before executing the implementation of fun, the computer has to somehow remember what to do after fun returns. That's the tricky part. And the reason is that, as we've already seen in our example code, you could first call f, f can call g, g can call h, and they then return. So you have this complicated protocol of having to remember what's left to do after each function returns. Now, let me not get into this stuff directly. Let me give you an example first.
So I'm reading a paper with a student in my office. Suddenly the door opens, and an office staff member comes in saying, I have to get this paper signed. So I place those papers to be signed on top of the paper I was reading with the student, and I begin to sign, when suddenly the phone rings, because this is my life. The phone rings, I put down the pen, I take the call, I speak, hopefully briefly, and then I hang up the phone. Then I finish signing the papers, then I give them to the office staff member, who goes away, and then I return to the student meeting. This is called the last in, first out protocol. It's not particularly fair to the student, but this is the only way to preserve sanity. So finish off something small, let it go away, then resume whatever you have interrupted. So function calls in any language, C and C++ in particular, will happen in exactly the same way. So we have this joke about certain faculty members that the best way to get what you want from them is to actually interrupt a large meeting, and then quickly get away with it. I'll not name them. So how does the computer keep track of what remains to be done? The main device that's used is called a program counter, or the PC. Now internally, the story of the PC is very complicated, so I'm trying to give you a first gentle introduction to it. So suppose you have this kind of statement sequence. You read cin into a variable called fa, the Fahrenheit temperature. Then v2 equal to fa minus 32; then you multiply by 5 and divide by 9, and that gives you centigrade. So roughly speaking, I'm writing the code at a register level. Not too much is happening in every statement; one concrete operation is happening in every statement. Then I say, if centigrade is less than 5, then you print cold, and after that you calculate the Kelvin temperature as centigrade plus 273. Now let's assign an ID to every such operation that I'm doing here. So the first statement, reading in the value, is called statement number one.
The second one is called statement number two. And seven is not really a real statement; the only thing seven might have done is deallocate the storage of anything you declared here. So we'll forget that. Now the CPU actually has an extra register called the program counter. That always stores the ID, the red number, of the statement that has to be executed next. And in ordinary operation, as the CPU goes through statements, executing them one by one, that counter is incremented by one. But sometimes that counter can do funny things. For example, this if statement may result in six being executed or not executed. If six is not executed, then we'll skip it. So the PC will trace the values 1, 2, 3, 4, 5, 8, because the test turned out to be false. If the test turned out to be true, then the PC will trace the path 1, 2, 3, 4, 5, 6, 8. Seven is not a real statement here. Now how about loops? Suppose I have this code where you read in n from cin, then you initialize sum to zero, and then you keep on summing up squares: sum plus equal to i into i. And then you output the sum. If the input is four, then the program counter will take the values 1, 2, then the pair 3, 4 four times over, finishing finally with 6. So the history of the program counter gives you a full trace of what happened in the code. Now actually the PC needs to count the statements at the level of single machine instructions, not whole C++ statements, because you could write a complicated expression, and you may need to load it into registers. So the actual PC implementation is at a much lower level. I'm showing it at a higher level just for simplicity. Now a key use of the PC is in implementing function calls. So let me show you what happens. Suppose I have this code where I declare x equal to minus 3, y equal to 5, and z equal to absolute value of x minus absolute value of y. That's the code. And abs is implemented here: int abs(int a); if a is greater than or equal to 0, return a, otherwise minus a. So first, s1 is executed. Then the PC is incremented to s2.
Inside s2, a is bound to the value of x, which is minus 3, copied from x. We've already seen that. And then control goes from this point to abs. It executes that function and returns right here. Abs is executed, and returns 3. Now, for the second time, a is bound to the value of y, which is 5. And control again goes from here into the same routine there. It comes back with the value 5 this time. And finally, the computer has to remember to subtract 5 from 3 and assign it to z. So minus 2 is assigned to z. The execution of s2 is completed. And then control is transferred to s3. The point is, you need a mechanism to remember this pending work. And that mechanism is called the stack. So remember the stacking of papers. Similarly here, the program counter will be saved on a stack to remember where to return to. And that we'll discuss in the next lecture.