Today, we are going to continue with functions. In particular, first we will look closely at how functions are executed: what is the relationship between the caller of a function and the function that is called, the callee? How do they exchange data between them (parameters, return values), and how is control managed? That is, how do you remember what you are doing, jump into the function, execute the function, and go back to whatever you were doing? We will look at that in detail. Then we will look at separate compilation, where you do not put all your code in one file, but split it across multiple files, because the project is large, or you want to reuse parts of code across different projects, or whatever your reasons may be. And then we will see an interesting example of all this in an image processing program. Some of you have used GIMP, the image processing tool on Linux, and some of you have used Adobe Photoshop, which is a Windows application for manipulating images. So you know that you can do brightness and contrast correction, crop the image, and do all kinds of things. Today we will see two examples of image processing. One is detecting edges in the image: foreground, background, where the boundaries of objects in the image are. The second is contrast correction and enhancement: if your photo was taken in suboptimal, less than ideal lighting, how can you correct the brightness profile or the contrast profile of the image? If we manage to squeeze all this through, then we will talk about make, which is an essential tool for managing separate compilation. If you split up your code or project into multiple files, it's much better to use make than to try to keep things consistent manually. Recursion will probably go to the next lecture.

So, a very quick summary before we get into the execution of functions. A function is declared by a return type, the name of the function, and a list of formal parameters, which are bound to arguments when you actually call the function from what's called a call site. The same function can be called from multiple sites. For example, in the lower part of the code here, you see that abs is being called twice, once on x and once on y. Those are two distinct call sites, but to the same function. And we saw how to pass parameters to functions. The default in C++ is pass by value, which makes a copy of the caller's state into the callee's variable. Whatever the callee does to that variable inside the function stays inside the function; it is not reflected as a change in variable values in the outside world. If you want that to happen, you have to use what's called call by reference, where you specify that x is not actually a copy of a value in the caller's state, but refers to the same memory cell as the caller's variable. So x and a here are aliases for the same memory cell, and that's why changes you make inside fun will be reflected in main's variable a when fun returns. This much we have seen, and it's fairly clear to us. And there are pitfalls in using pass by reference: if there is aliasing, certain functions that you wrote assuming there would be no aliases among the input parameters may fall apart. So you need to be careful about this. Now, what happens when you write main, and from main, you call fun1? For simplicity, let's consider pass by value for the moment.
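As a quick refresher before we trace a call, here is a minimal sketch contrasting the two parameter-passing modes just described. The names abs_by_value and negate are illustrative, not the lecture's exact code.

```cpp
#include <iostream>

// Pass by value: 'a' is a copy of the caller's argument.
int abs_by_value(int a) {
    if (a < 0) a = -a;        // changes only the local copy
    return a;
}

// Pass by reference: 'x' is an alias for the caller's variable.
void negate(int& x) {
    x = -x;                   // this change survives the return
}

int main() {
    int x = -3, y = -5;
    int z = abs_by_value(x) - abs_by_value(y);  // two call sites, one function
    std::cout << "z = " << z << ", x is still " << x << "\n";
    negate(x);                                  // now x itself changes
    std::cout << "after negate(x), x = " << x << "\n";
}
```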
So, just before executing the body of fun1, the computer has to note down, in a special area of RAM, what to do after fun1 returns. And because the same fun1 may be called from different sites, what you have to remember to do after the return changes depending on which site you call from. So you can't record it in just one global place, statically; you have to do it dynamically, depending on where you're calling from. The data structure used to store this pending-work information is called a stack.

In real life, a stack is nothing foreign. Suppose I'm reading a paper with a student when the door opens and office staff comes in to get some papers signed. I mark exactly where in the paper I was reading, maybe with a marker or a pencil. Then I stack the papers the office assistant brought in on top of the paper I was reading, and I start signing them. At this point the phone rings, and I suspend the signing to answer the call. I listen, I talk, and I have to remember how many signatures I've done and how many more I've got to do, and from where. Then I put down the phone, remember what I was doing with the papers I have to sign, finish signing and dating all the places, and give them back to the office assistant. At that point the signing job is popped from the stack: you push onto the stack, you pop from the stack. I pop that paper from the stack, and then I realize that I was reading a paper with a student. I pick up where we suspended reading that paper, and we continue. This paradigm is called last in, first out. It's not particularly fair. It's not like a queue, where people get serviced in the order in which they arrived; it's the opposite of that. You get serviced and leave the system first if you are the latest person to arrive. And that's the correct semantics for function calls: main calls fun1, main is interrupted, fun1 starts; fun1 calls fun2, fun1 is interrupted, fun2 starts. Exactly like that, and the return is similar. So this is what a stack looks like in real life, and that's sort of my life.

Now, the question is how to implement all this in a computer. One of the key devices required for implementing and using the stack is the so-called program counter, or PC. The PC is a special register that your C++ program cannot directly access; the C++ compiler and runtime system get only very restricted access to it. The default action of the CPU is to look at the PC, fetch the instruction from wherever the PC points in RAM, execute that instruction, and increment the PC by one. If there's no special instruction to change the PC, that's what the CPU does forever and ever. As soon as you start up the PC (your personal computer, not this PC, the program counter), the CPU boots up, and after that it enters this infinite cycle of fetching the instruction at the PC, executing it, and incrementing the PC. The only times the PC is changed, other than by incrementing, are when you execute control statements, like if-then-else or while loops. Then the PC is changed in more interesting ways. For example, here, depending on whether centigrade was less than five or not, the PC may be changed from five to six, or five to seven, or five to eight. (Seven has really nothing on it.) That's how the program counter guides what the computer has to do next. In the case of loops, it's similar.
I start off with the PC equal to one, and I sequentially clock it up until I reach four, when, depending on the condition of the for loop, the PC may loop between three and four alternately, checking the condition and executing the body, until it exits to six. This is a rough account: in an actual computer, each executable instruction is much simpler than this. You can't do such complex operations in one instruction.

When your program loads up in any modern operating system, it's given three memory segments. One is called the code segment. That is the one that stores compiled executable code, the binary code, as in a.out. Your program is also given a data segment, also known as the heap, because it looks like a really disorganized jumble of stuff. It is used for storing things like the strings, vectors, and matrices we have been looking at so far. We'll discuss the heap in more detail later, when we talk about dynamic allocation, new and delete, and pointers. The third segment, which is what we will focus on today, is the stack segment. This is memory used for communicating between callers and callees, and for keeping track of pending work until the callee returns. That's the purpose of the stack segment. So when a.out is loaded into memory for execution, these three memory segments are separately allocated and given to your process to use. This may also explain why, if you make an illegal access to memory, the message printed out is "segmentation violation": it means that you went outside the data or stack segment, or read the code segment, or accessed the stack segment in an illegal manner.

So the stack is actually a region of RAM. Suppose it starts at address 4000, arbitrarily chosen. That's called the base of the stack segment. There is a special register called the stack pointer, or SP, which starts out pointing to that base, 4000, signifying that the stack is empty: you have not made any calls yet. Now suppose you have a function main, which calls fun1, which then calls fun2. When main calls fun1, something called an activation record is pushed onto the stack, and the stack pointer increases from the base of the stack to just above the activation record. That activation record (we'll study what it looks like, roughly, in a moment) manages the communication between main and fun1, and also allows fun1 to correctly return control to main. Now at this stage I'm executing code inside fun1, when fun1 decides to call fun2. What happens is that another activation record, which manages the communication between fun1 and fun2, is pushed on top of the lower record, and the stack pointer increases again. The general rule in calls and returns is that you should never need to access any activation record below the topmost one. You're only interested in the topmost activation record, because the imminent return only affects the topmost record. That's why it's a last-in, first-out order. So the stack grows, and suppose fun2 does not call anything else; it is self-contained. It completes its execution normally and returns to fun1. At the end of that process the stack pointer decreases, and that record logically disappears. Now control has come back to fun1. Say fun1 doesn't call anything further; it returns normally to main, at which point the stack pointer comes back to the stack base, and main resumes. That's how the stack is managed.
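You can actually watch the stack grow and shrink from C++. A small sketch, with hypothetical names: each call pushes a fresh activation record, so the address of a local variable marks roughly where the top of the stack is. One caveat: on most common platforms the stack grows toward lower addresses, so the three printed addresses typically decrease, even though we drew the stack growing upward from 4000.

```cpp
#include <iostream>

void fun2() {
    int local = 2;
    std::cout << "fun2's record is near " << &local << "\n";
}   // returning pops fun2's record; the stack shrinks again

void fun1() {
    int local = 1;
    std::cout << "fun1's record is near " << &local << "\n";
    fun2();                   // fun1 is suspended until fun2 returns
}

int main() {
    int local = 0;
    std::cout << "main's record is near " << &local << "\n";
    fun1();
}
```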
Now let's see what the activation record looks like. First we'll look at the design of the callee, and then at how the caller invokes the callee. Our high-level function, the callee, is the abs function: it takes an input parameter a, which is an int, and returns a if a is positive, otherwise minus a. The gray box at the top shows the high-level code. The activation record is a region of memory with the following fields. It has a field called in, which holds the value of a on entry into the function. If the function changes a, that change is reflected only in the in field; it's like another cell of memory. Then there's another named field called out. When abs returns, whatever value it's trying to return, it writes into that slot called out. This is the space for the return value to be consumed by the caller, whoever the caller might be. And finally, there is a slot to store the code address to jump to when execution of abs completes; call it todo.

Suppose the compiled low-level code for abs is organized in memory from address 2000 onwards, so somewhere there's a mapping table which says abs is at code address 2000. At that address, there's an instruction saying a = stack.top.in. I've not discussed the dot notation yet, but intuitively it says: use the stack pointer to access the topmost activation record on the stack, and read its field called in, which the caller is supposed to have filled in with a value. The instruction at address 2001 says: if a is greater than or equal to 0, bypass the instruction at 2002 and go directly to the instruction at 2003. At address 2002, you just negate a (a is like a register here). At 2003, you have basically finished what you had to do in the abs function. So you say: I still have the stack pointer, SP, which gives me the top of the stack; in there, write the value of a into the field called out, because I'm passing that value out to the caller. And finally, whatever the caller passed me as the todo, the place where I have to continue execution, change the PC to stack.top.todo. That means the next instruction to be executed will come from whatever that todo address was. I will not just drop down from 2004 to 2005; instead, I will update the PC to whatever address the caller passed in.

This will become clearer once you look at the design of the caller, which is more complicated because there are two calls to abs. (The code for the caller: you might get a crick in the neck, but that's the only way I could fit it on the slide.) So main initializes x to minus 3 and y to minus 5. Then it says int z = abs(x) - abs(y), and finally int v = z * 2. Just sample code; there are two calls to abs. Suppose main is compiled into executable code starting at code address 1000; again, some mapping table says that main starts at 1000. At 1000, you set x, maybe a register, maybe a memory cell, to minus 3. At 1001, you set y to minus 5. With the instruction at address 1002, you prepare to call abs for the first time. The preparation for a call is shown in yellow, the actual call step in pink, and the cleanup after the call in green. Every call consists of a prologue, the call itself, and an epilogue. So what you do to start the call, at the instruction at address 1002, is push an activation record onto the stack.
This automatically updates the SP. In that activation record, you set in to the value of x: this is where you make a copy of minus 3 into the activation record. You leave an empty slot for out, which the callee will fill. And you set stack.top.todo to 1005: you fill in the return address 1005, which is the beginning of the cleanup code. Then you drop to the next instruction, 1004, where you just change the PC to 2000, because that is the start address of the abs routine. Now, as we traced in abs, stack.top.todo is 1005, so control returns to 1005. At that point, you copy stack.top.out, whatever abs returned, into a register, R1. Then you pop the activation record from the stack, which decreases SP, and the record is gone; the stack is now empty in this particular case.

In this particular program, after evaluating abs(x), the very next thing you have to do is evaluate abs(y); there's nothing in between. You immediately start preparations for the second call, which again pushes an activation record onto the stack. This is logically a new activation record; the space in the stack is reused, but that doesn't matter. This time, stack.top.in is initialized to y, which happens to be minus 5. And this time, stack.top.todo is not the same as before: you have to return to address 1010, not 1005. This is a different call site, so the return will happen to a different place. Then the instruction at 1009 says jump to abs, that is, to 2000, the same address where abs starts. So abs runs exactly as before, except the return happens to a different place: control returns to 1010, at which point we save stack.top.out in a different register, R2. Now R1 holds abs(x) and R2 holds abs(y). I pop the activation record from the stack and discard it; SP returns to the stack base, and there is no pending work to do except what's here: z = R1 - R2, and finally v = z * 2. That's how the code is executed.

So, is everyone clear on how function calls work? Show of hands, please. Quick show of hands, just to make sure. It's fairly simple, and the stack makes sure that no matter how deeply you nest your calls, as long as there is space in the stack segment, you can keep calling and returning. Through the execution of your code, the stack segment grows and shrinks, grows and shrinks, as you nest function calls, and in the end the stack becomes empty again.

The other thing the activation record has to do, which I haven't mentioned, is save and restore register values. Every call involves suspending the caller, executing the callee, and then resuming the caller. Remember that the CPU has a fixed, small set of registers, typically 32 or 64. If the body of the function is fairly large and complicated, you're pretty much juggling those registers to get your work done: moving things in from RAM, pushing things back to RAM, and so on. Now these registers are like global variables. The caller was using them. If the callee starts using them without regard to what the caller was doing with them, the callee will clobber the values in the registers. You can't tolerate that: the caller doesn't know what happened, suddenly register values change, and everything becomes corrupt. So on the activation record, we also have to allocate space to save all the registers before the call.
And after the call, before the caller resumes as if nothing happened, we have to ensure that nothing happened, by copying the saved register values from the activation record back into the registers. The CPU provides hardware-assisted, fast instructions for doing all of this. You can tell the CPU: here's an activation record, save all registers; and that happens very fast, in a few clock cycles. Then you can push something onto the activation stack, save the PC; all of these have single instructions. Even so, a call and a return involve a substantial amount of complication, as you have seen: a lot of bookkeeping, allocating storage, writing things, saving things, reading them out, putting them in registers. These are the overheads of function calls. Generally speaking, if you write functions whose bodies are extremely small, the inefficiency of the function call itself starts to become significant; if the function body is very large, it doesn't matter. There is one mechanism provided in C and C++ called the inline function, where the compiler tries to establish that the body of the function is very lightweight, and avoids the explicit call through the stack mechanism: it short-circuits that by inlining the body of the function at the call site. But that makes the code larger, because you're now copying one implementation to multiple call sites. So there's a trade-off between making your code larger and incurring the overhead of the function call. That's something to be sensitive to if you're doing very high-performance coding.

So now we will get into separate compilation units. Thus far, we have placed all our code in one C++ source file. In fact, in the beginning we wrote our entire code in one main function, and that got really ugly, so we started splitting things into multiple functions. For example, we wrote a function to print matrices. Then we wrote other code to find determinants, to do Gaussian elimination, to find inverses. And for debugging and understanding that code, we kept calling the function we had written to print matrices. Remember, at some point I was writing Gaussian elimination, and I had a matrix printing routine in that file. Then, when I started writing determinants, I copied that function over to the other file, so that I could also print while finding determinants. Now that's a loss, because I wrote the code in two different places: maybe one of them has a bug; I fix it; now I have to fix the other one too. The compiler has to keep compiling two copies of the same code, when it might have done it just once. And finally, I can't easily share the matrix printing utility I wrote across multiple projects. So it makes much more sense to separate out key functions like that, which can be reused many times from many places, and export them so they can be used from multiple projects.

That is done by writing two files. One is the .hpp file, or .h file; it doesn't matter. So matrixprinter.h, or .hpp, provides a template, or signature, of the function, which tells potential users how to call it. The header file, the .hpp, does not specify how matrix printing is implemented. The implementation itself is done in a file called matrixprinter.cpp. That's a normal C++ file. Now, suppose I had these two other projects, Gaussian elimination in gaussian.cpp and matrix inversion in inversion.cpp. Those will now include matrixprinter.hpp.
And then each can use any function that's declared inside that header file. The best way to understand this is by looking at code, so let me code all of this up. I will not actually write the code for printing the matrices; there will be empty function bodies, just to show how the interfacing is done.

Here is the header file. I want to declare a print routine called CS101 print. It doesn't return anything; it just prints a matrix called mat, a matrix of doubles. Because the printing routine does not change the matrix in any way, it's best to pass it by reference and declare it const, so that any caller can be reassured that the routine won't change it. Now, because I'm using the boost matrix class, I have to include the boost library in this header file. And then, in this case, I switch the namespace to boost::numeric::ublas so that matrix becomes visible; otherwise matrix won't be directly visible. The other option is to not open the namespace, but to write boost::numeric::ublas::matrix. That would be pretty much equivalent to the namespace declaration. But for simplicity, let me switch the namespace. It doesn't hurt; same thing, no change. So matrixprinter.hpp does not give an implementation. Observe that it's a declaration of the function: where the curly bracket would have opened, there is nothing but a semicolon. It's just a declaration. Meanwhile, matrixprinter.cpp gives the implementation. matrixprinter.cpp, of course, has to include iostream, because we'll use that for the actual printing, and it also has to include the header that I've declared. Then it switches namespace again, although strictly speaking that's not necessary, because the header file already does that for you. And then comes the actual definition of CS101 print. For simplicity in this demonstration, I'm not actually bothering to print the matrix; I'm just printing "hello". So matrixprinter.cpp and matrixprinter.hpp together form a module that can now be reused by other people. Preferably, your .hpp file should have comments saying what happens to the input parameters and output parameters, and so on. Technically, the caller, the client, should not need to read the .cpp file at all; they should only read the .hpp file, and perhaps manual pages, to understand what your package does.

The next step is to pre-compile matrixprinter.cpp into binary code. Now, matrixprinter.cpp does not have a main function; it's not something you can invoke from the command line. Only C++ programs that have a main function can be run directly from the shell. This one doesn't; it just provides one function, CS101 print. To compile it, you say g++ -c matrixprinter.cpp. The -c says compile only: don't try to make it into an executable, because you can't, as there is no main function. If I do that, it finishes the compilation after a moment, and what gets created is this file called matrixprinter.o. This is an object file. It's not quite executable; it's something that can be linked with other executable things and called. What does it contain? A compiled version of matrixprinter.cpp. So it cannot really be printed: the system will protest that it might be garbage, and if you demand to see it anyway, it's all kinds of junk, not readable. But it's a codified form of matrixprinter.cpp. In fact, if I run strings on matrixprinter.o, you find that our "hello" is embedded somewhere in there, because I was printing it out.
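For reference, here is a sketch of the module just described, along with the client we are about to look at. The exact identifiers and the include guard are my reconstruction, not the lecture's verbatim files.

```cpp
// matrixprinter.hpp -- declaration only; clients include this file.
#ifndef MATRIXPRINTER_HPP
#define MATRIXPRINTER_HPP

#include <boost/numeric/ublas/matrix.hpp>
using namespace boost::numeric::ublas;

// Pass by const reference: no copy is made, and callers are assured
// that printing will not modify their matrix.
void CS101_print(const matrix<double>& mat);

#endif
```

```cpp
// matrixprinter.cpp -- the implementation; compiled once with g++ -c.
#include <iostream>
#include "matrixprinter.hpp"

void CS101_print(const matrix<double>& mat) {
    std::cout << "hello\n";   // stand-in for the real printing loop
}
```

```cpp
// caller1.cpp -- a client; it sees only the header, never the .cpp.
#include "matrixprinter.hpp"

int main() {
    matrix<double> a(3, 3);   // not even initialized; the print ignores it
    CS101_print(a);
}

// Build steps, roughly as typed in the lecture:
//   g++ -c matrixprinter.cpp                       -> matrixprinter.o
//   g++ -c caller1.cpp                             -> caller1.o
//   g++ -o caller1.exe caller1.o matrixprinter.o   -> caller1.exe
```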
Now at this point, the output of your library is matrixprinter.o and matrixprinter.hpp. You should never need to compile matrixprinter.cpp again; you have done it once and for all. Now let's look at a customer, caller1.cpp. caller1.cpp is very simple. It just includes matrixprinter.hpp. It switches namespace; again, that's redundant, but in some cases you'll need it. And then in main, it allocates a matrix. I'm not even bothering to initialize it, because CS101 print doesn't actually use the matrix; but you can. And then it calls CS101 print on a. Observe that in this particular file, there is no implementation of CS101 print; the information about the signature of CS101 print comes from that header file. So, if you want, you can again compile-only caller1.cpp. But caller1.o has an unresolved reference: if you try to do the final compilation of caller1.o alone, g++ will complain that there is an undefined reference to CS101 print that it cannot find. To actually make this work, you have to also give the matrixprinter.o file as input to g++. Now it can bind the two together into an executable file. This step is very fast, because those object files are all pre-digested, half-digested; it just combines them into the executable. So now a.out has been prepared, and if you run a.out, it prints hello, which is what it's supposed to do. So I've now separated out the two object files. The other thing you could do, which some of you may have explored already, is that when you do this final linking, you can say that you want the output in a different executable, called caller1.exe. So instead of a.out, caller1.exe will be created; it's exactly the same file as a.out, actually. So we learnt about two flags: -c, which does the preliminary compilation of your source but doesn't prepare it for execution; and -o, with which we can pull multiple .o files together into one executable file, with one main function somewhere. Is this mechanism totally clear? Any questions on this so far? Of course, nothing prevents you from writing a second program called caller2.cpp and using CS101 print from caller2, caller3, and so on. No multiple compilations or copies are made of the CS101 print function itself; it is reused from one .o file.

We'll come back to make in a moment. Let's first look at the image processing example. It's actually the same style here. Someone has written a header file called EasyBMP.h and, correspondingly, an EasyBMP.cpp. EasyBMP is a package which lets C++ programs read bitmap images. Some of you have seen that you can download image files, or your camera takes pictures and saves them as .bmp files; these are bitmap images. EasyBMP is a library, a package, which lets you read an image into memory and write an image back out from in-memory arrays. Effectively, EasyBMP turns an image into three in-memory pixel matrices, with red, green, and blue intensities. As before, we will compile EasyBMP.cpp to EasyBMP.o ahead of time, and once. And then we have two applications. One is edge.cpp, which does edge detection; the other is enhance.cpp, which enhances color contrast. Each of them will reuse EasyBMP.o.

Now, for those of you who are not totally familiar with how images work, here is a very short refresher. If you look at any bitmap image very close up, highly magnified, you'll see that it is made up of discrete pixels; your digital camera has so many megapixels. A pixel is a uniform area of color and intensity.
So a picture is a 2D array of pixels. Each pixel has three intensity values: the red light intensity, the green light intensity, and the blue light intensity. These are the three primary colors; mixed in suitable proportions, they can give you any other color. An intensity is one byte, an unsigned byte, which records a level of light between 0 and 255. 0 is darkest, completely black, and 255 is the brightest: the brightest possible red, the brightest possible green, the brightest possible blue. This is called 24-bit color, because each primary color takes 8 bits, and R, G, and B concatenated form your color ID, if you will. So EasyBMP takes this apart and effectively makes the image into three matrices, which you can then manipulate.

The first application we'll look at is object identification, or its starting point, which is edge detection. Babies can identify hundreds of objects by age one, although they can't talk about them. And it's known that babies don't really get colors that easily; they don't reliably distinguish between red and green even when they're one year old, or even one and a half or two years old. They identify objects by means of boundaries. And what's a boundary? A boundary, or an edge, is where the color changes significantly between neighboring pixels. This is a working definition, not a perfect one. You could have textured, polished wood, where there are color changes; those are not edges. This, on the other hand, is an edge. In principle, you could have this wood color meeting, at an edge, a desk behind it of exactly the same color; you might have trouble seeing that there's an edge there, depending on the lighting situation. I can arrange to light them so that the edge is hardly visible. So a sudden change in color is only a clue; it's not the final arbiter of whether there's an edge or not. And there are very sophisticated computer vision algorithms, which I'll briefly mention at the end of this segment, that can detect in a very robust way whether a pixel is an edge pixel or not. But as a starting approximation, we can say that if one pixel is very different from the average of its neighbors, then it may be an edge pixel. That's one strong clue.

So that's object identification. How do we code this up? Throughout, we'll pretend that the image has been split up into three separate color matrices. And in today's very simplistic image processing algorithm, we'll deal with the three colors separately, logically. Most practical algorithms will exploit correlations between the colors, but roughly speaking, this is what we do in edge detection. Suppose I am at a pixel (i, j). I can find a weighted average of my neighboring pixels. I could use exactly the neighboring pixels, just distance one away, or pixels up to some distance width away, so that the total side of the square is twice width plus 1; in this picture, width is 2. You could take an unweighted average of the neighboring pixels, excluding yourself, or a weighted average where you fade out the weight of distant pixels. There are all these tricks that people play. And if the difference between the weighted average and yourself is large, you flag yourself as likely to be an edge pixel. Now, how do you flag yourself as an edge pixel? You can say: in a different image, I'll set my intensity to 255; whereas if the difference is not large, then I'm not likely to be an edge pixel, and I'll set my brightness to zero.
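Here is a compact, single-channel sketch of that scheme. The lecture's actual program uses EasyBMP and repeats this for all three color channels, but the structure is the same; the names here are mine.

```cpp
#include <cstdlib>
#include <iostream>
#include <vector>
using std::vector;

// The 'filter' idea: mark 255 (edge) if a pixel differs from its
// neighborhood average by more than 'threshold', else 0.
int filter(int pixel, int avg, int threshold) {
    return std::abs(pixel - avg) > threshold ? 255 : 0;
}

// One channel only, for brevity.
vector<vector<int>> detectEdges(const vector<vector<int>>& img,
                                int width, int threshold) {
    int H = (int)img.size(), W = (int)img[0].size();
    vector<vector<int>> out(H, vector<int>(W, 0));
    int neighbors = (2 * width + 1) * (2 * width + 1) - 1; // self excluded
    // Stay 'width' pixels away from the borders so the stencil fits.
    for (int i = width; i < H - width; ++i)
        for (int j = width; j < W - width; ++j) {
            int sum = 0;
            for (int di = -width; di <= width; ++di)
                for (int dj = -width; dj <= width; ++dj)
                    if (di != 0 || dj != 0)      // skip the pixel itself
                        sum += img[i + di][j + dj];
            out[i][j] = filter(img[i][j], sum / neighbors, threshold);
        }
    return out;
}

int main() {
    // A toy 8x8 image: dark on the left half, bright on the right.
    vector<vector<int>> img(8, vector<int>(8, 20));
    for (auto& row : img)
        for (int j = 4; j < 8; ++j) row[j] = 200;

    for (const auto& row : detectEdges(img, 1, 5)) {
        for (int v : row) std::cout << (v ? '#' : '.');
        std::cout << "\n";
    }
}
```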
That's one way in which you can record whether you found an edge pixel or not. So let's see how this works in code. We'll skip past filter for the moment. Here are the parameters to main. argv[0] is, of course, the name of the program itself. argv[1] is the input image file path. argv[2] is the output image file, which should contain only the edge pixels, highlighted. argv[3] is half the width of the window; for example, in this slide, width is 2, and the window goes from (i, j) out to i plus or minus 2, j plus or minus 2. And finally there's argv[4], which is the edge threshold. It says that if I differ from my neighborhood by more than threshold, then I'm likely to be an edge pixel. These are things you can tune, or find by statistical means. So how is threshold used? Given two pixel values and a threshold, if the absolute difference between the two values is larger than threshold, you return 255; otherwise you return 0. That's what filter does.

Now, I read in the arguments as I need them. argv[3] and argv[4] are converted from text to integers with atoi, which comes from the standard library. Then I load a BMP input image. Who implements BMP reading? It's implemented by EasyBMP.h. Just like matrixprinter.hpp, EasyBMP hides all the details of how the reading and writing of BMP images is done; you don't need to understand any of that. For now, you don't even need to understand this line; I'm just checking whether the file type is BMP or not. I read in the input image, ReadFromFile. Which file? argv[1]. And then I also create an output image. The size of the output image is set to the same size as the input image: the input image has two functions, TellWidth and TellHeight, and I pass their results into SetSize on the output image. I also set the bit depth of the output image to 24, signifying the three-times-eight color scheme.

But the real work starts here. The outer loops have i and j in them: I'm going over the pixels (i, j). Now, because I will be using this stencil pattern, i plus or minus width by j plus or minus width, I don't want to fall off the edge of the image. So I start the i scan only from width, so that i minus width is well defined, and I finish the scan at the image width minus width. Similarly, j goes not from 0 to TellHeight, but from width to TellHeight minus width, so that the stencil can move around without jumping off the edge of the image. Inside, I have another pair of loops over the deltas: di and dj go over minus width to plus width. That's what does the averaging within the neighborhood, and this is a case of simple unweighted averaging. The averages will be stored in variables called rpij, gpij, and bpij, for the red, green, and blue averages at (i, j). So I maintain three averages, and I start them off at 0. Then, inside, if di is 0 and dj is 0, that's myself, so I skip that particular setting: I don't include myself in the averaging. So di goes from minus 2 to plus 2, dj goes from minus 2 to plus 2, and when di and dj are both 0, I'm at (i, j), so I omit that. And then rpij += the input image's red pixel at (i + di, j + dj). EasyBMP, for any image, provides the three values: the red pixel intensity, the green pixel intensity, and the blue pixel intensity at a given pixel coordinate. The coordinate is (i + di, j + dj) for all of them. Is everyone comfortably tracking this? Not too fast? Similarly, gpij adds on the green pixel, and bpij adds on the blue pixel. After this, I have to average them.
So rpij is divided by the number of neighbors, gpij is divided by the number of neighbors, and similarly bpij. Now, in the output image, I set the red pixel to filter of myself, the average over my neighbors, and the threshold. We have already described what filter does: it compares your value with the average of your neighbors, and if the difference is more than threshold, it sets the output red pixel to 255; if the difference is small, it sets it to zero. And similarly, I do that with all three colors. And then I write the output image to a different file. How many people are clear, at least in principle, about what the code does? Of course, you are not familiar with EasyBMP, but you can look at that offline. The basic point of EasyBMP is: load an image file into RAM, where it becomes three matrices, and now I can access the red matrix, the green matrix, and the blue matrix at (i, j). That's all. And I can write it back and save it to a file. Comfortable with that?

So let's run this and see what happens. EasyBMP comes with a whole bunch of files; you don't need to worry about most of them. I've already compiled EasyBMP.cpp into EasyBMP.o; that's already done. So now I'm going to say g++ edge.cpp EasyBMP.o. You can also mix: I want to compile the .cpp now, but the .o is already prepared. And I say -o edge.exe. So edge.exe is now saved. And remember, edge.exe requires four input arguments: input file, output file, and the two parameters. I'm going to run edge.exe on an input image called turtle.bmp, and I'm going to save the edges to a new bitmap image called edges.bmp. I'm going to average with width equal to 3, so the side of the square is 7: 3 on the left, 3 on the right. And my threshold for deciding whether I'm on an edge or not will be 5: if my pixel value differs from the average of my neighbors by more than 5 out of 255, then I'm likely to be an edge pixel.

Before running this, let's look at what turtle.bmp looks like. I can do that with the eog command on Linux. So this is the turtle.bmp image. You can basically hardly see anything. There are some bad pixels from the projector; you can see a few white specks, but that's a defect in the projector. Let me douse the lights and zoom in. It's a turtle made out of towels, in a posh hotel or something. Now I run the edge command on turtle.bmp and save to edges.bmp with those stencils, and let's look at the edge file. So here is the set of edges found in the image. You can see an outline of the turtle appear. It's noisy, but at least it's clear that there's a turtle-like object here. And here is the superimposed original. The original is actually fairly hard to see, and yet the image processing algorithm correctly detected most of the prominent edges present in the image. Now, this was for threshold equal to 5, and you see that there are these specks over the back of the turtle, because the towel is actually a textured surface. You can try to kill that. So this is what happened with threshold 5. If I instead run this with a higher threshold of, say, 9, you see that the speckles have reduced a lot, but so have legitimate edges. So there's a trade-off between picking up legitimate edges and getting swamped by noisy, textured specks. And there's a lot of computer vision and graphics literature which tries to get the best of both worlds. For example, one of the important tricks used to denoise this image is to reconsider how you answer whether a given pixel is an edge pixel or not.
If you are on a speckled towel, the speckling pattern is random; whereas if this is actually an edge pixel, chances are high that two adjacent neighbors will also be edge pixels, because an edge is an edge precisely because it is linear in nature: it curves slowly at the scale of pixels. So you might say: I prefer patterns of edges that look like some of these; I prefer to flag my neighbors and myself as edges together, or not at all. In particular, if in the end you're going to say that neither of my neighbors is an edge pixel, but I am suddenly an edge pixel out of the blue, then by definition this can't be an edge. It's a speck. There are elaborate algorithms, called Markov random fields, which are used to smooth out these mistakes and flag pixels correctly as edge and non-edge. It's a very important area, because identifying edges accurately is the first step to identifying regions, and identifying regions is the first step toward identifying object boundaries. And that's sort of the beginning of a long path to computer vision, which is still not a solved problem by any means. But today, computer vision can work in real time on camera feeds: it can identify that one person is walking toward another person and has handed a package to the other person, and it can track this in real time. That, of course, requires a huge amount of computation. But one thing you will observe is that at the lowest level, these image processing applications are extremely regular in nature: they involve extremely regular, local processing of an array of pixels, a matrix of pixels. That's why in computer vision and graphics they build special-purpose processors which can do these kinds of matrix operations very, very fast, much faster than your general-purpose CPU can. Those are graphics chips: special chips used to drive your display or to process camera input, special-purpose hardware that can do these things much faster. Any questions so far?

So that's how you do edge detection. The second example is contrast enhancement. The template is, again, very similar; I'll define what balance is in a moment. Now, what is a poor-contrast image? We already saw an example of that: turtle.bmp is a poor-contrast image. You can hardly see what's going on in it. How do you find out what the distribution of pixel intensities is? You can pass the image through a tool called GIMP on Linux. GIMP takes a couple of seconds to fire up. GIMP shows the same image, but it can do various operations on it. For example, I can look at the levels of each color: this shows you what's called a color histogram. Now, suppose I want to look at the red histogram. That means GIMP is looking only at the red matrix. The x-axis is the intensity of a pixel, and the y-axis is the number of pixels with that intensity; this end is 0, that end is 255. What this tells you is that it's a darkish image in the red spectrum. There are very few pixels which are bright; most of the pixels have intensities between 0 and about, say, 180. The story is similar for the other colors. There's hardly any blue in the image: if I go to blue, you see that it's even more squished to the left. The blue value of most pixels is between 0 and 50, and there's essentially no blue pixel which is brighter. Similarly, there isn't much green in the image, though it's somewhere in the middle; there's a little more green, but most green pixels are between 0 and 100 in intensity.
And there are hardly any brighter green pixels. When I say "value," that's an aggregate of the three; it also shows that it's a dark image. So one of the things you could do is try to correct it through GIMP, and you do that through the Curves menu. Here again is that same histogram, shown a little stretched: most pixels are dark, few pixels are bright. That axis is the pixel intensity; that is the density of pixels with that intensity. This line is called the transfer function. Right now the transfer function is linear: what you see is exactly what was in the file. Suppose I decide that there are so few high-intensity pixels that I'm going to give the transfer function a slope of 2 instead of a slope of 1, and any pixel which is at least, say, 128, any pixel that bright, I'm going to push to 255. How do you do that? I can shift the top end of the line to the left. The transfer function has now changed so that its gradient is more than 1. So a pixel which earlier was at intensity 50 would show up as 50; now it may show up as 100. A pixel with intensity 128 will now saturate to 255, and any pixel brighter than that will be clipped at 255. You can see that the image has already improved a fair bit. You can also make the transfer function nonlinear. GIMP is much more sophisticated than the simple 60-line code we will write; in particular, you can make your transfer function nonlinear, thus improving contrast further. That's a way to improve the contrast of the image. Today, we'll see a very elementary form of that through our C++ code.

GIMP also has a way to automatically correct the image. This is the auto correction: if you click Auto, that's what GIMP does. It's much better than us at automatically finding a good transfer function. This now looks sort of like a towel, although the projector is messing it up a little bit. The one I saved is called gimpturtle.bmp; that's what GIMP did for us, after GIMP did its color correction. I want to show another aspect of gimpturtle. Suppose you want to check the color histogram of this corrected image. We go to Levels again, and this time what do you see? You see that the color histograms are much more level. What GIMP has done is stretch out the concentrated color histogram into something that covers the whole span between 0 and 255 roughly uniformly. And that's true for most of the spectrum: green is equally smeared out, and so is blue.

So think of it like exam percentiles. The absolute marks, depending on how hard an exam was in a given year, may be bunched around a few values; but by reporting the result of the exam as a percentile, we stretch out the spectrum, so that only the rank matters, not the absolute marks you got. What GIMP is doing is basically finding the rank of the brightness of a pixel, and then smearing the intensities evenly over the ranks. Fine? That is exactly what the routine balance does. It inputs the color histogram, and it creates a transfer function, which is that line. It automatically warps the line so that after the warping, the distribution of "marks" becomes uniform between 0 and 255. Is everyone clear, at least in principle, about what balance is doing? I don't have time to go into balance in detail in class; you can read the function offline.
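Since we won't walk through balance line by line, here is a minimal sketch of histogram equalization, the standard technique that this description matches. The names and details are mine, not the lecture's actual code, and it assumes a non-empty image.

```cpp
#include <iostream>
#include <vector>
using std::vector;

// Minimal histogram equalization: hist[v] = number of pixels with
// intensity v (0..255).  Returns transfer[v] = new intensity for v,
// chosen so that intensities spread uniformly over 0..255 by rank.
vector<int> balance(const vector<long>& hist) {
    long total = 0;                     // total pixel count (assumed > 0)
    for (long h : hist) total += h;

    vector<int> transfer(256);
    long running = 0;                   // pixels with intensity <= v so far
    for (int v = 0; v < 256; ++v) {
        running += hist[v];
        // New intensity = percentile of v, scaled to the 0..255 range.
        transfer[v] = (int)((255 * running) / total);
    }
    return transfer;
}

int main() {
    // A low-contrast channel: all mass bunched in intensities 40..49.
    vector<long> hist(256, 0);
    for (int v = 40; v < 50; ++v) hist[v] = 100;

    vector<int> t = balance(hist);
    std::cout << "40 -> " << t[40] << ", 49 -> " << t[49] << "\n";
    // The bunched band gets stretched across nearly the whole 0..255 span.
}
```

The second pass of the program then simply replaces every pixel intensity v with transfer[v].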
Balance just transforms the marks, through that curve we made by hand, so that after the transformation, people's marks are uniformly distributed between 0 and 255, in all three colors. Other than that, it's a very simple routine. I again open up a BMP image and read it from a file. In the first pass, I just build color histograms. What's a color histogram? It's a vector of integers, whose domain, the indices, ranges between 0 and 255. (The EasyBMP byte type is a byte; they defined it on their own because on different architectures it could be different. It's just a value between 0 and 255.) And each entry of the histogram is a count of how many pixels had that brightness, that intensity. So I read the red pixel at (i, j), I use its value, which is between 0 and 255, as an index into the red histogram, and I increment that entry, to record that the number of pixels with this intensity has gone up by one, because I just saw such a pixel. Similarly with the green histogram and the blue histogram. It's like counting how many people got 43 out of 100 in the class.

Once I have formed the color histograms, I actually do the balancing. Balance takes a histogram as input and outputs a transfer function. That's exactly what GIMP was doing: it showed you the histogram, which was initially compressed to the left, and it also showed you that transfer function, which you could warp by hand. This one is automated: the balance routine inputs a histogram and outputs what it thinks is a good transfer function from input pixel intensities to output pixel intensities. And finally, in the second pass, now that the transfer function has been calculated, I just scan over the image again, the nested i and j loops. I read the original red pixel, transform it through the transfer function, redTransfer, and write the transformed intensity back to the red pixel. So I correct the intensity, and I write it back into the same place. These red, green, and blue pixels at (i, j) behave exactly like boost matrices; there's no difference. Once the values are written back, I can save the transformed image to a different file. The input to main is just an input file and an output file; there are no tuning parameters this time.

So let's compile this one. In this case, I'm going to compile enhance.cpp together with EasyBMP.o and put the output in enhance.exe. So it's compiled. Now I run enhance.exe: the input is again turtle.bmp, and the output is enhance.bmp. It's pretty fast. Now if I do eog on enhance.bmp and turtle.bmp, let's see how they compare. This is the enhanced image; that is the original image. Yes? (A question about where the transfer function is defined.) It's defined at the end of the same file, so you can check what I do; there are some small formulas that I don't want to get into right now. Is this fine? So balance turns the color histogram into that transfer function, and then, in the second loop, I transform pixel intensities from their original values, which are low contrast, to the final values through the transfer functions redTransfer, greenTransfer, and blueTransfer. And the result, as you can see, is this image, which has far better contrast than the original image. In fact, if I use GIMP to load up our home-grown enhancement, enhance.bmp, you'll see that we're not doing badly at all compared to GIMP's own professional work. If you go to Levels, you'll again see that there is some concentration in the values, but overall it's not too bad: I have smeared out the range compared to the input image.
Again, there is no single algorithm for doing the best enhancement; you can enhance images with lots of algorithms. It's also a very, very active area of research, motivated strongly by medical imaging. An image can be captured under less than ideal circumstances because the scene lighting was not good, as in that turtle scene, or because the camera's CCDs, the charge-coupled devices that sense the light, were not calibrated with regard to the room lighting. Or it could be because you're doing medical imaging, and the patient's tissue density and the water content in the tissues are not ideal for the X-ray or MRI imaging that you're doing. Some of you might know that for some MRI and fMRI imaging, you have to inject the patient with a contrast agent, so that the agent diffuses into their arteries, veins, and tissues and then deflects or absorbs the radiation. That leads to better-contrast images; it's called a contrast injection. Even after this, it is vital to use statistical techniques to correct the image. A surgeon looking at the left image, which may be the native image as captured by the X-ray machine, may not be able to pinpoint the tumor that they are about to operate on; whereas with good statistical techniques for contrast enhancement, you may get the image on the right-hand side, which correctly reveals the tumors or other growths the surgeon is working on. So in medical imaging, image enhancement is a life-and-death job. It's really important, and it's a multi-billion dollar business. There are companies like Siemens and General Electric who spend billions of dollars designing their MRI machines and then writing algorithms to do contrast enhancement in exactly the right way to help surgeons. So today we saw just the beginnings of how to do image processing; there's a long way to go. But that's another example of where 2D matrices are very useful.

So that finishes separate compilation itself. Now we'll start looking into how to use make to control these projects. Before you know it, you will have split your big project into hundreds or thousands of files. And so far, I was typing by hand: g++, this .cpp file, that .o file, write the output to some other file. By the time your project has even a few dozen files, and you have changed one file, this becomes unmanageable. So here's a picture of what depends on what. Remember, I wrote this library called matrixprinter, with matrixprinter.hpp and matrixprinter.cpp to go with it. matrixprinter.cpp and matrixprinter.hpp were used to generate matrixprinter.o. This is the library file, which you can then link with gaussian.o to form gaussian.exe. (Earlier we were using caller1.o and caller1.exe, but it's the same story.) Now, in this setting, suppose I have modified gaussian.cpp. You should not need to recompile matrixprinter.cpp. Conversely, if I have updated matrixprinter.cpp to do prettier printing, I don't need to recompile gaussian.o, but the matrixprinter path has to be redone. So there's this abstract notion of a directed acyclic graph of dependencies. This arrow tells you that gaussian.exe depends on gaussian.o and matrixprinter.o; any time either of those files changes, I should rebuild gaussian.exe. In general, any time any node in this graph changes, I have to regenerate everything reachable from that node, right up to the goal, and here the goal is gaussian.exe. So this graph is codified into a file called a makefile.
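Anticipating what we are about to dictate, here is a sketch of that makefile for the caller1 example. (One detail worth knowing: each action line must begin with a tab character.)

```make
# Root pseudo-target: the job is done when caller1.exe exists and is fresh.
all: caller1.exe

# Re-link if either object file changes.
caller1.exe: caller1.o matrixprinter.o
	g++ -o caller1.exe caller1.o matrixprinter.o

# caller1.o depends on the header too: if the interface of
# CS101_print changes, caller1.cpp must be recompiled.
caller1.o: caller1.cpp matrixprinter.hpp
	g++ -c caller1.cpp

matrixprinter.o: matrixprinter.cpp matrixprinter.hpp
	g++ -c matrixprinter.cpp
```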
The system itself provides no consistency check between the .cpp file and the .o file or the a.out file. If you updated matrixprinter.cpp, the system will not interfere if you try to run a stale gaussian.exe. You need a utility like make to enforce that consistency. So let's set this up and see how it works. In my directory, there's a lot of junk, so let me start removing some of it: caller1.exe, caller1.o, all the tilde files, in fact all the .o files. So I'm left with only matrixprinter.cpp, matrixprinter.hpp, and caller1.cpp. Those are the sources. Now let's try to express what my dependencies are. The file that expresses them is called a makefile, so let's start writing one.

The first rule you write in a makefile is for the root node. The root node is gaussian.exe; in our case, caller1.exe. So you say that the root task is called all. Just as main is the conventional entry point of your C++ code, the task to finish in a makefile is all. You say: you have done all, you have finished your job, if you can give me caller1.exe. The target all is a pseudo-target; there's no file called all. But you declare that you have finished your job when you have produced caller1.exe. caller1.exe itself depends on caller1.o and matrixprinter.o. The target is on the left; what the target depends on is on the right. If any of the right-hand side files changes, this rule fires, and the action on the next line is executed. The action, if you remember, is g++ -o caller1.exe, this is my output file, followed by the two object files I want to link. So, to summarize: all is the root target in that graph, caller1.exe. caller1.exe depends on caller1.o and matrixprinter.o. If either of them is modified, run the command g++ -o caller1.exe with those two object files.

Now, caller1.o itself depends on caller1.cpp, but it also depends on matrixprinter.hpp: if I modify the interface of CS101 print, caller1.o will be affected. And in this case, the action to execute is just g++ -c caller1.cpp; the #include pulls in the header for us, so the header doesn't appear in the command. Similarly, the compilation of the library itself looks like: matrixprinter.o depends on matrixprinter.cpp and matrixprinter.hpp, and the work to do, in case either of those files changes, is to recompile matrixprinter.cpp. So this is my makefile in the end. It says: to finish the job, create something called caller1.exe; caller1.exe needs to be created or recreated if any of these files change; and if any of those files change, execute this command. This is, literally, an encoding of that graph structure, together with the commands required to regenerate each of the nodes.

So now, if I go back and run make: make, by default, picks up the file named makefile; it's a specially named file that make looks for. So if I say make, it will look for the makefile. If a makefile exists, make starts from the all task and checks whether the all target is fresh or not. This time, it is not fresh; nothing has been built. So when I run make, it first compiles caller1.cpp, then compiles matrixprinter.cpp, and finally generates the executable file. What's in my directory now? See: caller1.o has been generated, matrixprinter.o has been generated, and caller1.exe has been generated. If I run make again now, it will say there is nothing to be done, because it has found that all the dependencies are satisfied by the most recent versions of the files. How does make know?
Suppose I now go into matrixprinter.cpp and change the message, say, to "hello world". The .hpp file hasn't changed, so the interface hasn't changed, so caller1.cpp should not need to be recompiled. Only matrixprinter.o has to be regenerated, and then caller1.exe. Let's see if that works out correctly. If I run make now: see, it understands, from the timestamp on the file, that matrixprinter.cpp was modified. It recompiles matrixprinter.cpp, and that triggers the recreation of caller1.exe. So this is a huge benefit, because you don't need to keep in your head which files you updated and what they affect. You encode it once in the makefile, and thereafter, whenever you change any file, the correct thing is done. Next time we'll see more examples of that, and then we'll go on to recursion.