So, welcome to Computer Programming and Utilization, CS 101. I am Soman Chakraborty, and this course is being offered in cooperation with a lot of helpful TAs and staff from Computer Science. First things first: that's the lifeline, the website inside IIT. From there you can go to a number of other places which are all important for the running of the course. This is what the page looked like a few hours back. You can find the instructor contact, which is not my personal email, and the places and times for lectures, labs and tutorials. There are two sections, and there is a typo there: it is LCH32, not LHP.
As you might know, all the students in this semester are divided into two sections. This is section 6; section 10 happened yesterday. Section 10 is taught Tuesdays and Fridays, section 6 is taught Wednesdays and Fridays. And there will be four laboratory batches, one for every evening, Tuesdays through Fridays. If you are here today and you belong to section 6, you should take your roll number and find all the digits in it. Drop any Ds and other characters, take only the digits 0 through 9, and add them up. If the result is even, then come to today's evening lab at 8.30 in the old Computer Science Department, in the Math Department basement. If the result is odd, then you should go on Friday. So that's the algorithm.
Then we have tutorial slots, one per week for every section. This week there won't be any tutorials because you are just getting started. The tutorials will begin from next week, and we'll see how much interest there is and what the attendance is, and we'll decide on the volume and frequency accordingly. On Moodle you can find announcements and discussions about the problems, assignments to be worked out on paper, lab assignment specifications including today's lab assignment, and also all the solved problems turned in by the students, projects and so on.
On Google Docs we will keep the students' section and lab batch information, the list of senior and junior TAs with their contacts, and also student evaluation, meaning your marks and points in projects, homeworks, etc. On the static website with that URL you'll find lecture notes and slides, exams and solutions after they happen, and other resources like textbooks, slides from other courses, links to similar courses around the world and so on, as we go along. So that is the course homepage. On the homepage there is a survey link, which has already appeared although you don't see it in this version of the course page. The survey tells us about your background in handling computers and programming, if any. You should fill out that survey between now and the time that you do your lab session tonight. It's preferable that all of you finish that survey within a day or two, so that we can get to know your background better and tune the course accordingly. You can also find out your lab batch by the algorithm I described; it's also given on the course page. And once you know where and when you have to be, then 80% of success is just showing up there at the right time.
So here's the Gantt chart of sorts. Lectures: two sections. Slot 10 is Tuesday and Friday, 3.30; slot 6 is Wednesday and Friday, 11 to 12.30, and that's where we are. Labs are in the old software lab, that's in the basement, that's the first floor of the math department building. Four evenings a week, that's Tuesday through Friday, plus one make-up on Monday evening. So if you have lost some ground or you'd like to do something extra, you can land up there on Monday evening, and there'll be someone to let you in. But there won't be any structured guidance or anything like that. So that's Monday. Then, of the people who do their lectures on Tuesdays, half will go to the lab on Tuesday and half will go on Thursday.
And of the people who do their lectures on Wednesdays and Fridays, that's you, half of you will come to lab today and half of you will go to lab on Fridays. That's the plan. The rough strength is about 160 in each lecture section and then 80 in each lab section. So if the algorithm results in a substantial imbalance, we might need to push you around a little bit, but I don't think so.
So here's the management, the organization chart. The instructor's email address is not my personal email address. Please use this one to get uniform responses, because if you flood my standard mailbox with emails from 350 students, I just won't know where to start. So this is a better choice. I'll be assisted by senior TAs who will manage the website and the Moodle assignments. And then there'll be one senior TA designated for each day of the week. So this slide is a little wrong. The make-up lab is Thursday. And then under every senior TA there'll be a bunch of junior TAs. You will have your own junior TA. So maybe approximately seven to nine of you will be assigned to one junior TA, and then, depending on the numbers, maybe there'll be seven to eight junior TAs reporting to a senior TA for every day of the week.
We'll have one midterm and maybe a few quizzes that are indistinguishable from each other. So two to three quizzes in all, amounting to 25 to 35% of the total credit. The final exam will be between 30 and 40%. Some lab sessions or exercises will be graded, amounting to about 15 to 20%. And if we do a project and viva, it'll be worth about 5 to 10%.
The challenge in a course like this is that you come from very diverse computing backgrounds. Some of you may not have even seen or touched a computer. Some of you have already coded in C++, and you might run the risk of getting terribly bored. So the difficulty is in planning a course that keeps most of you reasonably employed most of the time. We'll see what we can do.
So just to get started, most of you have seen a motherboard before, right? This is the second semester in IIT, so you must have seen a motherboard somewhere. As you can see, this is the CPU. It gets pretty hot, so it has a cooling fan. It has aluminum fins through which the air runs and keeps it cool. The other prominent thing you see is these cards with lots of chips on them. Those are fast electronic memory, also called random access memory or RAM. And those are connected to the CPU with a lot of wires, maybe up to 32 or 64 wires, through which data can flow very, very fast. You also see these plugs here for connecting cables to magnetic disks to store more data.
Zooming out, there are these data cables which go from the motherboard to hard disks. If you open up a hard disk, this is what you see. It has a platter, something that looks like a CD but is made of metal with a magnetic coating on it. On top of that, there is a magnetic arm called a read-write head. It's sort of like an old record player's arm, assuming you've seen any of those. And data is organized in circular tracks. So there are concentric tracks, not one spiral, but different circular tracks. If you want to read a data item on the disk, the head has to swing in or out over the radius of the disk, like so. Then the disk has to spin to bring the right data item under the head. And then the head can read or write that bit or byte of data.
Typically the capacity of a hard disk is very large compared to main memory. And a hard disk is also much slower than main memory. Main memory is entirely electronic. All you have to say is, okay, here is an address. Main memory is organized in locations, so location number 35 has some integer in it. So here's location number 35, tell me what is there. And that entire thing is done electronically. It's called random access because you can give it an arbitrary address and read or write the contents of that address.
Disks, on the other hand, are not random access, because depending on what location you want to read, the head has to position itself mechanically, and the platter has to turn to position that bit under the head. So typically there is a lot of mechanical latency involved in reading or writing a bit on disk. To give you a rough idea, the CPU nowadays clocks at something like two gigahertz, and you can do elementary operations like adding two integers in pretty much one, two or three cycles. If you have to wait for data to go from the random access memory to the CPU, that may easily take between 10 and 20 cycles. But that's still in the order of gigahertz, billions of operations per second. Whereas to position the head on a particular part of the disk and read it may take up to 12 milliseconds. During that time, the CPU can do a huge amount of work. And so software systems which work on these computers have to be designed very cleverly, so that, given the differential speeds of the main memory, the hard disk and the CPU itself, the central processing unit, they don't keep each other waiting. Each should proceed on work which can be cleared off, while waiting for the others as little as possible.
You also see these connectors at the back of the motherboard where we connect something like a monitor or a display, a keyboard, a mouse and so on. Those are input-output devices. You communicate with the computer through these peripheral devices, which are shown at the top of the screen. Most of us are pretty familiar with this layout. But that's a physical layout. Can we build some sort of a logical abstraction of what a computer should look like to a program? So here's a simplified abstract view, and I'm hiding a lot of detail here. To the left is the central processing unit or the CPU. The CPU has many parts; I'll hide most of them and just tell you the essentials for today.
There is the arithmetic and logic unit or the ALU. As the name implies, it's in charge of doing arithmetic operations and logical operations. So what's an arithmetic operation? Adding two integers, multiplying two integers. What's a logical operation? Taking two Boolean variables and finding their AND or OR or XOR or something like that. Checking a condition to decide how to proceed next. The ALU is very, very fast; it's the fastest part of the whole system. It is assisted in storing data for those operations by a bank of registers. A register is a fixed-width block of bits. A bit is 0 or 1.
Now in daily life, when we communicate numbers to each other, we use the base 10 system. And you might wonder why computers don't do the same. If you wanted to use electronic circuits to represent digits between 0 and 9, then each digit would have to be represented by some voltage level. To represent 10 different voltage levels for 0 through 9, maybe we need up to 9 volts, 0 through 9 volts. The problem with that is that distinguishing between voltage levels that are very close together gets more and more difficult for the receiving end of the circuit. In particular, if you want to reduce power consumption by the computer, which is always a nice goal, you want to reduce the operating voltage. In fact, the CPU core works at something like 3.3 volts, and the RAM itself gets 1.8 volts. At those tiny voltages, it's very easy to mess up your bits because someone switched on a vacuum cleaner, or an AC just turned on, and so on. The surge in the line can leak through the wires into the circuitry and make a 7 look like an 8 or a 6. To avoid that, you need the logic levels to be as separated as possible. And naturally, it's easiest to say that there are only two logic levels: it's either 0 or it's 1. And we create a substantial voltage difference between them.
Now as you might know, the capacity of random access memory has also been growing steeply. Less than a decade back, you'd be lucky to have a personal computer with 128 megabytes of RAM. Today, it's very common to find personal computers with 8 gigabytes of RAM, and even laptops with 4 gigabytes of RAM. Part of the way that's done is by compacting the circuit to make it smaller and smaller, so that it takes less and less power and area on the chip. As a result, today a bit in a computer memory, a zero-one state, is stored with the help of only a few hundred electrons. It's very easy to disturb a few hundred electrons in the universe. So it's best if we keep the logic levels widely separated. And that's what led to the evolution of electronic computers toward the zero-one binary system.
So these registers are banks of bits, which may be in arbitrary configurations. And as we shall see in the course, integers, floating point numbers, lists, arrays, everything has to be encoded on top of this bit vector. The arithmetic and logic unit communicates very fast with this bank of registers. But the number of registers is very, very small. They're typically, say, 64 bits wide, and you may have 32 or 64 of those, that's all. That's the fastest memory you have. If you need anything outside that, then the CPU has to emit an address to the random access memory. Random access memory itself is another giant 2D matrix of bits. It is 8 bits wide, and that's called a byte. And toward the bottom, it can go up to gigabytes. Addresses are nothing but row indexes. So there's location 0, location 1, and it goes on like that. When the CPU sends out an address, it's basically just a row index. And it can either tell the random access memory to write that particular location with a value taken from a register, or it can read the value of that location or row and then store it into a register.
So it's a bi-directional flow of information between these RAM locations and the register bank. Now, part of the random access memory or RAM is reserved logically for the display, and part of it is reserved for input devices like the keyboard. When you hit a key on the keyboard, through some magic, that results in overwriting some memory location that is designated for the keyboard. Similarly, if you want to print something to the screen, some part of the memory that's reserved for the display is written, and the display electronics picks it up and shows it on the LCD panel. Now it's actually more complicated than that. You can draw pictures, have graphics and games and so on. But we'll get into all those complexities a little later. For now, it's enough to say that there are parts of RAM which are reserved for input and output, and that's how things happen.
Apart from all this, to do anything useful, like say add two numbers from RAM and store the result back to some other RAM location, you need to write a recipe. It's like cooking, like baking a cake. You have to mix flour, you have to put in yeast, you have to put in fruits, dried fruits, and then you have to put it in an oven and wait for 20 minutes or something, right? That's the recipe. That recipe is a program or an algorithm. And that program, which tells the CPU what to do and how to do it, turns out to be stored in RAM itself as well. So the CPU can read the recipe from RAM and then execute it on other parts of the RAM to get your results. That's a rough idea of what a computer looks like in the abstract view.
Now, the rules of the game are as follows. A register or a RAM location can store exactly one value at a time. If the register is 64 bits wide, each bit can store only 0 or 1, and that pattern will be overwritten if you write something else to it. So writing into a register or RAM location, or reading into it from the keyboard, destroys the earlier value.
The earlier value is not there anymore; it's volatile. The earlier value is gone. Whereas outputting a register or RAM location to the display is done by copying it into some memory reserved for the display, so nothing happens to the source. So you can copy, and you can overwrite. The other approximate assumption we'll make is that accessing any RAM location is equally fast. Strictly speaking, that is not true. In fact, as I said before, RAM is slower than the CPU. And in fact, these RAMs are what are called dynamic random access memory, which means that those few hundred electrons actually leak away unless they're refreshed regularly. So there is circuitry which reads the RAM and writes it back, so that those electrons are not lost. You have to pump in a little bit of charge so many million times every second, so that the memory stays in whatever state it was. Otherwise, as you know, when you turn off power that computer's memory is totally lost. Now the refresh goes on its own schedule, and the CPU may want some data from the RAM. You can't interfere with the refresh. If the refresh is delayed, then the memory will disappear. So refresh has priority, and the CPU has to wait until the refresh is done, and then the location can be read. So RAM turns out to be quite slow for modern CPUs.
Something that we won't talk about today is that between the CPU and the RAM itself are layers of what is called cache, C-A-C-H-E. Caches are also volatile memory; they are faster than RAM, but still slightly slower than the CPU. They have intermediate sizes, and they kind of buffer the delay that the RAM would otherwise put in the way of the CPU. But we won't talk about that. As an abstract clean view of the machine, this is good enough to get started. For now, we can assume that accessing any RAM location is equally fast. Now, here is a sample recipe that has to go into the CPU to get the work done. So here is a program or a recipe.
Suppose I want to convert a temperature given in Fahrenheit to a temperature in centigrade. We all know the conversion formula, which is given in the first line. Now, because it's painful to keep on saying register 5 or RAM location 43, I'll just give them symbolic names like r0 for register 0, and symbolic names for memory locations. So the first statement would be: load input f from the keyboard into register 0, say. Then load the constant number 32, decimal, into register 1, r1. Store r0 minus r1 in r2, load 9 in r3, divide r2 by r3 and store in r4. That gives you the right hand side. Then load the constant 5 in r5, multiply r5 by r4 and store in r6, and finally output r6 to the display. That would be the very low level coding that the CPU can understand directly. It's really spoon-feeding it into the CPU, because the CPU cannot understand high level languages. This is called assembly language programming. And as you might imagine, it's really tedious. For a very simple linear formula like this, you have to write 1, 2, 3, 4, 5, 6, 7, 8 lines of instructions. By the time you're writing a computational fluid dynamics code, it will get into trillions of lines. We can't afford that. So this is painful.
Ideally, we'd like to say something like: input f from keyboard, set c to 5 times f minus 32 over 9, and output c to display. That may be the most we'd like to do. The programmer need not keep a mental map of which register and which RAM location contain the values of whatever symbolic variables the programmer has in mind. And there should be something like a compiler or an interpreter which will automatically translate code which looks like that into the manipulations involving registers and RAM locations that we saw in the previous slide. In this class, we'll of course not look into how the compiler turns high level code into low level code. That's the job of computer scientists.
We will see how to write efficient and beautiful code in high level programming languages to do specific tasks. Even in high level programming languages, coding is a difficult job. Coding is both a science and an art. You need to do the right thing, but you need to do the right thing in a clean and beautiful way so that other people can read and understand it. There is this thing called write-once code, which no one reads. You sometimes do that to finish a project quickly and get some numbers. But any code which you expect any other person to ever read should be written with some care. We'll see examples of that throughout the semester.
So how do you say it in C++ specifically, as a specific high level language? You have to declare what's called a function or a procedure. By default, the first procedure where control is transferred when you execute your program is called main. In this case, main is very simple. There is no input argument and no output. So it's just main with brackets, saying: nothing is input, and I don't give you back anything. Then you start the body of the procedure within curly brackets. The first thing you do is declare a variable. The variable declaration has a type and a variable name. The type is float, which means it's a floating point or a real number with a given precision, which we'll talk about next. So Fahrenheit is a variable name, and it's declared to have the type float. Execution goes line by line, roughly, unless you change the flow of execution, which we'll discuss two lectures from now. In the next step, what you do is take input from the keyboard, which is called console in or cin, into the variable called Fahrenheit. This operator, which is two greater-than signs, reads some input from cin and sends it into Fahrenheit. If you have given a legal value on the keyboard, then after this statement is executed, Fahrenheit contains that number. The third statement does the real work.
It declares a variable called centigrade, which is also a floating point number, and also initializes it to whatever expression we had in mind. This is an arithmetic expression, and a very simple one. You can have incredibly complicated arithmetic expressions, logical expressions, all kinds of things. At the end of this, centigrade is, hopefully, set to the correct value. And then finally, cout or console out is a logical name for the screen, and centigrade is now sent out to cout. Internally, that translates into reading the register corresponding to centigrade and pushing it out into the location corresponding to the display. Of course, I've omitted a lot of detail here. In fact, this will not even compile as I've stated it. We'll see that once we start demoing things live in this lecture.
You have to save this source code. It's called source code because you're writing it to a text file, typically with a .cc or a .cpp extension. Then you compile that source code to what's called an executable file, which can directly run. And then you call the executable from what's called a shell command line. When you log into a Unix system, you get what's called a shell, by default bash. That gives you a command line where you can type a command, and it executes. You basically say, OK, execute my compiled program now. It waits on the keyboard. You enter the Fahrenheit value. It outputs the centigrade value, and it finishes. That's how you say it in C++.
Now, to make all this work, there's actually an incredibly complex stack of cooperation between hardware and software. At the lowest level of the stack, we've already seen what exists. There is the motherboard with its CPU and RAM and hard disk. And then there's the monitor and the keyboard. The hard disk in particular stores that source file, the C compiler or the C++ compiler itself, the executable file that you are going to run, and many other things, including the operating system itself before it loads.
On top of this hardware, managing hardware resources is done by the operating system. In our case, it will be Linux, but most of you have experience with Windows. The operating system is in charge of managing all that hardware. As I was saying before, the CPU is faster than the memory, which is faster than disk, and the operating system actually manages and juggles hundreds of processes or independent tasks on the system. Some of the tasks may be monitoring the state of a printer to see if it's free or out of paper. Another task may be monitoring the network connection through your Ethernet cable. Another task is managing the display, making sure it's refreshed, windows are showing in the right order, and so on. All these are managed independently. Each task is isolated from other tasks to the extent necessary. They each operate within their own address space. Each thinks it has its own block of memory. The operating system multiplexes and reuses the RAM between those tasks. When one task is busy waiting for disk, the operating system pulls in another task and does some work on that. When the disk is ready with the data, it swaps out the intermediate task and pulls back the original task so it can make progress. It makes sure that tasks are not waiting on each other and getting deadlocked, and so on and so forth. So the operating system is typically the most complex piece of software running on a computer.
On top of that, when you log in, you get what's called the shell command line or bash. Bash does a few utility things. You typically have what's called a home directory on that disk. When you log in, the shell assumes that you are in that place on disk. Whatever command you issue, if you create a file, if you delete a file, all that is interpreted with respect to your home directory unless you move off somewhere. When you move off somewhere and issue more commands, the shell remembers that.
When you write a file, it's through the shell that you access the directory path, and then you save things there. And on top of the shell, you have your C++ executable program running. When you say convert centigrade to Fahrenheit or vice versa, your program is actually invoked by the shell. Then your program is given permission to interact with the operating system, and through that with the physical resources. So that's how the logical layout looks once your program is ready to run. Any questions so far?
If not, let me highlight that coding boring text console programs to read a number and output a number is by far not the limit of computing. And you already know that. We can also do visual programming with turtle graphics. So what's that? There's this turtle who is holding a pen. The pen initially touches paper. And the turtle can do the following things. It can turn through an angle. It can move some distance in a straight line. And it can move the pen up and down. If it lifts the pen and then moves, it doesn't mark the paper anymore. If it lowers the pen, then any move is marked on the paper. With that set of rules for the game, look at this program here. I tell the turtle to move forward by 100 units, linear units, in whatever direction it was facing initially; suppose it's facing east. So that's what the pen traces on the floor. Then I tell the turtle to turn left, or anti-clockwise, through 90 degrees, and go forward again by another 100 units. Then I tell the turtle to turn left again by 90 degrees, go another 100, and repeat a fourth time. So what happens is that it traces a square. This is, of course, an elementary mechanism. But you realize that by building on top of primitives like this, you can draw arbitrary polygons. You can shade with color. You can smooth colors. You can do animation by switching these things in rapidly, one after the other. So you can do graphics programming. You can draw windows.
You can draw buttons. You can take user input in the form of clicks. You can do everything. So there's an elaborate sort of recipe or path from just understanding cin and cout and a few floating point numbers to building whole operating systems, to actually coding user interfaces, and basically writing all the software that drives the world today. But in this particular case, to draw a square, we have to give four instructions repetitively. This is repetitive and boring. And so in fact, you can use blocks and loops. If you want to draw an arbitrary regular polygon, then the input would be the number of sides in the polygon. And the recipe would be: repeat num sides times moving through that fixed length of 100 and then turning anticlockwise through 360 divided by num sides degrees. We assume here, to start with, that 360 can be divided uniformly, without a remainder, by num sides. If not, we'll get some weird shape. What will we get if 360 cannot be divided by num sides evenly? Anyone? What will happen? Can you guess what the picture will look like? At the end, it will not meet up. Generally speaking, you might regard this as an integer division of 360 by num sides, and so something funny will happen. I'll leave you to try it out.
But let's see if I can give you a quick demo of what you can do with turtle graphics. I have this program called square, named with a .exe extension, which draws that square. If you run that, that's what it looks like. That's the turtle turning and tracing the line. So that's just four lines. That doesn't look too exciting. What does the code for that look like? The code for square is in square.cpp; that's the turtle graphics program. It looks exactly like what I said. You have to do some more initial setup and tear down. So you have to say turtle simulator, or turtle sim, then forward, then a wait of one, which prevents the fun from getting over too quickly, and then turn left and go forward, and so on, four times.
And finally, it waits for five seconds and then closes the window. So that's all fine. This seems to be too much work to do something trivial. I won't show the code for Hilbert right now, but once you master function calls and procedures and recursive function calls, you can do far more exciting things like, say, drawing Hilbert curves on the screen. This is actually as easy as drawing a square. You might not believe that. But once you see the code, you'll understand that it's really simple. It's less than a dozen lines of code with which you can draw Hilbert curves of arbitrary dimensionality. It doesn't forget what to do. It knows at every junction exactly which way to turn, almost by magic. You can keep going on that, but I'll stop it for the moment. So that's sort of the power of simple and elegant coding. You can write very complicated things very, very easily.
So let's get into how C++ works in some more detail. Earlier, we saw that we have to declare this procedure called main, which is the default procedure where control is transferred when your program starts up. Then there was a data type called float. There was a variable declaration. There was an arithmetic expression. So let's go through these issues one by one. What is a procedure? What are data types? Where did cin and cout come from? And the rules of writing arithmetic expressions. We'll look into each of these in more detail in subsequent lectures. But for the first lecture, let's see how this works. Let me take a text file and actually try something much simpler. So here is a very first C++ program. You declare main. You tell console out to take the string hello world. And at the end, there is this escape character. I'll come back to this in a moment. Let me drop this. You just say print hello world to the screen and then go away and do nothing. In fact, why not even drop this? This is like simplification by amputation. There's nothing to do.
So that's my code, called first.cc. How do you compile it? You say g++ first.cc. So g++ digests that source. This is OK. There's nothing to do. I'm pretty happy. g++ creates this file called a.out. What does a.out look like? ls -l lists the file and tells you some properties of it. So there's this file called a.out, and it's a file which is 4,845 bytes long, after compiling a file which was 11 characters long. That's a lot of bytes of executable to do nothing. But by the time we write longer programs, we'll be rewarded. Now if I run this, and we run it by saying: in the current directory, dot, run a.out. What happens? Well, precisely nothing. You told the program to do nothing. So nothing is what you get.
Now let's try to print out hello world. Every statement in C++ ends with a semicolon. It may take one or more lines. C++ doesn't care if you put this space here or not. You could write it like this. That's all fine. Now if I save this and then try to compile again using g++, it says, I don't know what cout is. Where did cout come from? Then you realize that cout is defined by a C++ library which is called iostream, input output stream. You start using that by saying hash include, like in C, and then iostream, like so. If you do that and try compiling it, it says, well, I still don't know what cout is. The reason is that cout is kept in what's called a namespace. So let's go back to the slide and see what happens.
These variables are not defined magically. To use them, we must prefix our C++ code with instructions to include a header file, like hash include iostream. And this doesn't quite work, because there are these things called namespaces. Suppose there are two Ravivarmas in IIT, and they live in hostels 2 and 5. To avoid confusion, they have different roll numbers. But if you haven't assigned them numeric roll numbers, an easier way to deal with the ambiguity is to say h2 colon colon Ravivarma or h5 colon colon Ravivarma.
So similarly, all these standard libraries in C++ are defined in a namespace called std. And so instead of writing just cin or cout, we have to say std::cin or std::cout. If you're tired of doing that all the time, then you can give the compiler a directive that you will be using the standard namespace throughout the code. And then you don't have to do it. So let's head back to our code and see how that works. So one way is to do this, which tells the compiler that cout is something defined inside the standard C++ library. And this time, if we compile it, things go through, because it has found what was meant by cout. Inside the C++ library, cout is an elaborate path that eventually leads to that special memory segment which is connected to the display device. Now if we run a.out, you get that. So as you can see, every time I hit return, some command is executed by the system. And after the command is executed, there's a prompt which ends in that dollar sign. So what happened is that when I executed a.out, it printed hello world, and after that the shell immediately printed my name and the name of my laptop, followed by the current directory, which is /tmp, followed by the prompt, the dollar sign, again. If I want a new line, then I have to tell it explicitly like we saw before. So I have to insert a new line like that. Why is new line backslash n? Well, actually a new line should go to the new line like this. But the C++ compiler will think it's part of the layout of the source and not something you want to print. You have to tell the C++ compiler that the next character is not to be interpreted literally, but is a code for something to be sent to the output. These unprintable characters are represented by what are called escape codes or escape sequences. When you put this backslash, it alerts the C++ compiler that you don't actually want a backslash. You're about to say something special after it. In particular, you want a new line after it.
There's a variety of escape characters that you can use like this. And once you say print a new line at the end, you have to recompile for the change to take effect. And then if you execute a.out, this time you get a new line after hello world. Now let's go back and clean up a couple of things from the slides. So what's a procedure or a function? A procedure or a function encapsulates a piece of computation and gives it a name. So this entire act of reading a Fahrenheit temperature, converting it into centigrade and printing it out is called main. The symbolic name main is reserved for this piece of action. The advantage is that we can write a function once and then reuse it from many places as we need to. For example, you could write a procedure called max which takes two integers and returns the larger of them. So max of minus three and two should return two. You declare such a function or procedure by adding two things to what we were doing with main. There is the input argument list, where you're saying accept an integer a and an integer b as input, in that order. And once you're done, return another integer, which is supposed to be the larger of them. And then when you call max(-3, 2) you should get back the integer two in response. Earlier, in the turtle graphics code, forward and left were such functions. They were not passing out any values. They were directly affecting the display. So procedures can do all kinds of things. They can read the keyboard. They can update the display. They can return an integer. They can return more complex things and so on. So that's roughly what a procedure or function is for starters, and we'll see more later on. What's a data type? So we said that Fahrenheit and centigrade were floating point numbers or floats. A computer memory is, as I said, fundamentally a 2D array of bits. It has eight columns, which make one byte, or B. And the number of rows depends on how much main memory you buy.
One gigabyte means slightly above one billion rows. A hard disk is similar in look and shape, but it's larger and slower to access. Meanwhile what programmers want are integers, real numbers, complex numbers, characters, and strings of characters, so that you can put people's names and their addresses in databases, for example. They want arrays, matrices, variable length lists, mappings between, say, people and their ages, and things like that. Eventually you want user interfaces like windows, buttons, menus. So during this course we'll study, for quite a few of these abstractions, how these data types are mapped to representations in RAM. But most of the time we are customers here. We want to understand how to write interesting or useful code using high level data types. But we'll see a little bit of representation on the way. There's also this issue of how you choose variable names, and it's important to have good programming practice right from the beginning. C++ will allow any sequence of capital letters, lowercase letters, digits and the underscore character in variable names. You cannot start with a digit, because then the compiler will get confused about whether you're trying to communicate a number, but otherwise there's no rule on how they're interleaved in the variable name. So the variable name is totally up to you, but of course you should practice writing long and informative variable names and not short ones like just x or just y. So why do you want long variable names? Then you can search for them easily, and other people can read and understand the meaning easily. And now that we use integrated development environments and not just plain text editors, most of the time variable names will be auto-expanded for you. So it's not like you type much more if you use long and informative variable names. In old C code you see variable names which are written in the old style, like old_style_variable_name, with underscores separating words.
But in more modern coding in C++ and Java it's more customary to use what's called camel case, where word transitions are marked by one uppercase letter followed by lowercase letters. So in newStyleVariable, new will start with a lowercase letter, then at the transition the next word, style, will have a capital S in it and then it goes to lowercase again, and then the next word, variable, will start with a capital V. So this reduces the number of underscores you have to type, and people claim it looks a little easier to read. There's nothing religious about it, and there's no cosmic truth in this. If a lot of people use the same convention, you do well to start following the convention, because then people can read each other's code more easily. As I was saying, there's no hard and fast rule about laying out your code either, provided you follow the basic rules. If you look at, say, this source code, C++ will of course not work if you run the float and the Fahrenheit together, because there's no way to distinguish whether you mean the type float or you're declaring a new variable. So there has to be a space between float and Fahrenheit. But if after that semicolon the very next character was C, that's okay, because C++ understands that you have terminated the first statement and doesn't care that you didn't put a space or new line. But even there, there are broad conventions, and modern coding tools will automatically format your code for you to make it look presentable and easy to read. Even otherwise it's a good idea to format your code nicely. So here's the first complete source code: reading Fahrenheit and outputting centigrade. At your shell you have to compile it and then you have to run it. So let's try that out. So in second.cc we include iostream, and this time, because we don't want to keep typing std all the time, we say using namespace std for the rest of the file, okay. Then we start the main method and we declare float Fahrenheit.
We read it and then we create the centigrade variable, which is set like that. Now one further comment, which we will elaborate on in the next lecture, is on how to write arithmetic expressions. As we saw, in handwriting you would just write five next to Fahrenheit minus 32, but here the times has to be made explicit with the star, because the compiler cannot read your mind. And the other thing is that these brackets are vital, like this bracket and this ending bracket, because star has higher precedence compared to subtraction, okay. We'll see what that means next time, but roughly it means that if you drop the brackets then you would actually first multiply five with just Fahrenheit and then subtract 32 divided by nine. You don't want that. To enforce the correct order of arithmetic operators you sometimes need to use brackets, so that it is unambiguous to the compiler in what order to do things, okay. And in this case it's necessary to give the brackets. And finally we output centigrade, and because we want to keep the screen clean we also print a new line after that. So this is the code. And now let me compile this code. This invocation will overwrite the old a.out. It will no longer be a compiled version of first.cc. Now it will be the compiled version of second.cc, okay. Now if I run a.out it just freezes on the first line, because it's waiting for me to enter the temperature in Fahrenheit, okay. So if I enter 32 degrees Fahrenheit, that's zero degrees centigrade. So it prints zero and it quits. That's what it was told to do. If I enter something else, like minus 40, which is supposed to be the same in both, well, it's minus 40. So that works, okay. So my laptop is still working. What if I just want to say that it's very cold, all right. So cold is minus 17 degrees. How did that happen? How about I want to say it's very hot? Hot is also minus 17.7 degrees. So what's really going on? Well, on a hunch you enter zero, and it says minus 17.7 degrees, okay.
So what happened here is one of the legacy difficulties with C and C++, which is very poor error condition handling by default. I was entering things like hot and cold, which have no arithmetic meaning. And rather than, you know, throw an exception or give a complaint message to the user saying you just entered some garbage, it happily accepts that character sequence. It initializes the variable to zero and it does whatever it had to do with it, okay. Now on this particular piece of hardware, with this particular version of the compiler, it has somehow decided that zero is the right value to put in when the input doesn't make sense. If the phase of the moon changes or I buy another laptop, there is no official guarantee that the default value will be zero. So you have to avoid situations like this like the plague. This is not a good thing to happen if you are using this code to control the rockets in a space shuttle or something like that, right. So while coding C++ you have to be very defensive about checking input for legal values. If the input is not legal you have to catch it somehow manually, and you then have to print out a message to the user saying, you know, this is not a legal value, please enter a real number as input. In subsequent programming languages like Java this has been cleaned up a lot. Java has this elaborate exception mechanism by which the system itself will check that the input string is actually parsable into an integer or a floating point number. If it's not, it will throw an exception, and unless you catch that exception it will percolate right up to the top and halt the process. It says, this doesn't make sense, I'm not going to proceed any further. Now that's only a mechanism. It only supports the programmer in doing certain things. It still doesn't give you logically what should happen. If you're writing a space shuttle controller in Java, you then have to be careful about how you catch exceptions and what you do with them.
So, you know, there's no free lunch. You still have to deal with exceptional situations. It's just that a programming language can give you more support for dealing with them. C++ also has exceptions in later, ANSI-compliant versions. But default C++ is such that you can happily code along without worrying about exceptions, which Java won't let you do. And if something goes wrong, well, it just goes wrong silently most of the time, like this. You try to open a file for reading which doesn't exist. Well, it'll just give you a null file pointer, and you have to check for the null file pointer. If you don't do that you can still continue, getting garbage. So that's one problem with C++ that you have to take care of. So, in the first lab session we will familiarize ourselves with the Linux desktop. Many of you have already used Linux and almost all of you have probably touched a computer. So the desktop is no big deal. There are the taskbars, and you can fire up applications. You can open a terminal window, which runs what's called the bash shell. So this is called a bash shell, okay. What is it doing? It has this loop in which it waits for input on a line. You hit return, and it interprets the line as a command. It executes the command and it goes back to listening for you. So now it's listening. You say, what's the date today and the time? It prints out the date and time and returns to that prompt. If I say print a calendar for the current month, I say cal, and it prints out the calendar with the current date highlighted. That's its job. I tell it to execute a.out, it executes a.out, and so on and so forth. I tell it to edit the file first.cpp using the visual editor vi, and it opens the file and lets me edit it. So many of us know all that. We are usually located in this thing called a home directory. Directories form a tree structure on the disk, and we have our home in one particular directory. And in the lab session you will open a shell, and you will run some shell commands to get some experience.
You will use a text editor to write C++ files. You'll compile them using g++ and you'll run the resulting executable file. So how are files organized on disk? This is an even more complicated story than just coding small conversion programs. A file is just a sequence of bytes, and just like beauty is in the eye of the beholder, different files are interpreted differently depending on who is reading and writing them. For example, CF.cc is a text file. You write it as if you're writing a letter, but then that's read by the C++ compiler as a precise description of a recipe to do some computation, and then that's turned into a.out, which is an executable that is not meant for human reading but which can be run by the operating system. Remember we saw that the empty program turned into four kilobytes of stuff. So what's in there? Let's see what goes in there. So first.cc was printing hello world, right? So let's compile that again, and that creates a.out. Remember a.out has length 5.9 kilobytes. What does it contain? To see what's in a file without necessarily editing it you say less. So you say less first.cc. You'll see that file over again, and if the file is long, less will let you scroll through it in kind of interesting ways. So for example, a to-do list. Now if I try to less the file a.out, less gives me a warning that this doesn't look like a readable file, do you still want me to show it? And if you say yes, see it anyway, then it prints weird stuff. So this is all binary code which you're not supposed to read. Here and there you see pieces of things that look like text. So it tells you that this executable has been linked to work with the Linux kernel. In particular it's using GNU versions of the kernel. So some bits and pieces of string are inlined with various access points. We are using the C library. We are using the GNU C library somewhere. We are using libstdc++, so that's the std package which offers cin and cout. A reference to that is kept here, and so on and so forth.
This is all executable code, and finally our string hello world comes here. So hello world is embedded somewhere inside it, which tells the program what to print. Now if you want to do some reverse engineering and find out the name of the programmer who perhaps wrote this code, and either give them a donation through PayPal or send an assassin after them, you can find that out by issuing this command called strings on a.out. If I do that, you'll see that everything will just scroll past in a flurry and you can't see anything. If you want to see it slowly you pipe it. So pipe is this vertical bar, which passes the output of one program into another program. So you pipe it to less, which shows you the contents a few lines at a time, and so these are the strings extracted from the executable file. So this tells you how that program was linked with the GNU/Linux system, and it gives you names of various system procedures which have to be fired up, and the list of libraries which are used by the current code, and after all that you see that your string hello world comes at the end. Now if all files on your disk were kept in one place it would be impossible to find anything. There would be millions and millions of files. So files are organized in things called directories. These are just logical abstractions which are implemented on top of the disk bits. So a directory is something like a container that can contain either files or other directories, and this leads to some kind of a hierarchy, but files cannot contain directories. The root directory is called slash, and then path components are these character sequences. So there is a directory called /usr and a directory called /home, and inside /home, cs101 is a user and you are another user. In /usr there is a directory called bin in which all the applications are kept, like bash itself, the shell which was running, and the compiler. These are all files inside /usr/bin, okay.
Let me give you a quick tour of how those things look in a Linux system. If I go to slash, the path changes to slash, so I am at /. If I list the directory in short format, those are the files and directories inside there. So bin, boot, dev, etc, these are all directories, whereas initrd.img and vmlinuz are not directories. If I want to list home I say ls home, and home has 4 users under it: cs101, library, shoman and cc. Now suppose I want to move into the home directory. Then I say cd, change directory, to home, and the prompt changes to /home. Now if I say ls I get that listing back again. So I move around logically inside this tree-structured hierarchy. How do I move up to slash again? I say cd dot dot. Dot is defined as the current directory. You often go to these large buildings or malls where you see a map with a red dot that says you are here. So that is why it is dot: you are here. And dot dot is go up to the parent. So if I say cd dot dot, I am back at /, okay. Now how do you know what directory you are in? I mean, the path is already printed here, but if you still want to know it you say pwd, print working directory, and it tells you I am at /. If I say cd without anything after it, it means go to my home directory, and that is for me /home/shoman. When you log in you are designated one particular path in that tree where you are at by default. That is your home directory, okay.
Now what else can we do? That is my home directory, and there are lots of files in it. Another way to list files is to do ls -l, for long format. Let us see what that looks like. So there is this file called apc test dot output. It is 11,421 bytes long. It was last written on December 26th at 7:23 p.m. It is owned by user root, who has group root. I am user shoman and I have group faculty, okay. This is a file, and this is a directory. Code is a directory, so that is why there is the initial d. This file can be read and written by me. It cannot be directly executed because it is not an executable file, it is just a text file. It can be read by my group, the group called root. It can be read by everyone else in the system, okay. So code is a directory. I can read, write or execute programs inside it. Others can only read and execute but not write anything in that directory. So it is self, group, others. So other faculty members can go into the directory and look in but they cannot write to it, and everyone else other than faculty can do the same in this case. Dropbox is a directory where only I have full permissions, but group or other people do not have any permissions. So when a permission is lacking it is a dash, and when a permission is there it is a letter. So that is how Unix file systems are organized, and once you start playing with your system you can figure all that out.