Now that we have discussed pointers, we will discuss some issues related to parameter passing which could not be clarified earlier. So far, we have seen that parameters are passed by value, which means that actual parameter values are copied to the formal parameters. Today we would like to discuss additional mechanisms of parameter passing, which are quite important in actual programming practice, and we will see how C++ allows them. We will also discuss the facilities for formatted output, which the cout statement, as we have used it, is singularly unable to provide. It does work, in that it can produce output for all the variables that we give, or array elements or strings, but we cannot format the output the way we like. We cannot control, for example, how many decimal digits will be printed for a floating point number. There are two special functions called printf and scanf, which we will discuss in this context. We will then look at some more string functions. I will only introduce string copy (strcpy) and string compare (strcmp), but we will also be using sprintf and scanf to some extent for handling strings: either strings which are given as input lines, or for combining certain values, putting them together in ASCII character form as if it were formatted output, but instead of throwing it out to the terminal, putting it inside a string of ours. So we will see these functions, and then we will discuss the more general and important concept of files. You are all familiar with files. When you write your programs, you use gedit to create files. These are all text files; they are the files which are looked at by the compiler. Very often, when you have to supply a large number of data values, you create those values as part of a text file and use input redirection to read them automatically from the file.
So we know that files containing text data, lines of text, can be handled by the machine as if they were typed at the terminal. What we would like to find out is whether our programs can handle such files directly. The answer is yes. We shall be looking at sequential disk files today. But there are also files which contain information other than text: encoded information, such that the internal representation of binary numbers, floating point numbers, et cetera, is stored on the disk as is, in blocks, and you are able to directly access those blocks, read them, retrieve them, update them, rewrite them. These are extremely powerful facilities. We will discuss direct access files and binary files later, after the mid-sem, but today we will introduce the notion of files. So, very quickly, to recap the passing of parameters. We have said parameters are passed by value. Suppose I have defined a function f with two parameters x and y, and in this function I am evaluating some value, sum = 5 * x + y % 10, some arbitrary calculation, and I return one value whose type is declared as the type of the function. In the actual program, I may write something like int a, b, c. When this program is executed, I have a location for a here, a location for b here, and a location for c. When I read the values of a and b from input, two arbitrary values will be given by you, which will go into the locations of a and b. Let's assume that these values are 12 and 7. Now at this juncture, when I invoke f(a, b) as part of an expression on the right-hand side, control will go over to this function. Please note that when this function was compiled, effectively a location for x was set up and a location for y was set up. So you would have x and y, and you would of course have a location for sum, which was declared explicitly.
Note that x and y are two formal parameters, which are supposed to receive values from the calling program, and sum is a local, internal variable. So when you invoke this function and go over there, the values of a and b go along. These are actually copied to those locations: x becomes 12 and y becomes 7. So effectively this value goes here and this value goes here. The value is actually copied; that is how you should understand it. Eventually the calculations are done inside the function. 5 times x is how much? 60; twelve fives are 60. And y modulo 10 is still 7. So what will sum be? 67. Since I am saying return sum, and since the return type is integer, this value 67 will be returned as an integer. Where will it come back? Into c? No. It will not come back into c. It is a small, fine distinction that you have to make. A function returns to the point of invocation. The point of invocation is this function call, f(a, b). So with this value you will come back here, and the value 67 will actually replace the reference to the function. In short, f(a, b) now becomes 67. It is incidental that on the left-hand side we have c, to which this value 67 is assigned. There could well be an expression such as f(a, b) plus 13.7 times something else minus something else, in which case the value will only replace that one reference. Finally, whatever the expression is will be evaluated, and the final value will be assigned to c, which happens to be 67 in this case. Is that understood? This is the mechanism. So what are the possible problems with this? Actually, we don't see any problem as long as we only have to carry out things of this kind, where we send some actual parameters to formal parameters, evaluate some value, and return that value. The problem starts when we consider that if I have a large number of parameters, I will have to copy a large number of them.
It's okay if they are simple parameters like int a, float b, char c, whatever, maybe 10 or 15 parameters, but still there is an overhead. Those 10 or 15 values will have to be copied to locations there, and a return value will have to be brought back. What if a parameter is a large array, of a thousand elements? Then a thousand elements will have to be copied. What if I am trying to do a matrix computation, a linear equation solution kind of thing, and I want to send a 100 by 100 matrix? 10,000 elements will have to be copied. The overhead would indeed become very considerable. Additionally, and this is the more important part, while I am sending 10 parameters to the function, I am getting back only one value. There could be occasions when I want more than one value back. Since a function can return only one value, I cannot return more than one. The only alternative is to have the values come back through the parameters which I have sent. This means that if some parameters are modified inside the function, that modification should be reflected in the actual parameters back home. Unfortunately, passing by value in C++ does not permit that, but it could be a valid requirement. In particular, when I have arrays as parameters, forget the overhead of copying and consider this function: int simsol(int n, float c[100][100], float b[100], float x[100]). Do you recall what these would be? float c[100][100] could be a coefficient matrix. float b[100] could be the right-hand side. And float x[100] would eventually contain the values of the 100 unknown variables. Of course, I may not have a 100 by 100 equation set, so I pass n as a parameter. So first of all there is the overhead: even if n is 10 or 15, say you are solving a set of 15 simultaneous equations, the function has no clue that you are solving a 15 by 15 system, because it does not understand the semantics of what you are doing.
Going by the copying rule, it will take a 100 by 100 array c copied there, 100 elements of b copied there, and 100 elements of x, which in the main program would have no values at that time, just some undefined values to copy. And eventually it will stupidly return only one value, which is an int. So whatever I do, I will get back only one value, which is not adequate. Indeed, in such a situation this int is meaningless. Even if it returns some value, what will I do with it? It is of no consequence to me. Actually I want many more values; the n values of the unknown variables are what I want. But note that even in situations where this int may not be of any consequence, I may still use it effectively for a different purpose. I may, for example, return 0 to indicate that the problem was solved properly and -1 to indicate that it was not. Or I may choose not to return any value; I would then like a provision in the C++ function definition which explicitly tells the compiler that this function will not return anything. However, the larger problem is still to be answered: how do I get my x[100] back? Here is another case: a function to swap the values in locations a and b. Here is a main program with int x, y; x = 5, y = 10. How do I swap the values? Normally I will have x here, which is 5, and y here, which is 10. So what will I do? I will declare a temporary variable temp, and I will say temp = x, x = y, y = temp. Suppose I decide to write a function instead. I will say swap(x, y). That means: please go to the swap function and take the values of x and y. Now I want x and y to be swapped. I am really not interested in what the function returns as a function value; that is why on the left-hand side I have said dummy = swap(x, y). I don't care what that value is. But look at what will happen inside the function. When I go to this function, I have a location a for one formal parameter.
I have another location b for the other formal parameter. And I have a temp here. As we understand function invocation, the value of x will be copied to a, so 5 will come here. The value of y will be copied to b, so 10 will come here. Very good. Let us look at what the function does. The function says int temp; temp = a; a = b; b = temp. You understand what this will do? It will actually swap the values of a and b. The value of a will first go to temp. Then a will get the value 10, so this will become 10. And then b will become 5. So far so good: a and b have been swapped. But when I come back with this return 0, where will I come back? I will come back here. What will be the value of swap as a function? 0, because I am returning 0, and that will get assigned to dummy. That is fine. But does anything happen to x and y here? No, because a and b remain where they are. They remain swapped, but I don't get any reflection of that here. This is nonsense, so it cannot be accepted. Unfortunately, in C++ as we have seen it so far, the only permissible mechanism is to transfer parameters by value, so we have to look for something else. C++ fortunately provides a way through a facility called passing by reference. What does passing by reference mean? Instead of passing an actual value, suppose I pass a pointer. Instead of passing x, I pass the address of x; instead of passing y, I pass the address of y. Obviously, when I pass addresses, they cannot be held in normal variables in the function, so the corresponding formal parameters must also be pointers. Now inside that function, if I operate upon the contents of those addresses, then, since those addresses refer to my memory locations here, the operations will take place on my memory locations. When the function returns, whatever dummy value I get, the pointer values which I sent will not come back. But because those pointers have facilitated operations on my actual values, I will effectively get my parameters changed.
That is exactly what is done here. So there is no copying of arrays. You just pass pointers. Of course there is copying, but it is not the values that are copied; it is the pointer value. Here is passing by reference. What am I doing here? int main, with int x = 5, y = 10. This is x, which is 5. This is y, which is 10. Notice that when I call swap now, I am saying swap(&x, &y). What does &x mean? The address of x. Let us say the address of x is 10000. Let us assume that x and y were given consecutive locations; it does not matter what the address of y is, but let us assume it is 10004. So these are the addresses. Please note that what this swap call takes to the function is these addresses, 10000 and 10004, not the values. Obviously these addresses cannot go and reside in integer or floating point locations; they have to reside in pointer locations. We will see how that function can be written. Notice that I am not expecting any value to come back from swap. That is why I have not said something = swap(...) plus something; I have simply said swap(&x, &y). I don't need swap to return any value, but I want x and y to be exchanged. Let us look at what happens inside the function. Its parameters are int *a and int *b. Now look at these locations. This is the location for *a, this is the location for b; actually, not *a: it is the location for a and the location for b, but a is a pointer and b is a pointer. Each is an int *. When I say swap(&x, &y), &x means the pointer to x will come and sit in a, which is what value? 10000. What was the value of the pointer to b, I mean the pointer to y? 10004. These pointers have come here. Now look at the actual internal handling. int temp, so this is the temp location. The first statement says temp = *a. What does that mean? Assign to temp the value which is pointed to by the contents of a, because a is a pointer. So at this time, a reference will be made to location 10000.
Please remember that the entire memory which has been allocated to my program is uniformly available to it. So I will actually go to 10000, which is the location of x. I will be collecting x and putting it in temp; then collecting y, since *b is y, and putting it in x, because *a is x; and then putting temp into *b, which is y. So nothing will change in a and b, but effectively x and y will get swapped in the original program. Notice that in order to emphasize that I don't intend to return any specific value calculated by the function, I write this function as void swap. void is a C++ keyword which says that this particular function will not return anything. Void, null, nothing. So it will never be written as something = this function; this function will always be invoked as a standalone statement. It will just do whatever it wants to do. Can you see how powerful this is? Please note that C++ is not violating the fundamental principle: the mechanism is still passing by value. But we are sort of cheating. We are saying we will pass a value, but not the actual value; we pass a pointer to that value. And then whenever you operate through it, you come back and operate on my real parameters. Now here is another thing which we never mentioned explicitly, but which happens automatically. Whenever you name an array as a formal parameter and pass another array as the actual parameter, the array is never copied. A pointer to the base element of the array is always transferred, and that pointer is retained in the formal parameter. Please remember that array elements are always allocated contiguous space, as we know. Given the pointer, the function is very well able to calculate which location is meant: even if I manipulate an array element a[i][j], if it knows the address of a[0][0] it can find out where a[i][j] is, or, for a one-dimensional array, where a[k] is.
So when you pass arrays, fortunately there is no overhead of copying any large chunk of values; only a pointer will go. What it also means is that anything you do to the arrays in the function, you are actually doing to the original arrays. So if some changes happen there, those changes happen here also. Is this clear? So this is how parameter passing through reference works. Yes, what he is suggesting is a cute way of solving the swapping problem. He says: if I want to swap two values x and y, I will put x in the 0th element of an array and y in the first element, pass the array, exchange a[0] and a[1] there, come back, and then assign the 0th element to x and the first element to y. It is a moot point which involves more drudgery: putting x into the 0th element and y into the first element, going there, and then putting the elements back here; passing &x and &y would perhaps be simpler. But it is definitely a clever solution whenever you have a large number of values which you want to modify: put them in an array, and the array will automatically go as a pointer. However, knowing that the pointer-passing mechanism is what facilitates this, you might do well to pass arrays as arrays, because they will go as pointers, pass other parameters as pointers, and deal with the pointers. That is relatively simple; in fact, that is the standard way people write programs.
It is in this context of pointers in function calls, which we have now understood, that we will look at some additional functions for input and output. As we know, C++ does not have any built-in instructions for performing input and output; there is no input/output in the instruction set. cin and cout, which you have been using, are, as we shall see after the mid-sem when we study objects, classes, operators and so on, special streams, and >> and << are the extraction and insertion operators. cin, for example, works on input which is typed as a string at your terminal. It takes the string and extracts one value, then with another >> another value, with another >> yet another value, and so on. If it has two variables to read, it will ignore blanks on either side, it will go beyond the newline character, et cetera, and of course it must find correct input for floating point, fixed point, et cetera, so that there is no problem. Now, one of the biggest issues is that if I have to type my name, say Deepak blank Phatak, and read it as a single string, there is no way I can do that with cin >>, because the moment the first blank occurs, cin believes that the string has terminated. Similarly, when I output using cout, I say cout << x, where x is 7.348215 or 0.00000156 or something, and we have no control over how that x will be printed. It will be printed in an intelligent fashion, but depending upon the whims and fancies of cout. We would like greater control over what we print and how it looks, and we would also like greater control over what ASCII characters we read and how we interpret them ourselves. For that there are functions called scanf and printf.
So these are the two functions which perform formatted input and output. The way they are written is that there is a format string as the first parameter of printf, followed by any number of values or variables whose values are to be output. What printf does is take that format string and apply the specific format specifiers in it to those variables. What is that string, and what are the format specifiers? Let us look at an example to understand this. Here is an example: printf("%d is a number\n", N). Obviously N is some int value that has been declared; let us say N has the value 523. Now when I execute this printf statement, printf starts looking at what is to be printed. There is only one variable, N; there could have been n, m, p, whatever, but there is one here. Now it looks at the format string. This %d, in fact anything starting with a percent sign, is a format specifier. So whenever it sees %d it says: ah, this is telling me how to print the first value. There could be a %d, another %d, a %g; we shall see what other format specifiers there are. Each one is associated with one of the variables. The remaining string is printed verbatim, as it is, just as when you say cout << "hello, how are you" or "give me value of n": whatever you write is printed, including the \n. In fact you can have a perfectly valid printf statement in which there is a format string and no variable, and therefore no specifier. In such a case the printf function simply produces the ASCII characters contained in that string. This is typically the first program that one writes in C or C++, the greeting program. However, if you want to print some text but also print some values, converting the internal form of those values, like binary or floating point, into ASCII characters, then that is where the format specifiers come in. For example, this %d actually corresponds to
the conversion of an integer number into the corresponding ASCII characters. Remember, this 523 internally will be a binary number, a 4 byte number, but I would like it to be seen as 523; that is how I read it: symbol 5, symbol 2, symbol 3. So this is what this particular %d will do. There are additional format specifiers. Apart from saying %d, you can also say %6d, for example. The number that you write before the character d is called the field width. It says that whatever value you print has to occupy at least six spaces. So look at how you can specify exactly what your output will look like. Suppose I print the value 523. This 523, being a numerical value, will be right-justified in that field, so I will have blank, blank, blank, 523. That is exactly how you would like numbers in a table, for example, so they would look this way. Notice that with cout as we have used it you have no such control: a 4 digit number will come as a 4 digit number and a 3 digit number as a 3 digit number. Here you can control it. %7s will print a string. So if you have an array char name[60], then of course you do not expect that 60 element array to hold a much larger string, otherwise you would have used a larger width than 7. What this means is that the string will occupy at least 7 spaces. Suppose the string has fewer characters, say only 3 characters, how: h, o, w. In the space of 7 characters, where will this how come, leftmost or rightmost? By default printf right-justifies strings just as it does numbers, so four blanks will be output first and then h, o, w; if you want the string leftmost, with the blanks padded after it, you write %-7s. So blanks are padded for a shorter string on output. %8.2f will print a floating point value. Yes, sorry: if you have a longer string, %7s will not cut it; the width is only a minimum, and to truncate beyond 7 characters you write a precision, %.7s. Remember, it is the compiler which looks at your whole program; whenever
you have a printf statement, a compiler can analyze the format string that you have given and identify all the format specifiers. Suppose you have four format specifiers and it does not find four matching values: many compilers will warn you that the statement is wrong, although strictly it is not a syntax error, and what happens when the program runs is undefined. This %8.2f will allocate 8 spaces to print a value. Suppose I had float x with the value, let's say, 73.421, and I say printf("%8.2f", x). It is clear that x will be printed in 8 spaces: 1, 2, 3, 4, 5, 6, 7, 8. The first number is the total width; it is a minimum, so a value that needs more room will take more. The .2 means there have to be two digits after the decimal point. Since there are three digits after the decimal point here, the last one will be rounded off. From the right, I will have 2 here, 4 here, then the decimal point, then 3, then 7; the first three places will be blank. So that is how it will be formatted. You have to be careful, of course, to provide enough space for a minus sign, et cetera. Unfortunately, some numbers are very funny, like 0.00015. That is a valid number, and you are interested in knowing whether it is 1.5 or 1.7 at the end, but if you print it in this format you will get 0.00. So you use a format called g. The g format keeps the given number of significant digits and automatically converts the value into exponential notation when it is very small or very large. With %8.2g, a value like 0.00015 is printed as 0.00015, and a still smaller value such as 0.000015 is printed as 1.5e-05, so that the value is visible. There are any number of other details; any book on C or C++ programming will give you the meaning of the format specifiers and what exactly they do for a printf statement. scanf does exactly the opposite. scanf also has a format specifier string, but no output is generated from that format specifier string. Instead, some input is given by you. When you type your input, the format specifiers of scanf tell how the input is to be interpreted. The input is interpreted as per the format specification that you
give. Those ASCII characters come from the keyboard. Say you type 734, which is supposed to be the value 734; these ASCII characters have to be converted. You can use %d again, for example. So the characters 7, 3, 4 will come, they will be interpreted, and they will be converted into the internal representation of 734 in a location which has to be named after the format string, as a variable. Except that you cannot just put variable names. Since scanf is a function, the formal parameter will never go back to the actual parameter: when you call, the original value is transferred inside, but nothing comes back here. So to get values back you use pointers. That is why scanf always takes pointers to the variables. Here are some examples. scanf("%d %d", &m, &n): I am reading two values, m and n. It does not specify a field width; this permits me to input a large value or a small value. The blank between the two specifiers matches any blanks typed between the two values, so the two values are typed with white space in between and are not concatenated together. Whatever values come, they go into m and n, because you are passing pointers. Here is another one, to get floating point values x and y and a string, where name is a char array of 40: scanf("%f %f %s", &x, &y, name). Whatever string you type goes into name. Here is another example of scanf, something which you cannot do using cin at all. Suppose I have typed a very compact record of bytes. I know, for example, that in the inventory control system of my manufacturing plant I use a six digit number to identify an item code. I know that all item names are shortened to six characters. And I know that after that I will have a floating point number which gives me the value of one of those items. So I maintain my inventory, thousands of records like this. How do I read this in? I can read it using scanf by saying: interpret the first six characters as an integer number and assign
that value to a; interpret the next seven characters as a string of seven characters and assign it to item code, which is an array, so no & is needed because the array name already gives a pointer; and interpret the last set of characters, whatever comes till you see a blank, as a floating point number and assign it to x. In short, then, scanf and printf provide much greater control for you to prepare neat looking output and to capture any kind of input. There are other string functions, such as strcmp. This function compares two strings. Why would you like to compare strings? You remember sorting: you sort in ascending order of numbers or descending order of numbers. What if I have given you names and said sort in ascending order of names? If you recall, your lab batches are formed by taking the entire group of 90 students or whatever and sorting them in ascending order of names, as they have been published on the AC website. How will you sort? You will take the names as arrays. Now you have to compare this element with this element, so you want to compare two strings. The two strings may be equal, or one may be smaller than the other in the lexicographic or dictionary order: a name starting with a is smaller than a name starting with b, and so on. Such comparisons are possible. You can do that yourself, because you know that the ASCII code for a is smaller than the ASCII code for b, but you will have to do too much work to compare two strings by writing the program yourself. There is a function called strcmp. You can read about what it returns when both strings are the same, what it returns when the first string is smaller, and what it returns when the first string is larger; with it you can make that comparison. The other thing, which we will not discuss in detail, is string copy, strcpy. Say you are given a string and you have an array into which you want to copy the characters from this string. You can always set up an iteration from 0 to the length of that string minus 1 and copy one character at a time. Instead of that, somebody has
already written it: strcpy can do this for you. There is also strncpy, which copies only so many characters: it will copy at most n characters. Be careful that it puts the terminating \0 only when the source string is shorter than n; otherwise you have to put the \0 yourself. These are string functions, so they deal with null-terminated strings and are meant to produce null-terminated strings. sprintf is an important function which we shall be using in some examples on file handling. What does sprintf do? sprintf does exactly what printf does, except that printf produces the output string on your terminal, whereas sprintf says: I have constructed this output string, but I will put it in a string that you tell me. So you can take numerical values, an integer variable, a floating point value, a string, another floating point value, et cetera, and compose ASCII output as it would look on a terminal, but instead of giving it out, you say put it in a string called s, say s[1000] or s[80] or whatever. Why would I do that? I might want to compose strings which will eventually go into a large file. So instead of using fprintf or printf, which will necessarily put everything directly onto the output, I can put it in a string. There are a variety of ways in which you can use sprintf; we shall be using it in one particular example. Here is an example of sprintf. Suppose I have a character array s of 75 elements; obviously it can contain a character string of only 74 characters, since the \0 has to be there. Let's imagine that I have a roll number, 12345, and a batch number, which is 112. Notice that these look like nice characters here, but internally roll is a 4 byte binary number and batch is also a 4 byte binary number; both are int. Ordinarily I would have said printf("%5d %3d\n", roll, batch), and I would have got these values to look like this outside. Now I say sprintf(s, "%5d %3d\n", roll, batch). The only additional parameter here is the name of a string, which did not exist in printf. All that it means is: with
this control string and these variables, do exactly what you would have done to produce an output, except that instead of producing that string on the output, put it in s. Obviously s has to be large enough to contain the resultant string; otherwise sprintf will write past the end of s, which is a serious error, because it does not truncate the string for you. Nor is the string padded with blanks; you will have a \0 character after the output is done. You can experiment and examine what exactly sprintf does. What this particular call will do is produce the string 12345, blank, 112, \n. Why the blank? This blank comes because you have a blank here in the format string; it may not be visible, but if you put 5 blanks here, 5 blanks will come in there. \n is the newline character. The newline character is made part of the string because I said so. So if you write this onto a text file, it will become an ordinary text file which you can edit using gedit, because every line ends with a \n. A string does not terminate with either a space or enter. A string inside C++ is terminated by \0, and that \0 has to be inserted in the appropriate array element; up to that is the valid string, and at \0 the string is over. What you are talking about is when I give input on the keyboard. Please remember, it all depends upon who is reading that input. If cin is reading that input, then a blank will terminate a value, or enter will terminate a value. If scanf is reading that input, then it depends upon the format specified. That is all about reading input, nothing else. So don't confuse a blank or enter terminating input with something terminating a string. A string is a data structure. It is an array; each element contains a character, and it will have a series of characters. Whenever a \0 comes, as far as C++ is concerned, that string is over; what it contains does not matter. Yes, when you print this string, the string itself will go to a new line; you don't have to give a newline separately. After
that you will come to the next line, so the next printing will occur on the next line. It is not necessary that you include the \n, by the way. You can remove it from here and it will not go into the string. I just included it to show that any character can be put in that string.
I am tempted to ask the same question which I asked someone here a while ago: why the hell would you like to do that? \0 is known to be the string terminator. If you insist on putting a \0 as a part of your string, then anything written after it will never be recognized in s by any function which processes strings, because the first \0 will be treated as having terminated that string. But to be technically sure: if you insert a \0 yourself as a part of such an operation and then say my operation is over, C++ will loyally put an additional \0 there. That second \0, in its entire lifetime, will not be looked at by anybody, because the first \0 will terminate the string. So for God's sake, never put \0's inside strings; it is not a healthy practice.
Now we consider the notion of a file. You are all familiar with files. Files have names. Where do files stay? Files stay on the disk. Generally you have files called p1.cpp, right? So this is one file. There could be another file, indata.txt. I have shown some boxes here, but each box contains a large number of bytes. Now, the way these files are handled by the operating system is different from the way these files are handled by your program. As far as the operating system is concerned, it has a component called the file system manager which manages these files. That is how you get directories and subdirectories; that is how you get to have names and extensions for the files. Each file has certain properties which, in Unix for example, you can find out by saying ls -al followed by some name. So you will know who can read, who can write into that file, and what the size of that file is. A file has a path from the root. All of that you know.
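Before we move from the operating system's view of a file to the program's view, here is a small sketch of the sprintf example with the roll number and the batch number discussed a little earlier. It is only a sketch under assumptions: the helper name compose_record is made up for illustration, and it uses snprintf, the size-checked relative of sprintf, so that the array s cannot be overrun.

```cpp
#include <cassert>
#include <cstdio>

// Hypothetical helper (not from the lecture): compose "roll batch\n" into s
// exactly as printf would have printed it on the terminal.
// Returns the number of characters the full output needs, not counting '\0'.
int compose_record(char *s, std::size_t size, int roll, int batch) {
    assert(s != nullptr && size > 0);
    // snprintf formats like sprintf but writes at most size bytes,
    // always ending the stored string with '\0'.
    return std::snprintf(s, size, "%5d %3d\n", roll, batch);
}
```

With char s[75], the call compose_record(s, sizeof s, 12345, 112) leaves the ten characters 12345, blank, 112, \n in s, followed by the \0. If the value returned is not less than the size of the array, the output did not fit and the stored string is a shortened version.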
Now, all that happens in the physical world. In the programming world you deal with a file in a logical sense. As far as the program is concerned, a file is defined in terms of a file pointer. Logically, C++ treats a file as a large array of bytes, such a large array that ordinarily it cannot fit in memory. You know computers have large memories, 2 gigabytes, 4 gigabytes, but disks are still larger, say 300 gigabytes. Suppose you had a file which contained the census data of the entire country, 100 crore or 120 crore people, and for every person, let us say name, address and so on, there are some 200 bytes of information. 200 bytes into 120 crores is a lot of information. And if you think that, having a lot of money, you can simply buy a larger computer, consider the following: we are now storing not 100 or 200 bytes per person but 10 fingerprints of each person. So you will never be able to handle this entire data in memory, but you still have to process that data. How do you do that? For this you use the notion of files.
In general, files are broken into records, and within each record there are fields. The most common example: for every student I enter roll number, name, marks; roll number, name, marks; roll number, name, marks. So I have as many records as there are students, and in each record there are 3 fields. This is a good example of a text file which you can create for input data. But I would like something more from such files. When I want to search for a student to find out how many marks that student got, I do not want to keep reading every record of the file till I reach that person; I would like a mechanism to directly access that student's record. Is it possible? We shall see after the mid-sem that yes, it is possible. For the time being we just look at the basics of files, where C++ says: as far as I am concerned, the file is represented by a pointer called a file pointer, which typically is a pointer to this entire structure, let us say fp. Every file
will have a pointer associated with it. Inside my program I should be able to open this file, I should be able to read from or write into this file, and at the end I should be able to close this file. Now, this is a hypothetical, logical file. In actual practice I would like to do the reading and writing with a file on the disk, so there has to be some method of associating my logical file with a disk file. And if I want to deal with 20 files on the disk simultaneously in my program, I should be permitted to define 20 such pointers, one associated with one file, another with another file, etc. All that is in fact feasible. So this is how a file is handled: it is handled through a pointer. Please do not mistake this to be a pointer to an ordinary value. There is a value internally, but it is considered a file pointer; it has a special type, written with capital F, capital I, capital L, capital E. I will define a logical file in my program by saying FILE *fp. If I want more than one file, say I want to handle one input file from which I want to read data and I want to create an output file on which I want to write data, then I can say FILE *infile, *outfile. These are all pointers; they are called file pointers. So in short, as far as C++ is concerned, every file is attached internally to some file pointer. For as many files as you want to open and process, that many file pointers should have been defined, and those file pointers should have been associated with the corresponding physical files before you start doing any read or write operation. This association, as I said, happens when you open a file. Therefore, in order to process anything in a file, you must first open it. Once you open it, you can read records from that file if it is open for input. You can also open a file for output, in which case you cannot read records from it but you can write. So obviously there have to be some special functions which will write to a file, read from a file, open a file, close a
file. Once such functions are there, sometimes there may be an error while writing a file, suppose the disk is full. Or you open a file, you say associate this file with indata.txt, but when C++ goes to your directory during execution, there is no such file. Now it will come back saying I cannot open this file. But how does it tell you? The function which opens the file has to return a value. Usually the values returned are pointers, and usually if something goes wrong these functions return a null pointer. A null pointer means: sorry, I could not do whatever you asked me to. We shall see some examples of this.
First we consider text files. Assume that you have used gedit and created a file in which each line contains a 5-digit number, followed by a name, which could be a character string of variable length but a single string, and a third number which you can recognize as a batch number. So the first is the roll number, the second is the name, the third is the batch number. I have artificially created this text file such that these three fields are separated by commas, exactly one comma in between and no blanks, and artificially I have put an additional comma at the end. You will not naturally get such values. Why am I doing this? Because the problem that I wish us to solve is to take each string from the file, internally dissect it into three different parts, and recognize the first part as the roll number, the second part as the name and the third part as the batch number. This is actually string processing, but I do it by reading these strings from a file. Our interest is to see how files are opened and handled; incidentally we will see how these strings are processed to capture the different elements.
So the problem is: read records from the text file, extract the three strings from each record into three separate arrays, then construct a single string. You actually got a single string from input, but that string contained commas. The way I want you to construct the single string is that the first five characters is the
roll number, the next 29 characters is the name, so if a name is shorter you have to pad it with blanks so that it is exactly 29 characters, and the last three characters should be the batch number. You are getting ASCII characters from the input string, so you just have to keep them as they are; there is no binary to ASCII or ASCII to binary conversion required, but we shall see later that that can be done as well. So this is the problem; let us see how it is solved.
This is the sample line. What is the logic that I will use? Any suggestions? Ultimately I have to construct out of this a string which will have 10108, followed by N I L M A N I, then blank, blank, blank, blank, as many as are needed till the name becomes 29 characters, and finally 1 1 1. No comma should come anywhere. Suppose that before constructing this string I want to make three different strings: one I call roll, the other name and the third batch. Can I not say that I will start with some index k which looks at this line, starting at 0? Every time I see a character which is not a comma, I put it into the first string, increment k, take the next character, put it in the string, and keep doing that till I hit a comma. The moment I come across a comma, because of the rigid specification, I know that the roll number has ended and the name has started. So if I am using an index i to maintain the roll number, i equal to 0, then 1, 2, 3, 4, etc., I should actually hit this comma when i is 5. Then I will terminate roll, I will insert the \0 there, reset i to 0 and start looking at the name. The name is a bit tricky because it may be shorter than 29 characters. So when I encounter the comma after the name, i is likely to be less than 29, and the remaining positions from that point up to 29 characters in this string I will fill up with blanks. I will do the same thing with the third field. So this is the logic.
There is a program which does this. We are mostly interested in how the file is handled, because all this data is now coming from a file: no cin, no scanf. Here is a program
which defines linestr of 80 characters. This will be used to read the complete string as it is typed in the file; whenever I read a record, the record is one string. Then I define the student roll number as sroll, the student name as sname and the student batch as sbatch, of sizes 6, 30 and 4; these are the exact sizes needed. I define some index variables. Look at the way the file is defined and used. FILE *fp means I intend to use a file. fp = fopen("inputdata.txt", "r"): these two are important parameters. The first parameter is the name of the file as it is known to the operating system. So if I have created that file using gedit and called it inputdata.txt, then that is the name I have to give. I can give some other name. If I want to open different files with the same program, I can put a character string variable here, first read the name of the file, and open that file by giving the character string. These variations are possible. r stands for read: I am going to read the file, which means please make the file available to me for reading. A caution here: if instead of r I write w, it means it is an output file, and I will create an output. As I said, this fopen may bomb. If it cannot open this file, because the file was not there or because you did not have permission to access it, then fp will be NULL; no valid pointer will be returned. That is why you check: if fp is equal to NULL, say cannot open file and return -1. The program terminates; you cannot do anything about it. On the other hand, if you come here, that means you have got the file open. Now that you have got the file open, you read one record at a time and for each record do this entire business of separating out the three parts. Let us see very quickly how that is done.
fgets(linestr, 79, fp): you would be familiar with the function gets; we have used it to get a string. fgets gets a string from a file. It has an additional parameter which limits the number of characters that will be read. It does not matter if the string that is
given on the input is less than 79 characters, because remember how gets is terminated: by a new line character. gets does not store that new line character (fgets, unlike gets, keeps it in the array if there is room); only getc will read it as a character. If you want, you can also use getc with files; the corresponding function is fgetc. So whatever you can do with gets and getc, you can do with fgets and fgetc using files. Anyway, this will read a string into the array linestr. Now you have an array linestr which contains that string.
This is another new thing: while (!feof(fp)). What is feof? eof means end of file, and f means file. Has the file ended? For example, suppose you have indata.txt and there is nothing inside. The first time you try to read with fgets, the operating system will say I have not found anything, and what it does is raise this flag: file has ended. The end of file flag is an important flag in file processing. When it becomes true, that means the operating system says there is nothing more there. If that flag is false, that is, it is down, that means there is material available. That is why you test for not feof: the flag is not set, it is false. You can test it against true or false. While the flag is not set, that means I got some valid characters when I did the fgets. That is how I come inside the loop. Very obviously, I now have a valid string through fgets. I will process that string, do whatever processing I want to do, and at the end of the loop I will read another string and come back to the while. When that string is read and I come back, I will process the second string. But while reading some string like that in some iteration, the operating system will say no more strings. At that time the flag will be raised, but I will not know it till I come back to the while loop, where the condition says not feof. Now the flag is raised, so get out. That is how I will get out.
What am I doing inside? We will not spend time discussing it; you can look at it. All that I am doing is, I am running
a while loop without any body. I am saying while ((sroll[i++] = linestr[k++]) != ','). What this will do is take the kth element of linestr and assign it to the ith element of sroll, and whatever value was assigned will be checked against a comma. And of course, when each step finishes, k will have been incremented by 1 and i will have been incremented by 1. So I will put linestr[0] into sroll[0], linestr[1] into sroll[1], linestr[2] into sroll[2]. Suppose linestr[5] is actually the comma. I will put linestr[5] into sroll[5], so sroll[5] will become a comma, and i and k will increase to 6. When that happens I will have come out of the loop, and when I come out I have gone past that comma. The comma itself should not have been there; instead of the comma I should put a \0 to terminate that string. That is why I am assigning \0 to sroll[i-1]. And of course I reset i, because I am starting to assemble a new string, sname; k simply continues. You will now notice why I have put the last comma after the third field: because then everything becomes identical for the first part, second part and third part. In the second part there is a small wrinkle. I may hit the comma sooner or later depending upon how big the name is. If the name is 5 or 10 characters, I will get out very quickly. Then the remaining positions, from i-1 up to 28, I will pad with blanks; this is the blank assignment. Finally I will put a \0. So I have got these three strings here. What do I do now? I output n++, sroll, sname, sbatch. n is a running counter; initially n was 0, so you will get string 0 is this, string 1 is this, string 2 is this, etc. After outputting this string I get another string from the file. Remember, the first string I read outside the loop; the second string I read at the end of this while body. Logically I should get another entry. When I get it, I will go back to the while condition and continue if the end of file flag has not been raised. At some time or
the other, after 10, 15, 20 strings, the file will have nothing more for fgets. This fgets will actually be executed but will not get a valid string back, and at that point the operating system will raise that flag. That flag is being observed by us in the while loop, so I will come out, and when I come out I will say: input file has been read and printed. So this is a simple example. We have understood a couple of things here: how to open a file and associate it with an actual file, how to read strings from that file, and how to test for end of file. These are the important concepts; the rest of it is string processing.
Here is another program, which we will quickly look at, which reads from one file and writes to another file. It has an input file, which is a text file, and an output file, which is also a text file. Obviously I deal with two different strings. I create an array linestr of 80 characters to read an input string; then I will break it up into three parts, compose my larger string and put it in outstr. How will I put it in outstr? I will create the string in outstr by using sprintf and then write this entire string out. Just as I said fgets, I can say fputs; it will put that string on to the output file. We shall see that in the example. I have defined two files here: FILE *fp and FILE *fp_out. Ordinarily the convention is that I will say fp_in and fp_out. If I have multiple input files, I could name them by what they hold. For example, in the application software cell where your data is processed, the files would be named student master file, subject master file, teacher master file, and they would contain the data for all of these. Of course, in modern days you do not do data processing using files the way we did in the older days; now you use mechanisms called databases. We shall have a glimpse of what a database is after the mid-sem. But here we have these two files. I open the input file; this is the very standard process, with r as the mode for reading. If fp is NULL I will say could not open file and return -1. I do exactly the same thing with the output file. I have given it an arbitrary name, studentdb.txt; db stands for database, so student database. The name of the file I am giving, by the way, is exactly like when you say save my file in gedit and type a name. So when I say open this file with this name, I am actually going to the operating system and saying: in my directory, create a file by this name and open it for me, because I am going to write to it. w says write.
Here is a question. I know why I may not be able to open a file for input: the file did not exist. Why should I not be able to open a file for output? Why will I get NULL? Space is not there on the disk; a file already exists. For the first part, your answer is right: if there is no disk space, the operating system cannot open a file. Another reason why you may not be able to open the file is that you may have read-only access to the directory in which you are running this program; you may not have permission to write in that directory, and then the operating system will not permit you to open the file. But the second point that you mentioned is incorrect. Even if a file by that name exists, fopen with w is so strong that it will delete this file and create a new file. No, that is why the file permissions are there. Yes, that is right, the file permissions will govern in general. However, when you are doing file processing in a directory, you would have taken care of these permissions and the space, so ordinarily we expect that you will not run into these problems.
Anyway, this program is no different from what we have seen already, so I will not go through all of it. I get linestr; while not end of file, I process this string. I am doing exactly what I did earlier. What is interesting is what I do now: I am preparing an output string and writing it to the database file. sprintf is what I am using. What will sprintf do? It will behave like a printf, so
this is the format string: %2d, %5s, %30s, %3s. What am I writing? I am writing a serial number in 2-digit form, the roll number in 5 character places, then sname and so on. sroll and the others are not integers, they are all character strings, but you can see that if I had integer or floating point variables I could just as easily convert them into a single ASCII string like this. All of the string which has been composed is put into outstr. Now I use fputs(outstr, fp_out); I just write it out. Having written that, I read the next input character string and keep on doing this again. So it is a pretty simple thing: read data from input, write data to output.
Yes? He is asking about fclose(fp) and fclose(fp_out): what if I do not close the files? It is like asking what happens if I do not behave decently with my friends. They will be angry with me, but since they are friends they will not shout at me. That is exactly what C++ does: it becomes angry at you, but then, knowing you, it closes the files itself and says OK, next program please. However, there could be problems. The errors in your program could be such that when you quit there are a lot of orphan file handles remaining. In a major performance problem at one large insurance company, we found that thousands and thousands of temporary files were being created but were not getting deleted, and the disk was getting filled up not by the contents of the files but just by the entries for those many crores of files. So funny things can happen. It is better to own up the responsibility for the mess that you are creating: if you open a file, you close the file. Period.
When you execute this program, funnily, you will just get this one message. Notice that this is so different from the ordinary execution of a program. When you execute a program, you type input and you get output. Here there is no input and no output; it just says input file read. So in order to figure out whether the output was generated correctly or not, what should you do? You should say cat studentdb.txt, or less studentdb.txt, or gedit
studentdb.txt. Somehow you have to look at it. This is how the file will look. Notice that I have inserted so many blanks to make the name string 29 characters.
OK. Why am I showing you this? This is a spreadsheet. Most of you would have seen a spreadsheet; we use spreadsheets, for example, to enter your roll numbers, marks, batch numbers and other things. This is how the data has been put in here. Why am I showing you this? Ordinarily, when you are dealing with large data, you will very often be creating that data on such spreadsheets. Now, the spreadsheet data is stored internally in a completely different format, but if you want to process it with C++ programs you will have to convert it by saving it in some kind of text format. The most common way of saving this data in text form is called the comma separated values format, or .csv format. This is a text file; every spreadsheet is capable of letting you save the data in text format. For the sample data shown on the previous slide, this is what you will get: roll number, name, batch number, marks; roll number, name, batch number, marks. There are only two differences between this and the previous data. What are the two differences? First, after the last field, which incidentally is now a floating point value because it is fractional, there is no comma; there is a new line. Correct. Second, the name itself could consist of multiple portions: first name, middle name, last name, whatever. That means when you process this string you will have to do some work in separating out these parts, etc. Again, text file processing is largely a string processing problem. The file handling portion is very straightforward, at least in the simple terms that we have seen; we will see more complex things later. So some of you may want to try this: if such data is given, whether it is given in a file or as input, where you can use redirection, you should be
able to handle this and, for example, find out what the average marks of the whole class are, what the average marks per batch are, etc. That is all.
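One possible way to start on this exercise is sketched below. It is a sketch under assumptions, not the program from the slides: the names Record, parse_record and average_marks are made up here, each line is assumed to look like roll,name,batch,marks with no commas inside the name, and the loop tests the value returned by fgets instead of calling feof.

```cpp
#include <cassert>
#include <cstdio>

// Hypothetical layout for one record of the .csv file.
struct Record {
    int roll;
    char name[30];   // up to 29 characters plus '\0'
    int batch;
    double marks;
};

// Split one line of the form roll,name,batch,marks into its four fields.
// Returns true only if all four fields were found.
bool parse_record(const char *line, Record *rec) {
    assert(line != nullptr && rec != nullptr);
    // %29[^,] reads up to 29 characters that are not commas, so a name
    // with blanks in it (first, middle, last) stays in one piece.
    int got = std::sscanf(line, "%d,%29[^,],%d,%lf",
                          &rec->roll, rec->name, &rec->batch, &rec->marks);
    return got == 4;
}

// Read every line from an already opened text file and average the marks.
// Lines that do not parse (a header line, a blank line) are skipped.
double average_marks(std::FILE *fp, int *count) {
    char linestr[200];
    double total = 0.0;
    *count = 0;
    // fgets returns a null pointer when nothing more can be read.
    while (std::fgets(linestr, sizeof linestr, fp) != nullptr) {
        Record rec;
        if (parse_record(linestr, &rec)) {
            total += rec.marks;
            ++*count;
        }
    }
    return *count > 0 ? total / *count : 0.0;
}
```

In a complete program you would open the file with fopen and mode r, check the pointer against NULL, call average_marks, and then fclose the file. The average per batch can be built the same way by keeping one total and one count for each batch number.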