So, we are going to discuss character data handling, for which C/C++, just as most programming languages, provides a basic data type called char. We are going to look at the representation of character data, the basic char data type in C++, the notion of ASCII codes, printable and other characters, and more particularly the important relationship between the character data type and the integers that we have already studied. We will then look at the representation of a string as an array of characters. C++ actually provides another abstract data type, implemented through its class libraries, called string. That is a different concept altogether, although most textbooks on C++, including Cohoon for example, start with the discussion of string, which is an abstract data type. We are not discussing that. We are discussing the basic data type called char, and we will discuss how to handle strings using that char data type. Later, when we introduce ourselves to object-oriented concepts, we will of course discuss the string class and various other classes. In particular, we will look at manipulation of the character data type, including strings, which will get us into the arrays that we will use to represent strings. We will have some examples. We will also review the operating system features, the directory structure and navigation, because I notice that some people still have problems. I have already mentioned redirection, I think in the last lecture. I will just complete that discussion, and today we will discuss a take-home exam. The exam papers are just coming. It is a novel scheme. You would perhaps not have given such exams except while preparing for JEE or some such thing, but you are supposed to actually time yourself, write the answers to this take-home exam, evaluate it yourself along with a colleague and submit the papers next week. So we begin our discussion with the basic character data type.
Characters as we see them: when we are typing our program, for example, and storing it in a file, or whenever we are giving input from the keyboard, we actually deal with characters. So for example, when my program says cin >> n for some number and I give a number, let's say 725, what I type is not a value 725. I am typing three different characters, 7, 2 and 5. These characters are somehow being taken in by the cin statement, recognized as three independent characters first, recognized as individually representing numerical digits, and then somehow they are combined to form a single numerical value, converted into an appropriate binary representation and stored in the location for n. So what we treat as a value externally is actually a set of characters. Obviously, if cin can handle these individual characters 7, 2, 5 and do something with them, we ourselves should be able to handle these characters. We further recognize that it is not just the numerical digits that we have to deal with. We are also required to handle other things. For example, we had observed in the last class that when we want to represent roll numbers and marks, in our particular case the roll numbers are not numeric values. They sometimes have a character D. So we have to treat the roll number as if it is made of nine symbols, nine characters, not a single numerical value, and we were not clear how we could do that. To permit such handling, a char data type is provided by most programming languages, and the origins of this char data type go back a long way, ever since the symbolic programming languages came up. As you will recall, we had mentioned that the first set of computers were programmed using machine language, and in fact there was no external programming or compiler or anything like that. People actually used switches to set zeros and ones, and that is how they wrote programs. They actually gave input using those switches only. That was very early computing.
But the moment compilers came up, what is a compiler after all? It is reading your program and translating it somehow. When we say it reads your program, what does it do? Your program is in the form of text stored in a file. So the compiler is actually reading, character by character, all the symbols that you have written in your program. Therefore, character handling has always been very fundamental to any programming language. Very early, some kind of codification was desired for representing characters. What is a code? A code is an agreement between different people that when I put a value like this, it will represent such and such symbol. In fact, when we speak about written languages, the alphabet that we write, the script that we write, is nothing but a code. It is a visual code which is agreed upon by all. Otherwise, you would not be able to read what is written here. When I write C written like this and H written like this, you can understand them as C and H because you are all familiar with the same common code that I use. Namely, H should be written like this, R should be written like this, et cetera. For computers, the codes have to be numerical values. Very early, a coding scheme was designed which was called the American Standard Code for Information Interchange, denoted by ASCII. This is the dominant encoding scheme. The original ASCII code represented various symbols using seven bits. Subsequently, it has been extended to 8-bit representations. You will recall that 8 bits constitute a byte, and a byte is the smallest unit of addressable memory in most computers. Consequently, one byte can be used to store the numerical code value for a symbol. Now, if you use an 8-bit code, codes are therefore defined for all possible values of a byte. How many possible values are there in a byte? The lowest is 0. The highest is 255. So, there is an ASCII code table.
Any textbook on C/C++ programming, or for that matter any programming book that you refer to, will usually have an appendix. Cohoon's book, for example, has an appendix at the end which describes the ASCII table. What does it do? It gives a series of values, 0, 1, 2, 3, 4, 5, 6, 7, 8, etc., and against each value it writes the character that that value represents in ASCII code. Here are some examples. A blank: so, there is a blank here in between the single apostrophes. This is encoded as 32. The symbol plus is encoded as value 43. Capital A is encoded as 65. Capital Z is encoded as 90. Small a is encoded as 97. Small z is encoded as 122, etc. These codes were adequate as long as you were representing symbols from languages which had these kinds of scripts, Cyrillic script or Roman script or something like that. But when you come to other languages, take Japanese for example, or take any one of the Indian languages. You know, Indian languages are not made up of symbols written individually for consonants and symbols written individually for vowels, which you then write by juxtaposing them against each other. In Devanagari, for example, if you were to write kaa, you have to first write ka and then attach to it the matra for aa. The number of symbols that you need for representing meaningful language constructs in such languages therefore exceeds 255. In order to accommodate such languages, a new code was devised which is called Unicode. It is increasingly becoming popular, and in fact text files which are exchanged between computers are now increasingly coded using Unicode. In its common 16-bit form, a character is represented by 16 bits and not 8 bits. So, two bytes are required to represent a symbol, and that, of course, gives you a much larger domain or range of values to represent different symbols. C/C++ also permits representation of Unicode text.
C++ has a character type for this which is not called char. It is called wchar_t, or wide character, because it is not one byte but wider, typically two bytes or more. We will not discuss that. In this course, we will be limiting ourselves to the discussion of the char data type. So, is this clear? Every character that we write has a code. In fact, as I mentioned, suppose on the keyboard I type the value 927. The computer will not read that as a single value 927. It will read 9 as a character like this. And this character 9 will not have an internal numerical value 9. What will it have instead? It will have the corresponding ASCII code for this symbol 9. Similarly, there will be an ASCII code corresponding to symbol 2 and an ASCII code corresponding to symbol 7. So, your cin statement will be reading three ASCII codes whenever you type 927. As a matter of fact, in your program, forget the cin, even if you were to write x = 1583, you write the value 1583 as text. As an int, that is a value which can reside very easily in four bytes. Or add one more digit, for example 91583. The value can still reside in four bytes too. However, in your program, when you write this and the compiler reads this statement, it reads the character x, it reads the character equal to, then it reads a separate character 9, a separate character 1, a separate character 5, a separate character 8, a separate character 3 and a separate character semicolon. And each one of these is represented internally by an ASCII code. Consequently, when the compiler is looking at this number, it is first looking at five bytes of information, each corresponding to an ASCII code. And then it determines that this is actually a numerical value. It converts it into an internal binary form and stores it. So, is the notion of ASCII code clear? Okay. Now, I mentioned that there is an implicit relationship between the character type and the integer type.
This relationship between char and int is the reason why C/C++ treats the char type as belonging to the integral types. Integral type means numerical value type. And that is because we already know that the one byte which stores a char will actually be storing the ASCII code for that character, which is a number, a number between 0 and 255. It is not just that it is a number, but that C/C++ interprets it as an integer number. The value of a char data type incidentally occupies one byte and not four bytes, so it is the smallest possible integer value that C/C++ can handle. And that value is interchangeable with any integer value. That means just as you can add two integer variables x and y, you can also add one integer variable x and a character variable ch. Whenever you give such an addition, ch will internally be treated as an int and added. So, you can subtract numbers from a ch value. Of course, what results you obtain will depend upon whether that operation is meaningful or not. For example, you take the ASCII code of a character ch and subtract from it, let's say, 2000. Some arbitrary negative value will result, which is not representable in one byte, and you will get some very funny result if you store it back in a char. So, you have to be careful and remain within the limits, and the limit is one byte. And for our purposes we treat it as an unsigned integer. Remember, a signed integer would use the first bit as a sign bit; strictly speaking, whether plain char does that depends on the compiler, but we will not rely on it here. So, essentially, for us a char is a non-negative integer between 0 and 255, and arithmetic and relational operators are defined for it. So, let's first look at the relational operators. Incidentally, a character constant is always written in single quotes, not double quotes. So, for example, take 'a' < 'b', which compares the ASCII value of a with the ASCII value of b. Suppose such is the condition; then this is always true.
That means the ASCII codes are actually designed to retain the lexicographic ordering of the various characters. The code for small a is smaller than the code for small b, the ASCII code for small b is smaller than that for small c, etcetera, a to z. Similarly, capital A to Z. Similarly, digits 0 to 9. So, when I say '4' > '3' is true, it is not because the numerical value 4 is greater than the numerical value 3. It is because the ASCII code for 4 is also greater than the ASCII code for 3. We can have a comparison of the type where the two values are, say, capital T here and a percent symbol here. Since the ASCII value for capital T is 84 and the ASCII value for the percent symbol is 37, 84 is not less than or equal to 37, and therefore this is false. So that is the reason. What about a small letter being less than a capital letter? False. That is because the ASCII codes for all small letters happen to be larger than the ASCII codes for all capital letters. This you can verify from any ASCII code table, as I just mentioned. Some people apparently know about this already, so that's okay, but that's the fact. Now, there are occasions when we have characters which are not visible to us. We still want to represent them. So, how do we represent, for example, a tab character? A tab character has an ASCII code of 9. Then a null character: the ASCII table, remember, is for 0 to 255. So, there is a value 0, ASCII code 0. What does it represent? Funnily enough, it represents nothing. So, there is no corresponding symbol for ASCII value 0. Every other ASCII code has some corresponding symbol or function. 0 has no corresponding symbol. That's why it is called the null character. Null character means it does not represent anything. As we shall see later, this null character is used to tremendous benefit in processing strings in C++. There is also a line feed character.
If you have seen a typewriter, whenever you are typing, the carriage which carries the paper moves leftwards, moves in one direction as you type. When you come to the end of the line, you do two things. You first pull a lever, by which the paper moves up, and then you push the carriage back so that it comes to the beginning of the next line, the first position. These two operations are called line feed and carriage return. So, when you pull the lever, the paper moves up. It is called line feed. You are feeding one line of the paper so that the next typing will occur on the next line. And when you bring the carriage back, it is called carriage return. These two were absolutely standard operations for decades before computers came in. And since you had to type text and exchange it with the computer, it was necessary to have the notion of end of line, and the notion of going to the first position of the next line. Correspondingly, CR for carriage return and LF for line feed have become two important symbols in any kind of text coding. ASCII defines the line feed or new line character, called LF, as ASCII 10, and carriage return, CR, as ASCII 13. In all Unix-like systems, wherever you write your text files, the new line character or LF is considered to be not just the end of line; it also forces you to go to the first position of the next line. So, LF alone is sufficient to indicate both operations, line feed and carriage return. However, there are systems, such as the Microsoft operating systems and related software, which use carriage return followed by line feed as a composite symbol for going to the first position of the next line. That is called CR LF. Why I am mentioning this is that often you will be required to exchange files: a text file is prepared on one system, and you take it to another system and try to read it.
Now, suppose one system has inserted CR and LF characters after every text line, and you read the file on another system where only line feed is recognized. This second system does not know what to do with the carriage return. Consequently, there are translating programs which handle that. Almost all word processors are able to do these translations, finding out which system they are running on and inserting the appropriate symbols. But the point is that none of these symbols are visible to you. They are not visible here. When I type this line in this presentation slide, or if I type a line in my program, what appears on the screen when I press enter at the end of the line? Nothing happens, nothing appears. Just the cursor goes to the next line. But what is happening is that a character is being inserted there, and that is called the line feed character. Now, it is quite possible that when you are handling text files in a program, when your program is required to read characters from an input stream, you would like to handle that line feed character, because it will be there physically. How do you represent that in a program? It has an ASCII code, but you do not write ASCII codes in single quote marks. So, there has to be a mechanism to represent such symbols. A blank or space, incidentally, is considered a visible character. Its code is ASCII 32. It is visible because you see the space physically on the screen. So, it is considered a visible character. A tab is, in that way, also a visible character, but the interpretation of a tab could be different on different systems. In one case, it may shift by eight positions. In another case, if you have set the tab positions appropriately, it may shift by only two positions. So, the tab character also needs to be handled specially. C++ provides a simple way of handling such characters by using what we call escape characters.
First, we talk about the explicit literal characters, which are written in single quotes. So, small a, capital D, star: that is how you write them. Ordinarily, every character will be written as just one character enclosed in a single apostrophe on either side. But when you have special characters, there is a special mechanism. This special mechanism is provided by what is known as an escape sequence in C++. An escape sequence is a set of two characters, the first of which is backslash. Backslash is a visible character, you know that. So, if within single quote marks you put backslash followed by t, then it is not considered two characters; it is considered a single character with a special meaning. This t, for example, does not stand for the letter t; it denotes a tab. Backslash n denotes a new line. So, it represents the equivalent of your pressing return. Now, can you relate this backslash n to what you have been observing in our cout statements occasionally? Instead of endl, we write backslash n as the last character of a string. That is nothing but a new line character. Similarly, if I want to denote a single quote as a character, I cannot enclose a single quote in single quotes, because then the first two quotes will pair up with each other and the third quote will remain hanging. So, you write backslash quote there. Similarly, the null character is represented by backslash 0. We will have a large number of occasions, while handling text input, to deal with backslash n, which represents the new line character, or backslash 0, which denotes the null character. Time and again we will meet these two jokers in our life. So, a literal string constant. Now, that is the formation of strings. A literal string constant is a sequence of zero or more characters enclosed in double quote marks. Now, this is something that is familiar to you. Observe that all the error messages or guidance messages that you write for collecting input from the user have been written as strings like this.
You agree? Double quote, something-something, double quote. That is what you see in all your cout statements. They are nothing but string literals or string constants. So, a string constant is written as a double quote, then a series of characters including whatever characters you want, then a double quote. If it is a visible character, you type the symbol, like blank, e, v, etc. To insert a special character, put a backslash and put the code for it. That character will come there. This is a double-quote delimited string. And a string consists of a series of characters. So, here are some examples. "We are even loonier than you think". It is one string. "Rust never sleeps\n". What does this backslash n mean? It means a new line character. "Prefix a \" in a string with a backslash". What does this mean? See, this double quote starts the string and the last double quote ends the string. However, if I want to insert a double quote itself inside the string, how do I do that? Because the moment I write a double quote, it will end the string. So, that is why I use this escape character, backslash. Backslash double quote means that the double quote itself is a character to be inserted in that place. It is not to be treated as the end of the string. Now, string is not a fundamental data type. So, it is all right whenever I am writing constants like this in my cout statements. But what if I want to handle strings in my program? I want to handle individual characters of the string. I want to know how long a string is, how many characters there are in that string. I want to know what the fifth character of that string is. There is no way I can do that unless I can represent the string as the value of some data structure inside my program. C++ provides for such values to be handled, because I might need to input values, output values, store them in memory, manipulate them. So, string values are handled in a very special manner. They are handled by using an array.
So, for example, I can declare variables of the char data type: I can say char p, q; I can say char letter = 'c'. Just as I declare int and float variables, and just as I can initialize int and float variables, I can initialize a char variable. Of course, the value that I should give is of character type: single quote, some character, single quote. Observe this array, char name[20]. The declarations define p as a single character, q also as a single character, and letter also as a single character value, and they define name as a 20-element array, each element of which will contain one char value. So, consequently, this array can contain a string containing at most 20 characters. However, there is a problem if we stuff the characters of our string into all the elements of an array. We will look at a program to read a word inside our program, reverse that word and print it out. But before that, we must consider the special manner in which C++ treats character arrays as strings. This treatment is implicit. It is not explicit, because as I said, string is not a basic data type in C++. Char is the basic data type. So, what does it do then? Strings are sequences of characters. So, we use arrays to represent strings. Consider two strings, say Vishwanathan and Anand. These are two separate strings, two separate names. Now, I might define two char arrays, n1 and n2. Let's say I define them as 100 elements each. They are all of type char. I can now put the letters, or individual symbols, of each of these two strings into the consecutive locations of these arrays, n1 and n2. So, I have shown an example of how I could fill up the array n2. So, I can say n2[0] is 'A', n2[1] is 'n', n2[2] is 'a', n2[3] is 'n', and n2[0] is 'd'. If I juxtapose all the letters stored in the individual elements of this array, I form the word Anand. And therefore, I can say I have stored the word Anand in the first five elements. I am sorry, there is a mistake here.
So, n2[0], n2[1], n2[2], n2[3] and n2[4] will contain the five letters of the word Anand. I could do the same thing and put Vishwanathan in n1. Now, the problem is that in my program I have declared n1 and n2 to be 100-element arrays. How do I know where the meaningful letters that I have put inside the array have ended? First of all, we know that if there is a 100-element array, each element contains some value. If it is of char type, it will contain some value in that byte. And if I am not careful, whatever that value is would get interpreted as an ASCII character. Suppose somebody using the memory before me has put some values in those memory locations which amount to the ASCII code of Z. Now, if I do not know exactly how many meaningful characters I have put inside the string, then I might interpret these two names as Vishwanathan Z, Z, Z, Z, Z, Z, Z or Anand Z, Z, Z, Z, Z, Z, Z. All of that will be meaningless. Consequently, I must have my own coding system to determine when a meaningful string ends. C++ defines such a coding system in terms of inserting a special symbol at the end of the string, which is the null character. Remember, I told you that the null character and the new line character are two symbols which we will keep coming across again and again. So consequently, if I have an array of char, then I can put the string characters into those array elements consecutively. Whenever I am done, I will put a '\0' as the last element. Consequently, to represent a five-character string, I will actually be using six elements: the first five containing meaningful symbols and the sixth containing '\0'. Now, it does not matter what my friend has written into the memory previously. Take the same example, lots of Z's in the memory locations. But when I write the name Anand inside that array, I will have A, n, a, n, d, '\0'. It is up to me to ensure that when I scan that array, the moment I come across '\0', I stop looking further.
I note that my string has ended. This is what I have to be careful about while writing my programs and handling character strings through arrays. There are a whole lot of functions pre-written in the C++ library which we can use. All those functions make exactly the same assumption: if you pass an array name as a parameter, the function assumes that the array contains a string in this sense of the word, that is, as many meaningful characters as you want followed by a '\0'. The moment there is a '\0', the function will assume that the string has ended. And so must you in writing your programs. So, in a nutshell, if we are forming strings within our program, it is our responsibility to ensure that a null character is inserted. Once we do that, we can check for '\0' in a string. We can determine when the string has ended. Then there is the question of input and output, over which we have no control. What we do inside our program, we can handle. It so happens that the statement that we have been using so far for input, cin, and the statement that we have been using so far for output, cout, can both handle words: they can read a word into a character array or print out a character array containing a word. So, this feature is implemented in the input and output operators used with cin and cout. Later on, we shall see the major functions for handling such input and output, namely scanf and printf. We shall do that after the mid-sem. But as far as we are concerned now, suppose we define an array n, let's say n is a char array, char n[100]. Whenever I say cin >> n and, during the execution, we type Anand, C++ will assign the characters to elements of n, including a '\0'. Now, how does it know where to end the input? Well, that depends upon how you give the input. First of all, you have to write Anand as a single word without any intermediate blank. cin, the input statement that we have been using, is extremely sensitive to blanks.
The moment it sees a blank, it presumes that the input for the current component has ended. That is why, when you read two numerical values into, say, x and y and you say cin >> x >> y, you can actually give two values on the same line separated by one or more blanks, say 25 and 35. And it actually assigns 25 to x and 35 to y. That is because it uses the blank as a separator of information. Exactly the same thing happens even when you are dealing with character strings. Later on, we shall see the built-in functions which will permit us to read even the blanks in the input line. But today we are not looking at that. Consequently, when an input statement of this kind is executed and n is an array, let's say an array of 100 elements, and I type in the word Anand followed by a blank or a new line character (any white space, in fact, will terminate that input), then it will put A in the 0th element, n in the first, a in the second, n in the third, d in the fourth and a '\0' in the fifth element. So the '\0' is inserted by cin. Similarly, suppose you said cout << n. Now n has 100 elements, but what the cout operator does is start printing out characters one by one from this array. The moment it encounters a '\0', it stops there. It does not output the '\0'. So input and output are handled, we know how to handle the storage, and we also know how to manipulate. After all, we can compare any element of this array with either a numerical value or a character value. We can add two values, we can subtract from something, because what we have is actually an integer. The fact that it is a one-byte integer should not worry us: as long as the values that we obtain by arithmetic operations are within the range of 0 to 255, they will always represent a valid character. So here is a program which will input a word and output its reverse. I am declaring char name as a 10-character array. There is cin >> n, and I give a guidance message with cout: type a word, no space in between.
So here, what I am doing is reading the number of characters in the word that you will give. Then I ask you to type a word, no space in between. I will read that word in, and for i = n, i > 0, i--, I will output the word in reverse fashion. Notice that what I am printing are individual elements of name. This is a particularly concocted, artificial example. Let us notice the inadequacies in this example. Can you spot the inadequacies? Suppose I type the word Anand; Anand has only 5 letters. So what will this program do? I have given n as 5, I have typed the word, a 5-lettered word. I will start looking at the nth element. What is the nth element? Yes, the nth element will be '\0'. Notice the imbroglio: this program is highly muddled. It is deliberately so; I want you to understand what exactly is going to happen here. Yes, first of all, what should I input? Let me say that I have to type in the word Anand and I want it to be printed in reverse. Now the word Anand, according to normal human counting, is A, n, a, n, d, 5 letters. So I will give n as 5. When I give n as 5, n is 5, and I am now reading this word. I have read the word with cin >> name. What I have mentioned here is the name of the array. As per our rule, cin will read the name A, n, a, n, d, which is followed by a space, but that space will not be read. It will store these 5 characters that I have typed in elements 0, 1, 2, 3, 4, and the next element will contain what? A '\0'. Which element will that be? Element number 5. What is the value of n? n is 5. I am now starting at i equal to n and I am outputting name[i - 1]. Do you see the reason why we are doing that? Because arrays start with the 0th element, we know that the nth element is expected to be a '\0'. So I don't print it. I start my count from n. I reduce the count i by 1 every time, but I stop.
I continue only as long as i is greater than 0. So when i is 1, I will execute this loop for the last time, and therefore I will be printing name[0] as the last value, which is the correct last character in reverse. And then, when i actually becomes equal to 0, this loop terminates, I print an end of line and come out. So apparently this program is running well. But does this program make eminent sense to us? After all, when I want a word to be printed in reverse order, is it proper for you to ask me to first count all the letters in the word, then input that number, and only then type the word? That is one problem. Second, what happens if I say n is 5 but instead of typing Anand I type Vishwanathan? Now I have typed Vishwanathan. The word Vishwanathan has been read into this array. This array is not long enough to hold Vishwanathan, is it? So what will happen? First of all, there will be no '\0' inserted inside the array, because there is no space for that in the array. So what does C++ do in such cases? That is the question I have asked: what happens if the typed word has more than 9 characters? C++ actually takes the 10th character and puts it in the 10th position. It takes the 11th character and puts it in the hypothetical 11th position, namely the next consecutive location of the memory. Suppose that next consecutive location of the memory was, for example, the first byte of a floating point variable y in your program. y is permanently chewed up now. Not only that, it will also put that '\0', the last artificial character it generates, in some hypothetical location. This array will not contain any '\0'; in fact, this array will not even contain all the meaningful characters. Now I start printing backwards. The value of n is what? 5. I made the mistake. I gave you the value 5, but I typed Vishwanathan instead of Anand. Now what will it print? Can you guess? It doesn't matter where it puts this and where it puts the '\0', etc.
but our loop will start at this a in element number 5, which incidentally is not a backslash 0; that does not matter, because we are explicitly counting and printing. So what we will get printed out will be w, h, s, i, V. Agreed? This is not the reverse of the word that I gave. So this is an example of good intention but bad programming. So how could we write our program? In order to write a proper program, we first of all notice that it is stupid for my program to ask you to type in the number of letters that a word has. Why should I do that? I have a 100-character array. Let you type whatever word you want to type. So I will first write a program which finds out the length of the word that you have typed. How can I find that out? Here is an example. I ask you to enter a string without spaces and I read it into an array called str1, which has 100 elements. Hopefully you will stop typing your word when you reach the end of the line. A meaningful word is not likely to be that long. That is the assumption. Anyway, nothing prevents me from using a 1000-element array, but whatever. So what will happen now? The input will be read here as a word and it will occupy as many bytes as the characters typed without a space, and whenever I put a space or a carriage return it will insert a backslash 0 at the end. I know that. What I don't know is in which position it has inserted the backslash 0. This is where I can use the looping structure to determine that. I start with i equal to 0, at the first position, because in order to cheat me you might have just pressed return immediately without entering any character. That means you are actually entering a null string. So I will catch you on that also. What do I do? I start with i equal to 0 and I test: while str1[i] is not equal to backslash 0, keep looping. So what am I doing? I am simply incrementing i. I am going from the 0th element to the first element, the second element and so on. I am not interested in what you are typing.
I am interested in locating where your string has ended, and I know that a backslash 0 will end the string. The moment I locate that backslash 0, I know that up to this point was my actual word and the backslash 0 is the end of the string. Now what is "up to this point"? When will I come out of this iteration? I will come out when i is such that the ith element is actually the backslash 0. So what is the length of the string? i minus 1? Is that correct? You are falling for mistakes. Why? Take this example. I have typed Vishwanathan. This has been taken as input, and now I am examining it. My i is equal to 0 here. This is 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, and what is i when I get to the backslash 0? 12. So it is at position 12 that I notice the backslash 0. According to that program I will announce the length to be equal to 11. How many characters does Vishwanathan have? 12. So what did we forget? We subtracted one from i, naturally thinking that if this is i then one less should be the length. One less is actually the position of the last character; we forgot that array elements start with 0. Therefore the length should not be 11, the length should be 12. So the length should be i. Got the point? So now we go back to the reversal issue. We want to reverse a character string that we have read. We have found the length of the string. Now, having found the length, is it not possible for me to reverse the string? Here is a program which does that, but it does something else as well. It actually reverses the string inside the memory itself. The word has been read into str1. Now in str1 itself it will look at whatever word you have given and make it ulta, that is, reversed. Let's see how it does that. First of all, to make sure that we don't have such goof-ups in calculating the length as we did just some time ago, we can use a library function which is provided by C++. In order to use that library function... so, here is the program, which reverses using another approach.
Notice an additional include statement at the beginning. This is apart from your iostream. I say #include <cstring>. Of course I still have to have using namespace std and so on; iostream and all those statements remain, but this is an additional #include that you have to add. What does it do? It tells the gcc or C++ compiler: look, I am going to use a function which is defined in the cstring library, so please include that as well. And what is the function I am using? This function is called strlen, which is short for string length. If you give an array containing a character string as a parameter to this function, it gets you the length of the string, and it gives you the length correctly. So we need not have that goofy program. By the way, we could have written the corrected program and calculated the length ourselves; that is not the issue, this is just another way of determining it. So observe what we are doing. We ask people to enter a string, and they enter it into str1. Let's assume that Vishwanathan is entered once again, and cin will therefore give me the backslash 0 at the end. Now instead of me looking for the backslash 0, I ask the function strlen to do that. It does precisely what my earlier program does, no different. But instead of writing that much code I can make a function reference. That is one of the objectives of writing functions: instead of writing large code you just make a function reference. Observe that I am saying length is strlen(str1) minus 1. Will that be correct? It depends; that is the correct answer. It depends upon what is returned by the function strlen. Going back to this: if I have typed Vishwanathan, then I know that the array will contain these symbols in elements 0, 1, 2, 3, 4, 5, up to 11, and element number 12 will contain the backslash 0. So the correct answer is: depending upon what strlen returns, the use of this expression will be right or wrong.
So if strlen returns, say, 12, then the length is actually 12. If strlen for some reason returns 13, then I have to subtract 1. If strlen returns 11 for some reason, then I have gone completely bonkers; I will get wrong answers. This is actually a sample program from next week's lab. I had asked my TAs to make some deliberate errors. You can clearly see that this is likely to be a deliberate error, and you are going to examine and correct it yourself. But for the time being we assume that the length is calculated correctly. So let's now concentrate on this part of the program here. Let's assume that I have calculated the length correctly. What is the length at the moment? It is 12. All that I do is start with i equal to 0 and keep going as long as i is less than length divided by 2. And what do I do? I swap the two corner elements first, then the two elements next to them. Look at what I am swapping. I am taking the ith element and swapping it with the (length minus i)th element. So if the length was 12 and I start with 0, in my first iteration I will swap the 0th and the 12th element. Then I change i to i plus 1, so in the next iteration I will swap the 1st and the 11th element. Then I will swap the 2nd and the 10th element, the 3rd and the 9th element, and so on. Cross-check whether this logic will work correctly. What happens, for example, if I have an odd number of characters? No problem. The central character remains where it is; I need not swap it with itself. I also have to check whether this works correctly for an even number of characters, and you can confirm that it will. So do you now see how easy it is to handle manipulation of character strings? You can read any kind of text and do whatever you want to do with it. Print it. You can have two strings read into two different arrays. So you can read, for example, Deepak Phatak, which has two names, and print it as Phatak Deepak. You can do absolutely anything that you want to do with strings. You have to just remember two things.
String is not a basic data type in C++. It is not. It is a concocted data type, an artificial data type. However, in C++ string is also a special class, an abstract data type defined by the class libraries, which we shall see later, after the mid-sem. As of now we understand that char is a basic data type. An array of char can be used to represent strings, and we artificially put a backslash 0 at the end of the string; with that we can handle strings any which way we want. To help us, there are a whole lot of string functions. Just as I told you to read up the ASCII code table from the appropriate appendix in a textbook, you can also read up in any textbook the library functions of cstring which are given there. That is where you will find out what all those functions do and what strlen returns. And of course, any time you have a confusion, you can write a small C++ program, call that function with some known string and find out exactly what that function does. Here is a small tip on debugging. I think I had mentioned to you that from now onwards your programs will become larger and more complex, and when you write complex logic in large programs, sometimes you are not able to find the logical error when the program is not giving the desired output.
So what do you do? Suppose you have a 100-line program with lots of nested ifs and whatever. You will actually put some print statements in between, a cout here, a cout there, just to check whether the logic is being followed correctly as the program proceeds. Such statements are called debugging statements; they help us to identify errors. Debugging, because an error is called a bug and removal of that bug is called debugging. So in an attempt to remove that error, you want some additional information as the program executes. One way is to hand-execute the algorithm; that you can do for a small program, but for large programs you can't. Now imagine that you have written a 100-line program, and in order to debug it properly you were required to insert 20 cout statements in between. Finally you figured out what the mistake was, you corrected your if condition or whatever, and now your program is working fine. But if you now want to give that program to someone else, would you like to give it along with those 20 debugging statements? So what will you do? You will remove them, one statement at a time, because when your friend runs that program you don't want it to print intermediate debugging output at all. You want the program to take the data and print the final output. In fact, even if you yourself want to execute that program again and again for different data, you don't want those outputs to come again and again. You therefore want a facility by which you can write such cout statements, which are not the required printing statements of your final program but which you would like to include while you write a long program; and somehow you want to tell the compiler system: look, I am now done with this, run my program but don't print output from these statements. In short, you want to distinguish between the cout statements which you have written for debugging and
those cout statements which are legitimate output statements of your program. How would you do that? For that you have the debugging facilities which all compilers provide; this is the facility provided by the C++ or GCC system of compilers. As I said, ordinarily once you correct these errors you would like to remove such debugging instructions. So what you do is use a very special kind of if and endif. This is a special block. It is written by writing, on a fresh line, the hash symbol, if, a blank, and a name of your choice; I have used xxxx here. Any statement that you write here, not only a cout statement but any statement, is within this special if block, which is ended by hash endif. Why do we write such blocks? Well, now I suppose you can suspect what would happen. There must be a mechanism for me to tell the compiler: if I am debugging, compile these statements as if I have written them as part of my program, but when I tell you my debugging is over, ignore all the statements within this block. Would that not serve your purpose? xxxx is any name tag of your choice. I prefer a name tag called NOTSURE, because I am not sure what is happening in my program. You can use any name; another popular one is DEBUG: #if DEBUG, meaning if I am debugging. So I put a block like this. For example I say #if NOTSURE, then cout of i, a blank, j, and sum[i][j]. We have not yet seen a two-dimensional array, but this is a representation of a two-dimensional array element, and I am not sure whether the sum is being calculated correctly or not, so I can print it. I can have any cout statement. This statement will ordinarily be ignored: if I put #if NOTSURE and #endif, anything that I write in between will be ignored. However, if I compile my program by using the command c++ -DNOTSURE proc.cpp, that is, minus capital D followed by NOTSURE, then the compiler will assume that this #if and
#endif do not exist; it will simply take these statements in the same logical order in which they are written in your program and compile all of them. Consequently, if you compile with the -D option, this output will actually be produced as and when these statements come in the given sequence, but if you don't use the -D option, these statements will be ignored. So you have got a very easy way now; in fact, in larger programs this is the sure-shot way of debugging. There is a still better, more complex way, which is to use a debugger. A debugger is a tool. What it does is permit you to take control of the program execution, and through that tool you can tell the program to execute so many iterations, show you the value every time the value of a variable changes, and so on. That more complex thing we shall be studying after the mid-sem, but you can use this for debugging your programs.
OK, so now let me come to the conclusion of this lecture. Meanwhile let us look at this, and let me tell you the objective of doing it. You see, when I said that I won't be conducting the quiz on Wednesday, I was very sure of using clickers for collecting quiz answers from you. However, I forgot that a quiz conducted normally, on paper with pencil, serves another purpose: it acts as a dress rehearsal for the mid-sem exam. You generally get to know the kind of questions that are asked, you get to know how much time you take to answer a question, and you get feedback from the quiz evaluation. So you generally know how much work you can do in one hour or two hours in an exam kind of situation. Unfortunately, by our not conducting that quiz you are deprived of such a dress rehearsal. So I have thought of this novel way of conducting a take-home exam. Obviously I can't give you marks for this take-home exam, but I want you to benefit from this exercise exactly the way you would have benefited by appearing for an exam. So here are the rules. A
sample exam paper is being issued; you will be collecting that paper now. The expected time to be spent on this paper is two hours. It is exactly like a mid-sem exam, so this is a dress rehearsal for the mid-sem exam. Now these are the rules. You are required to solve this paper on your own before Sunday 6 p.m. So consider that the Wednesday morning quiz which we cancelled is now being conducted for two hours of your choice. You can choose any time: midnight, evening, morning, whatever. What is required is that you do it as if you are giving a paper. So you write your roll number, name and lab batch in the top right corner. Note the start time, either on the paper or somewhere else, and when you finish, note the end time and write the duration. Since you are not doing it under supervised circumstances, suppose a friend suddenly rings up and you get out and talk to the friend. Just note down the time when you leave the paper. Say you spend 10 minutes, 15 minutes, half an hour, one hour; you come back and start the clock again. The idea is that this is not for me or for the sake of evaluation; the idea is that you get to know exactly what you can do in two hours. Now here is an important point. I believe this is extremely important, so please listen. This is an open book exam, and not just an open book exam: it is an open paper exam, because the paper is being given to you right now. So there is absolutely no problem if you study, from the textbook or from the slides or from wherever, the concepts which are required here, because the objective is actually that you gain confidence in correctly answering the questions. I would ideally like each one of you to score 100 out of 100 marks, but these 100 marks must be scored by your own attempt within the timed period. When you start writing the paper you don't consult anybody. You can use books, you can use your notes, but when you do that, do it properly. My suggestion is that beforehand you glance
through the question paper just to understand what the general type of the paper is. Then you do whatever study you want to do, as if you are preparing for the exam, and then sit down for two hours and try to write the answers. This will be extremely beneficial. After you do that, exchange papers with a colleague and get yours evaluated. So it is a self-evaluation, but not of your own paper; you can evaluate your own if there is no colleague around, but ideally you should exchange it with someone from your own lab batch, and I will tell you the purpose. I will be putting up the model answers for this paper by Sunday evening 6 o'clock, because Sunday evening 6 o'clock is the time by which you must finish writing. Why Sunday evening? Because the other batch, the slot 11 people, will finish the second lecture only on Saturday afternoon, since they did not have a class on Tuesday. They will probably start only after Saturday evening, and therefore I have to give them some time. Next, you get that paper back and submit it to your TA, but before doing so, discuss the paper and your answers with your colleagues in the lab batch. This is also extremely important. This will span the entire next week. As you know, your seating arrangement has now been changed. In every lab that you go to, you will be sitting with the colleagues with whom you will be working over the next eight weeks of this course. They form a lab batch, which is also the batch for your project. We will be forming smaller teams within that group, and I will tell you how to form the teams later. It is absolutely important that you know each and every person of that group of 12 or 13 people, and your lab TA, who will now be associated with you as your teaching assistant for the rest of the course. So get that paper and discuss it, just to understand whether there is still some haziness in your mind, but do submit that paper. As I said, this is for the record. Even if somebody gets only two marks out of 40, and I will announce the marking scheme later, it does not
matter; it will not affect your performance in any way. However, if an honest attempt is made, and if the TA and I are convinced that the attempt is honest, then for every honest attempt a bonus of one mark will be given. Ideally every one of you should be able to earn that bonus mark. It will be added to whatever total marks you score. Since the grades in this large course are often decided by a mark difference of 0.3 or 0.4, that one mark could change your grade, or it could elevate you from A to AP. It is all up to you. I must be convinced, however, that the effort is genuine. A genuine effort could of course also include this: I write my name and roll number, I sit for two hours, I do not do anything and I give in a blank paper. That is also a genuine effort. I would be extremely upset if such genuine efforts are submitted. The objective is that you actually try to do something useful and submit it. So is that agreeable? This is a novel experiment. I have never tried it earlier in an undergraduate course. I have tried it many times in my postgraduate course, but I figure that there is absolutely no difference; after all, they are people, you are people, and you should be able to do that. So do you think this is a worthwhile exercise?
Fine. So on your way out, please collect a copy of the paper; it is kept on either side, as usual. This is what I told you. I will be putting this up on Moodle this evening, after the lecture for the slot 11 people. But there is one very serious request I wanted to make. I have been visiting the labs myself in the last few weeks, and what I find is that people mostly tend to download the sample programs, compile them mechanically, run them with some values and announce that they are done. The objective of the lab is not that at all. The objective of the lab is that you actually study those programs, you check whether any variations are possible, you check how you would test those programs and what inputs you would like to give, and you contemplate the effectiveness and efficiency of each program before you go further. Obviously the time is short, and since we have submissions of the assignments, what I have noticed is that every person is eagerly waiting for the assignment, to get it, submit it and be done with it. Then there are some people who have problems in navigating through the directory structure, copying files appropriately, making tar files and so on. It is therefore essential that you spend some non-trivial time in preparing for the lab before coming to it. Believe me, otherwise you will not benefit. Consequently, what we have decided is that the lab for the next week will be uploaded in advance; the handout and all the programs will be uploaded. I am saying Sunday 6th, 1800 hours, because some of the sample problems give hints as to how to answer the sample exam paper that I am giving. That is why the time limit for you to finish the exam is 6 o'clock Sunday evening: from 6 o'clock onwards the handout for the next lab will be available along with the sample problems. Now I want you, particularly those who have labs on Monday, to study those programs on Sunday. This time we have deliberately introduced some errors, as I showed you in one case; that is one of the sample programs. So you
will have to study each program and find out, logically, whether it is solving the problem or not, and syntactically, whether it is correct or not. Then you should decide how you will test those programs, what extreme values you will give, and what you will note as results in your lab diary. Please do this to benefit maximally from the lab. OK, so that is all I wanted to share with you.