So, to begin with, if you have seen those lectures, then you know that C++ permits handling of strings in two ways. One way is the traditional way, in which strings were first introduced in the C programming language and which continues as is in C++ through a special library called the C string library. The other method is to treat strings as objects. Now, we have not yet studied object-oriented programming concepts. When we do that, we shall see how strings are handled as objects in C++. Currently, we are going to look at strings as they are handled in the traditional way, namely as an array of type char in which you store the string. Now, a string can have any number of characters; it depends on how many characters you stuff into it: 5, 10, 20, 100, whatever. So, how does your program know where a character string has ended? For this the C programming language introduced a funny convention which continues to date: you put a special symbol at the end of the character string, a symbol which is unlikely to occur in a regular character string. A regular character string is usually a string of printable characters, stored as ASCII codes. The special character chosen was backslash 0, also called the null character or the zero character. Consequently, when you stuff characters into an array, you put a backslash 0 where the string ends. All C++ functions which handle character strings, and the programs that you write, have to take cognizance of this backslash 0: whenever a backslash 0 is encountered, your string has ended. The size of the array should obviously leave an extra space for this backslash 0. So, if I declare an array called word as char word[10], what is the longest word that you can actually store in that array? Nine letters, because the 10th position must be kept for the backslash 0.
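The nine-letter limit can be made concrete with a small sketch. The names kWordSize and storeWord are hypothetical, chosen only for illustration; the function copies at most nine letters into a ten-element char array and always writes the backslash 0 itself.

```cpp
#include <cstddef>

// Hypothetical size matching the lecture's "char word[10]".
constexpr std::size_t kWordSize = 10;

// Copies letters from source into word, never more than kWordSize - 1 of them,
// so that one slot always remains for the terminating '\0'.
// Returns the number of letters actually stored.
std::size_t storeWord(const char source[], char (&word)[kWordSize]) {
    std::size_t i = 0;
    while (i < kWordSize - 1 && source[i] != '\0') {
        word[i] = source[i];
        ++i;
    }
    word[i] = '\0';  // the string ends here, whatever lies beyond
    return i;
}
```

Calling storeWord with a twelve-letter source stores only the first nine letters and puts the backslash 0 at index 9; a three-letter source puts it at index 3.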
Of course, the actual string may be shorter, in which case that will be known because the character immediately after it will be a backslash 0. Now, how do you handle input and output of character strings? First, let us consider character strings which are a contiguous set of non-blank characters, for example a name which does not have a blank in between. If you have to input such strings, they can be handled by your normal cin operator. Just as you say cin and a variable, instead of a variable you can give the name of an array which is a char type array. cin automatically understands that a string is expected. So, when you type characters on your keyboard, contiguous characters without any blank, the moment the first blank appears or the moment you hit return, that is, give a new line character, that string goes in as input and gets stored in the designated array. This happens automatically, and cin does one more favor to you. When the input string ends, it automatically puts a backslash 0 as an additional character after the last character in that array, so that it is a well-formed string. Of course, it is possible that we declare a char array in our program and do not input any value into it, but ourselves stuff characters into that array to form a string. If we do that, the responsibility of putting a backslash 0 at the end of those characters is ours. So, we have to take care that a backslash 0 is put at the end. Here is a program which will determine the length of a string. This is the standard declaration. Observe that a character array word is defined with 6 elements. This is an arbitrary length. I have defined a variable wordlength with an initial value 0. i is the index that we typically use for for loops. I request an input string. Notice the cin statement, cin >> word. It is a valid statement. Although I have given an array name, it is a char type array, so cin knows that it has to read a character string.
It will read that string, put it in word and put a backslash 0 after the characters that have been input. So, having read this input string, I can simply say for i equal to 0, i less than 6, i plus plus. It is the standard counting mechanism to scan all elements of an array, as you are familiar with. If the ith element of word is backslash 0, it means the string has ended, and I will break out of the loop. If not, I will continue with the loop. When I come out, whatever is the value of i will be the length of the string. For example, if I have input a 4 character string, then 0, 1, 2, 3 will be the positions where these 4 characters are stored. Position 4 will store the backslash 0. And when the backslash 0 is encountered, i will be 4. I will come out here with i equal to 4, which is the length of this string. Is this clear? There is also a built-in function in C++ which does exactly this and gives you the length of a string. It is called strlen. We shall see its use shortly. So, here is a recap quiz. This is the program the quiz is based on. Since you will need to look at this program when you are answering the quiz, it is preferable that you note down this program in your notebooks. Has everybody taken down this program? It is a simple program. It declares a char array str[10]. It reads an input into str. And then it simply prints the element of that array at index 9, as simple as that. As long as you remember the number 9, you are okay. All right. So, here is the first quiz. I type the input as a b c d e f and press return or press a blank, and that value goes inside as the value of this string. What is the value of str[9]? Is it a blank space? Is it the character f? Is it the character backslash 0? Or is it that you cannot determine? Simple question. All have answered this? The second question is merely a variation. The input string that I now type is a b c d e f g h i j k l, and then I press either blanks or return.
So, this essentially is the string which I have input. The same program is executed. The value of str[9] now is either i, or j, or backslash 0, or cannot be determined. All right. So, are you done? All collected? Thank you. Now we will go back and have a little discussion. These are of course simple problems, but they convey an important principle. So, first let us consider quiz 1. Recall that the program was very simple. An array of 10 elements was declared. Some string was given as input. And you were asked to tell us what you think will be the element at index 9, str[9]. So, what is the answer for quiz 1? A? Let us do things by elimination. Why is A not correct? Suppose I had typed here a b c d e f blank blank blank blank blank blank. Would the answer still be D? Please note we are using the cin operator. The moment cin encounters a blank, that blank is a signal to cin that the value has ended. So, whether I give 1 blank or 20 blanks, it will ignore all of them. It will take only the consecutive non-blank characters as the string. That means the string which it has picked up is a, b, c, d, e, f. There are exactly 6 characters in that string, so elements 0, 1, 2, 3, 4, 5 will accommodate these characters. cin will put a backslash 0 in the next position. But we are asking you what is the value in position 9. It clearly cannot be a blank. It clearly cannot be f. It clearly cannot be backslash 0, because cin does not put a backslash 0 in all the remaining elements. It puts just one backslash 0 after the string. So, what does it put in the remaining elements? The answer is nothing. cin does not put any value there. What it means is that the value inside those locations will be some arbitrary value, depending upon what was present in those locations earlier, before your program was executed, unless the compiler specifically initializes all such locations to a specific value, which C++ implementations generally don't. So, therefore, the answer is: cannot be determined.
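The quiz 1 reasoning can be checked with a small experiment, sketched here under some assumptions. An std::istringstream stands in for the keyboard, and the array is deliberately pre-filled with '#' so that the untouched elements are visible; in the quiz program nothing pre-fills the array, so str[9] really is arbitrary. The names kSize, readWordInto and lengthByScanning are hypothetical.

```cpp
#include <cstddef>
#include <sstream>
#include <string>

constexpr std::size_t kSize = 10;  // same size as the quiz array str[10]

// Pre-fills str with '#' (an assumption made only to expose untouched slots),
// then reads one blank-delimited word, the way "cin >> str" would.
void readWordInto(const std::string& input, char (&str)[kSize]) {
    for (std::size_t i = 0; i < kSize; ++i) {
        str[i] = '#';
    }
    std::istringstream in(input);
    in >> str;  // stops at the first blank and appends a single '\0'
}

// The length loop from the lecture: scan until '\0' is found.
int lengthByScanning(const char (&word)[kSize]) {
    int i;
    for (i = 0; i < static_cast<int>(kSize); i++) {
        if (word[i] == '\0') {
            break;
        }
    }
    return i;
}
```

With the input a b c d e f followed by several blanks, the scan returns 6, str[6] holds the backslash 0, and str[9] still holds the '#' that was there before the read.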
What is the answer to the next question? I hear A, D, B, not C. Why B? The array has only 10 locations, 10 elements. By definition we expect a backslash 0 to be inserted by cin at the end of a string. The string which I have given is a, b, c, d, e, f, g, h, i, j, k, l. First of all, let us try to understand what cin does. Any guess what cin will actually do? Forget what will be there in the element at index 9. What will cin do? Will it take only the first 10 characters? Will it take only 9 characters and put a backslash 0? It will take all the characters. And where will it put those characters? The array str is declared to have only 10 positions. So, where will it keep the additional characters? This is the array, right? Please remember that C++ does not check array bounds. It is your responsibility to ensure that what you put in an array is within the array bounds. So, if cin gets an input string which has 12 characters, it will actually store all of a, b, c, d, e, f, g, h, i, j, k, l. Now, where are the last locations? Do they belong to the array str? No. Please understand that when I say cin >> str, the array is internally handled through its name, which is actually a pointer. We have not yet studied pointers. Logically, what C++ does is put those characters one by one into consecutive locations. After the last location of the array there is a memory location which might have been allocated to some other variable of your own program. For example, if I were to say int m, then the next 4 bytes might be allocated to m. But the cin statement does not care one hoot. It will take the ASCII value of k and put it in this byte. It will take the ASCII value of l and put it in this byte. And as an added attraction, it will put a backslash 0 in the next byte, because that is the rule.
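The lecture leaves the bounds check to the programmer. One standard way to do it, which is not part of the lecture's program and is shown here only as a hedged sketch, is to set a field width with std::setw before the extraction; the stream then stores at most width minus 1 characters followed by the backslash 0. The names kStrSize and readBounded are hypothetical, and std::istringstream again stands in for the keyboard.

```cpp
#include <cstddef>
#include <iomanip>
#include <sstream>
#include <string>

constexpr std::size_t kStrSize = 10;  // same size as the quiz array

// Reads one word but never writes past the end of str:
// with a width of 10, at most 9 characters are stored, then '\0'.
void readBounded(const std::string& input, char (&str)[kStrSize]) {
    std::istringstream in(input);
    in >> std::setw(static_cast<int>(kStrSize)) >> str;
}
```

With the twelve-character input from quiz 2, only a through i are stored and index 9 receives the backslash 0; the remaining characters stay in the input stream instead of spilling into neighboring memory.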
So, what gets chewed up is any value of m that you might have stored there. If this was the last declaration in your program, then it might end up corrupting a translated machine instruction of your program. This was not uncommon in the early days, when the memory of different people's programs was not protected from each other. So, if you are sitting at the neighboring terminal and I am sitting here and I want to chew up your program, I will simply declare an array a[100] and I will say a[-1000] equal to some trash value. Now, that index minus 1000 does not exist in my memory, but it will go and corrupt a memory location in your program. Nowadays this does not happen; the operating system generally does not permit a program to corrupt any memory locations outside that program. But what can happen within your own program can still be extremely funny. The moral of the story is that it is your responsibility to ensure that once you declare an array, you do not operate on elements which are not a declared part of that array. It is your responsibility. C++ does not take care of it. Is that clear? Consequently, in this particular location, str[9], the character j will appear. And of course, if you now say cout << str, what do you expect to happen? The action will be indeterminate, because str is not a well-formed string. But if the output instruction looks at all consecutive locations till it finds a backslash 0, it might actually print the larger string. In any case, the behavior is not predictable, because it is against the C++ norms. Is that very clear? That means you and you alone, and your program, are responsible for maintaining the sanctity of array bounds. Here is a practice problem. Two character arrays are defined. One is called word1, the other is called word2. We have read word1, and you have to construct word2 as the reverse of word1. A very simple problem, actually. word1 could be 5, 10, 7 characters.
This time you assume that the character string which has been input is a set of contiguous characters which is well formed and less than or equal to 29 characters in length. You assume that. All that you need to do is create a new string in word2. Please note that you have to take care that a backslash 0 is properly inserted, because word2 will have nothing to begin with. word1 has been read into by the input object. Very simple problem. This should not take more than 3 minutes for you to write. So, write. We will discuss a variant of this problem, but let me try to solve this together with all those who are still struggling. So, look at the solution that is being worked out. We will work this out interactively. Those who have solved the problem may merely cross-check whether what is coming up agrees with what they have done. All others may help me solve the problem. The way I am solving the problem is as follows. I have declared an integer variable i and an integer variable l, which is the length of the string. I have used the string length function which I just mentioned: strlen, and in brackets the name of the character array. It automatically returns the length of the string. So, if there are 5 characters in the string, they will be in positions 0, 1, 2, 3, 4, position 5 will hold the backslash 0, and this function will return 5 as the length. Now, all that I need to do is take whatever word I have been given. For example, if the word was hello, I want to create a new word which is o-l-l-e-h; that is what the reversal is. So what does it mean? The 0th character goes to position l minus 1, the character at index 1 goes to l minus 2, and so on. So, if this is the string, I would like the new string to be formed like this. The standard procedure while handling any array is to scan the array by using a for loop. Go over every element of that array.
For that purpose I have set up this for loop here: for i equal to 0, i less than l, i plus plus. Now, if I just look at the ith element of word1, I will actually be looking at h-e-l-l-o in that sequence. All that I need to do is assign this element to an appropriate element in word2. So, what should be the index of word2 so that I get the reversal? Yes? l minus i? No, l minus i will not be correct. Please note the positions here: 0, 1, 2, 3, 4, and l is equal to 5. Now, you want the 0th character h to go to position 4 of the new string. l is 5, minus 0 is 5, minus 1 is what? 4. So the index is l minus i minus 1: the 0th character will go to position 4, the first character will go to position 3, and so on. Do I have word2 now? If I just say cout << word2, will I get the reversed word printed? No. Anybody, why? That is because word2 is not a well-formed string yet. I have transferred all the valid characters of word1 in reverse order into the array word2. However, I have not put a backslash 0. So, what should I do after this loop? Is this understood? Because I am forming the string myself; it is not being read by an input statement. When I am forming the string, I can of course put any number of characters anywhere, but it is my responsibility to put a backslash 0 at the end. So, therefore, I need to do this: put a backslash 0 at index l of word2. Is this clear? Actually a simple problem. It illustrates some basic discipline that you need to follow whenever you are scanning an array. A simple counting loop which says for i equal to 0, i less than l, i plus plus is adequate. But the way you transfer things is the jugglery that you do with the index of word2, and as long as you correctly calculate that index, things will work out well. Now, here is another practice problem. In this practice problem I do not have two arrays, word1 and word2. I have only one array, word. I have as usual, let's say, declared int i, l. I might want some more variables later.
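The two-array reversal worked out above can be sketched as a small function. The function name reverseInto is hypothetical; the loop, the index l - i - 1 and the final backslash 0 follow the discussion. The sketch assumes word2 has room for the letters of word1 plus one more element.

```cpp
#include <cstring>

// Builds the reverse of word1 in word2.
// Assumes word2 can hold strlen(word1) + 1 characters.
void reverseInto(const char word1[], char word2[]) {
    int l = static_cast<int>(std::strlen(word1));
    for (int i = 0; i < l; i++) {
        word2[l - i - 1] = word1[i];  // character 0 goes to position l - 1, and so on
    }
    word2[l] = '\0';  // we formed the string ourselves, so we must terminate it
}
```

For the word hello, l is 5, h lands at index 4, and the backslash 0 is written at index 5, so word2 prints as olleh.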
And I say, now I want you to do exactly the same thing: reverse the characters. But the reversed characters should form a word in the array word itself. You don't have another array. This is called in situ processing or in-place processing. You have a word, and the same array word should now contain the reversed string. You can immediately guess that the same logic that you used before will not work. Take five minutes to solve this problem. But I would like you to really think hard. Check your solution with sample data. I hope the problem is clear. You are given a word and you have to reverse that word in the same place, the same location. Well, you can add variables here. So, you have calculated the string length. All right. I think everybody has figured out the solution. So, please help me construct the correct solution here. The logic for solving this problem is very simple. I have an array of characters. Let us say this is the 0th element. This is the element at l minus 1. There is a backslash 0 somewhere here; we'll forget about it for the time being. Let's assume that this is the midpoint, which is, let us say, l by 2. Now, all that I need to do is swap the 0th character with the character at l minus 1. Then I swap the next character with the one before that, and so on. So, the easiest thing is to say I will scan this array from 0 to l by 2 only, and every time I look at one element, I will swap it with the corresponding element on the other side. Of course, if I have to swap, I need a temporary variable in which I temporarily hold the value. So, let us attempt to solve the problem by expanding my declarations to include another char, let's say c. Now I would like to define an iteration. This iteration will go over the array elements from 0 to l by 2 minus 1. And for every element I scan, there is a corresponding element on the other side with which I have to swap. If I have to swap the 0th element with something, I must first take the value of the 0th element into a temporary variable.
So I will write something like c equal to word[i]. Now, in this word[i] I want to stuff the corresponding element from the other side, which will be word at some index. And then I will have to say that word at that same index should be equal to c. Do you agree that this iteration will do the thing? Please note that I am swapping values from 0 to l by 2 minus 1. A whole lot depends upon what index I write here; the same index I have to write here. I am leaving the last position aside. The lth position originally contains a backslash 0. It will continue to contain a backslash 0 if I don't touch it. So, what should be the index that I should write here? Can you help me? Yes, l minus i minus 1. And here also I write the same thing, obviously. Will it work? Anybody who thinks this will not work? Next question. You are all convinced that this will work because you are thinking logically about the two extreme elements and their indexes, and you think that will work. Should you not test your program with some test data? So, which value would you like to choose to test your program? Why odd? Why not even? That is, by the way, a good lesson. Never test a program with only one test value, because it may so happen that your logic works for that particular kind of test value but does not work for other kinds. The only way the logic could go awry here is if it works for an odd number of characters in a string but does not work for an even number, or vice versa. So the best way is to take two sample test cases, one with an odd number of characters and another with an even number of characters, and quickly hand-execute the program. If it works in both cases, it should always work. In this case, will it work? You sure? Okay. Pariche has written another program. He took my condition literally, where I said that there is no other array available and you have to do whatever you wish to do within the word itself. He took it so seriously that he said, I am not permitted to use even an independent variable.
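The in-place swap with a temporary char, as worked out above, can be sketched like this. The function name reverseInPlace is hypothetical; the loop bound l / 2 and the index l - i - 1 are the ones from the discussion.

```cpp
#include <cstring>

// Reverses word in place by swapping mirrored characters.
// The '\0' at index l is never touched, so the result stays well formed.
void reverseInPlace(char word[]) {
    int l = static_cast<int>(std::strlen(word));
    for (int i = 0; i < l / 2; i++) {
        char c = word[i];           // hold the left character temporarily
        word[i] = word[l - i - 1];  // bring in the mirrored character
        word[l - i - 1] = c;        // put the held character on the other side
    }
}
```

Hand-executing with one odd-length and one even-length word, as suggested, covers both cases: with 5 characters the loop runs for i = 0 and 1 and the middle character stays put; with 4 characters it runs for i = 0 and 1 and every character moves.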
Now the question is, how do you swap without an intermediate temporary variable to hold the character for some time? Here is his solution. The for loop is not very readable. I think he hopes that the C++ compiler will be able to understand what he has written. I at least cannot. But the logic is brilliant. He is saying that I need to do this reversal in situ, that is, in place, and I don't want to use anything else. He discovered that there is a spare char type element available which is not involved in any swapping, and that is the last location of this string, which holds the backslash 0. That is a C++ characteristic. So all that he does is use this last element as a temporary variable. Look at his swaps: w[l] is equal to w[i]; w[i] is equal to w[l minus i minus 1]; w[l minus 1 minus i] is equal to w[l]. So the lth element of w is his temporary variable c. Of course, the backslash 0 there will get obliterated. Fine. At the end he simply says w[l] is equal to backslash 0, which is what it should be. I thought it was an interesting solution, so I thought I would display it here. Do you think it will work correctly? Except that he is running the loop over the entire range, i equal to 0 to i less than l plus 1. Is that correct? So what will happen? It should tell us that seemingly simple things, seemingly innocuous things, could lead to logical problems. It is extremely important to be careful. And that is why, whenever you write a program, you may be convinced that your logic is correct, but it is best to test it with a few test cases so that you ensure that things work correctly. The next question is, how do we handle sentences? Sentences have many words. First of all, we do not know how to read a sentence, because we said that if we use cin, the moment a blank comes the value ends. Now, if I have a sentence which has many words, and suppose these words are separated by multiple blanks, then how do we read this entire sentence as a single string?
Now, there is a facility in C++ for this. There is a function called get string, or gets. The library function gets, with the char array that you have defined in brackets, will read a complete string, including blanks and any symbols, till you hit a new line character. So gets gets an input string of all ASCII characters till a new line character is hit. When you hit a new line character, it is considered the end of that string. C++ will store that string in the array that you have defined and will put a backslash 0 in place of the new line character which was read. So gets is a fairly powerful function. It effectively gets you a line of text. It reads all input characters up to the new line character. So, here is a program which counts words in a sentence. sentence is defined as a large array of 200 elements. The number of words is initialized to 0. The program asks you to enter a string, but instead of using cin you use gets(sentence). Please remember the fundamental difference. When you say gets, and in brackets an array, then the entire set of characters that you input up to the new line character, including blanks or any other characters, will be read in. The new line character in this case signals the end of input, and whatever characters have been read will be stored in this array sentence, and a backslash 0 will be inserted at the end. So all that you need to do, as usual, is find the length of this complete string, including blanks. You can actually use the strlen function also, by the way. The strlen function is not concerned with whether there are blank characters in between. It just looks for the backslash 0 and returns the length, exactly what is being done here. In short, what you do by this code is actually done by that function strlen. In any case, you find the length. Now, if you want to just print different words on different lines and count the words, just check whether this logic will work. I have the sentence array.
I simply scan the entire array element by element: i equal to 0, i less than length, i plus plus, the standard counting for loop. Now, I know that there are words given, and each word is separated by a blank from the next word. So all that I do is check whether I encounter a blank. I expect the first word to start immediately. The moment a blank comes, the word has ended. So, if the word has ended, I will increment the number of words and print a new line character, and then of course go back to the loop. On the other hand, if I do not encounter a blank, then I am scanning a character which is part of a word. I will simply output that character, the ith element of sentence. Do you agree, then, that whatever the number of words in the sentence, the characters constituting each word will be printed consecutively? A blank will cause the number of words to be increased. The next word will be printed character by character, and at the end I print the number of words and say that the given sentence contains so many words. Will this program work correctly? I have told you that at the beginning there is no blank. But at the end also there may not be any blank. I might have just pressed return. Remember, gets will get you all characters up to the new line. Suppose what I have typed is: hello world how are you. After the word you, instead of putting a blank, I press return. Now, what will happen in that case? The loop will be exited because i has reached the length. But the number of words will not be updated for the last word, because it is updated only if I find a blank. If there is no blank after the last word, the number of words will not be updated, and therefore I will get a count which is one less than the total count of words. Now, this is a simple problem. But what I want you to think about is a problem in which I have words which are separated from each other by multiple blanks. For example, something like this, and two more blanks at the end. This is one line. Now, this is a name.
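One way to avoid both pitfalls discussed above (a missing blank after the last word, and several blanks between words) is to count a word where it starts rather than where it ends. This is a variation on the lecture's loop, not the program shown in class, and the function name countWords is hypothetical.

```cpp
// Counts words in a '\0'-terminated sentence.
// A word is counted at its first non-blank character, so trailing blanks,
// leading blanks, repeated blanks, or no blank at the end all give the same count.
int countWords(const char sentence[]) {
    int numWords = 0;
    bool inWord = false;  // true while we are scanning inside a word
    for (int i = 0; sentence[i] != '\0'; i++) {
        if (sentence[i] == ' ') {
            inWord = false;   // a blank ends the current word, if any
        } else if (!inWord) {
            inWord = true;    // first character of a new word
            numWords++;
        }
    }
    return numWords;
}
```

The sentence itself has to be read as a whole line first. Note that gets was removed from the C++ standard library in C++14; a bounded alternative is std::cin.getline(sentence, 200), which reads up to the new line character and stores the terminating backslash 0.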
The name typically has a first name and a second name. So let us say I have fname as an array of 50 elements and a last name array, sname, also of 50 elements. They are declared as char arrays. I have as usual my sentence here, defined as, let's say, 200 elements or whatever. Now, my job is this: I have read this string into my sentence, and I want to separate out the first name into this array and the last name into this one. Another line may contain just one name. For example, another friend of mine uses only one name. He does not have a last name. So there will be no second name on this line. In that case I would expect your program to take the first name into the array fname, notice that there is no second name, and print a message saying that this line or this sentence contains only one name. Your program should also look at one more possibility. I am running your program, but I am cheating. I am typing a few blanks and pressing a new line character. What will go into your sentence? Five blanks. Now, what is the first name? None. What is the second name? None. If such a thing happens, your program is supposed to tell me, please type a valid name, you idiot, or any better English that you may think of. The point is that in real life you will get to handle strings which are prescribed to contain something but may contain variations, and you should be able to handle this. Now, one way of handling this is to simply work as follows. Skip blanks: initially I skip all these blanks. Then assemble fname. What do I mean by assembling? When I finish looking at the blanks and I see the first non-blank character, I know that the first name has started. Now I start putting these characters one by one into the array fname. The moment I come across the first blank after these consecutive characters, I know fname has ended. I will put a backslash 0 there, and now start skipping blanks again, and then I will assemble sname.
Of course, if I am careful, I will scan any further blanks just to confirm that the sentence does not contain an idiotic third name at the end, because if it does, it is outside the scope of the problem definition, but I should still report that the sentence seems to have something else. If a sentence has only one name, I should be able to report that: when I skip blanks after the first name, I will actually reach the end of the sentence, so I should say the sentence has only one name. So, do you agree that this logic will work? Except that this logic has one flaw. If I were to tell you that a given sentence has six words, would you like to use the same logic: skip blanks, assemble the first word, skip blanks, assemble the second word, and so on? That program cannot generalize. Suppose you have n words, where n is not known in advance. You might have up to 50 words in a sentence; some sentences may have only 5 words, some may have 7 words, some may have 20 words. If you are expected to write a program which will separate out every word, you would obviously not like to store the separated words in individual arrays named word1, word2, word3. What would you like to do in such a case?
You would like to define a two dimensional array, let's say 100 by 50, which can account for up to 100 words, each of up to 49 characters. You still have your sentence here. But now suppose somebody types a sentence, hello how are you, with blanks on either side. You would like to write a generic program which will go through the entire sentence and start assembling words into different rows. For example, after skipping these blanks you start assembling h-e-l-l-o, and the moment you notice this blank here, you put a backslash 0 here, saying this word has ended. But now, instead of going to a differently named array, you simply go to the next row of the same two dimensional array and start assembling the second word. When it ends, you go to the next row of the same two dimensional array and start assembling the third word. Do you agree that this is a better philosophy of handling character strings?
In general, it is possible that you have sentences which contain a variable number of words, like this, or in some special cases sentences which contain very similar information in every sentence, so that you have some advantage. However, that information need not consist of multiple words each of which is a character string. That information could be something else. For example, this is a spreadsheet. A spreadsheet is nothing but a tool which permits you to enter information in a two dimensional structure. Here you can notice that the first column contains roll numbers, the second column contains the corresponding names of people, the third column, let us imagine, contains a lab batch number, and the fourth column contains marks in some exam. Now, if I have data like this for 500 students, I would have a large spreadsheet. All spreadsheets, whether they are prepared using Microsoft's spreadsheet or the OpenOffice spreadsheet or any other utility, permit you to save the data in a special text format called the CSV format. CSV stands for comma separated values. So, if you save the data in comma separated value format, you actually get a text file like this: roll number, comma, name, comma, batch, comma, marks.
You see now that you get a text file, and you now know how to handle sentences in a text file. The difference between this kind of data and the earlier one is this: earlier you had different words, and each word was a character string. Here the different components are not necessarily character strings in meaning. They are of course character strings as they appear. So this 10102, for example, is not the number 10102 as it is internally stored; it is still the ASCII character codes for 1, 0, 1, 0, 2. Similarly, this is the ASCII character codes for 12.5. But it should be possible for you to take these strings and convert them into internal format. For example, this one you may like to convert into an int, this one you may like to retain as the name string, this one you may like to convert into an int, and this one you may like to convert into a float. Now imagine that you have four arrays: one array of ints, which holds the roll numbers; another, a two dimensional array, which is the array of names; a third, which is the array of batches; and a fourth, which is the array of marks. So you may have arrays of a thousand elements each, and the 502nd element of each of the arrays will represent the 502nd student's roll number, name, batch and marks. This is the kind of string processing that you would be expected to do. So I would like you to think about it. There will be additional material on the web which you should look at, which will help you understand how such problems are solved. All right. Thank you.