I am going to discuss three problems today. These were given as programming problems in the makeup quiz. The first one is pretty simple, although I was surprised that several people made mistakes even in that simple problem. The other two were not straightforward, particularly the one that requires you to separate out the words of a sentence which are separated by blanks or commas; that one is much easier if you use the special C++ string functions. I had requested all of you to read about those string functions, but apparently those who appeared for the makeup test had not read any of them. I am not sure whether the others have done so. If not, this is a chance: I will explain how that problem can be solved without using the functions, how difficult the solution becomes, and how it could be solved more easily with them. The last problem, which I had briefly discussed with some people in a tutorial, is about finding properties of the elements of an integer array in which some elements may be repeated. A question was based on that, and I will discuss its solution as well.

So, the first question is about adding the terms of a series. This is the most straightforward question, and I am really pained to see that people were not able to solve it properly. The second question is about extracting words from a sentence. The sentence has words separated by one or more blanks, and it may contain one or more commas as well. What you are required to do is locate a comma, take the word immediately preceding that comma, take the word immediately following it, concatenate these two words, and print as many such combined words as the sentence yields. That was the problem.
The third problem was finding the unique integers in a given array which may contain some repeated numbers.

So, this was the first question: it was required to add n terms of a series. This problem is pretty simple. If you have to add the terms, there are two things to notice. Each term differs from the previous term by a factor of x squared. So, when I set up an iteration to calculate the terms, I can simply say term is equal to term multiplied by x into x. Then there are two ways of handling the alternating signs. One approach, which many people attempted, is to use the term number i, with i equal to 1 for the first term, 2 for the second, 3 for the third, and so on. What they are saying is that if the term being added is an even-numbered term, the sign is negative; otherwise the sign is positive. That is perfectly fine, because when they add the term to the sum, they multiply it by sign. Of course, sign itself must be declared as an integer, and so on. The other approach is to simply say term is equal to minus term into x into x.

The starting point is 1, that is, x raised to 0. That is the first term. Term into x squared gives the next term, term into x squared again gives the one after that, and so on. For example, when i is equal to 2, term into x into x will be x squared, because the first term was 1, and 1 into x squared is x squared. For the third term, again term into x into x, and that will be x raised to 4. So, the terms will be calculated correctly if my first term is x raised to 0, that is, 1.

Now, all that I need to ensure is that I do the appropriate tests on the input and initialize the variables properly. Many people made mistakes here. Let us go back to the previous slide and read the words carefully. It is required to add n terms of the following series for a given value of x less than 1. n is given as input.
Another value, delta, is also given as an additional input. Delta defines the lower limit on the terms to be added. A sample value of 0.0001 was given. Ninety percent of the people used 0.0001 as the value of delta without reading the question. That was only a sample value, to tell you that delta will usually be a small number. By no means is that the value asked for. Delta is an additional input. You are required to read delta, you are required to read n, you are required to read x, and you have to verify that x is less than 1, for the simple reason that if x is greater than 1, the series may not converge. It may diverge. Of course, you have alternating plus and minus signs, but that does not stop the magnitude of the terms, which are powers of x, from growing as n tends to infinity. So x should not be greater than 1. In an exam situation, people lose marks if they do not do these things. That is why it is important to remember these points.

The solution is, of course, straightforward. So, here is what I am doing. I am defining n and k as integers; k is my counter. I am defining x, sum, term and delta as floating-point values. I collect the values of n, x and delta as input. I check if x is greater than or equal to 1. If x is beyond the limit, I stop execution. There is a small subtlety here which people do not remember. You have to check the opposite of what is permitted. What is permitted is x less than 1. Then the opposite is x greater than or equal to 1, not x greater than 1. It is a small thing, but you should remember to be meticulously correct. If that condition holds, the approach I am using is to print that the value of x is invalid and simply return with some negative value, so that the rest of my program is not encumbered by additional indentation.
The other way is to say if x is less than 1, then put an opening curly bracket and write the entire program inside it. At the end, you again say else, invalid x. But by the time you have completed the program, people will not easily remember that this else clause relates to the first condition. Whenever you are examining conditions of the type that make the entire execution of the program invalid, it is best to check those conditions and get out, rather than continuing with the program. This makes the remaining program very straightforward and simple to look at.

The program itself is simple. I start with term equal to 1, sum equal to 0, and a count k equal to 1. While k is less than or equal to n and the absolute value of term is greater than delta, I say sum is equal to sum plus term, term is equal to term into minus x * x, and k is increased by 1. So, the approach that I have used is to make the minus sign a part of the term calculation, so that the terms themselves become alternately positive and negative. You could just as well take the other approach of using a separate sign. It makes the program slightly more complicated, because you have to check whether k modulo 2 is 0 or not. Whether it is the even or the odd terms that are negative depends on your starting point. For example, if you start with sum equal to 1, assuming that the 1 is already taken care of, and say that the first term itself is x squared, then the first term has to be subtracted, the second term has to be added, and so on. So, you have to be careful about the sequence in which the computations will be done by the program that you write. Once you take care of that, you will always get a correct solution. This was a very straightforward program, but I want you to note these small points, because these are the points which may make you lose some vital marks unnecessarily in an exam, even on simple questions. Do not do that.
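The loop just described can be sketched as follows. This is only an illustration: it assumes the series on the slide is 1 - x^2 + x^4 - ... (inferred from the description of the terms), and the function name seriesSum and its bool-plus-reference interface are hypothetical choices, not the code shown in the lecture.

```cpp
#include <cmath>

// Hypothetical helper (not the slide's code): adds at most n terms of
// 1 - x^2 + x^4 - x^6 + ... and stops early once |term| <= delta.
// Returns false, leaving sum untouched, when x is not less than 1.
bool seriesSum(int n, double x, double delta, double& sum) {
    // Check the invalidating condition first and get out.
    // The opposite of "x < 1" is "x >= 1", not "x > 1".
    if (x >= 1.0) return false;

    double term = 1.0;   // first term is x^0
    sum = 0.0;
    int k = 1;           // number of the term about to be added

    // Terminate when k exceeds n or |term| is no longer greater than delta.
    while (k <= n && std::fabs(term) > delta) {
        sum += term;
        term = term * (-x * x);  // the sign flips as part of the term update
        k++;
    }
    return true;
}
```

With n = 3, x = 0.5 and a small delta, the loop adds 1 - 0.25 + 0.0625, which is 0.8125. In a full program n, x and delta would all be read from the input rather than fixed in the code.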
It is useful to write inline comments. Almost no program written in an examination contains comments. Almost none. That is a bad habit. What it tells me is that even when you write programs on a piece of paper, say for your project, you would not be writing comments. You would invariably write the comments after the program is entered, after you have tested it, and when it works. Then you say, ah, Fattak said comments should be there, so let me put in some comments. So, you are almost always putting in comments as an afterthought, because some idiot is asking you to write them; otherwise you do not think they are essential. This stems from the fact that you genuinely believe that the programs you write are to be used only by you and read only by you. How many times have I said that this is incorrect? In real life, when you write programs as professionals, 99 percent of the time the programs will be used not only by you but by at least one other person, either a colleague or maybe somebody else somewhere in the world. That means that, as a professional programming practice, you must learn to write comments, and comments cannot be an afterthought. Invariably, if you write the comments first, leaving blanks for the code, it is much easier to write the code.

So, look at the sample comments that I have written for the iterative computation of the sum. We will set the value of term to 1, which is the first term, sum to 0, and a counter variable k to 1. At the beginning of each iteration we will check whether k has gone beyond n or the absolute value of term has dropped below delta. In either of these cases, I will have to do what? Terminate. I forgot to write the word terminate here. In either of these cases, I should get out of the loop and, of course, print the final sum. Now, these comments are helpful to you in the first place. What are these comments doing? Whatever initial approach I thought of in my mind, these comments consolidate it in natural language.
Now, ordinarily it may be okay with you whether you write it down or keep it in your mind, but when you are writing professional programs, that is not okay. And you cannot be a professional programmer only when writing programs for someone else and remain an amateur programmer otherwise. That is not workable. The discipline has to be a part of you, whether you are writing in an exam, on a piece of paper, or anywhere else. I would strongly suggest this: writing such comments will not take more than half a minute or one minute per program. In fact, most of the time when you read a problem in an exam, you are thinking very rapidly. The problem is that you rarely write down your thinking in words. If you spend one minute writing those words, you will find that your own writing of the program becomes easier. Anyway, that is just a suggestion. You can follow your own style.

This is the next problem, where a text sentence is given as input. Since not all of you appeared for the makeup quiz, not all of you might have seen these questions. Here are some examples. The first sentence is: Hello , how are you. Notice that there is a blank before the comma, a blank after the comma, and there are blanks between the other words as well. Our program should print Hellohow, because there is only one comma. There is one word before the comma, Hello, and one word after the comma, how. So, Hellohow is what is to be printed, as a single word. Here is another sentence: He came, sat on a chair, drank a glass of water, and read the paper. Notice that there are one, two, three commas. On either side of the commas there are the words came and sat, chair and drank, water and and. So the program should print camesat, chairdrank and waterand. The idea here is not to create funny words out of nowhere. The idea is to find out whether you have understood how to handle character strings.
A character string is a very peculiar kind of data, by the way. We have not discussed this earlier, but it is important to understand what is so peculiar about character strings. Character strings represent variable-length information. Have you noticed that? An int or a float, for example, is fixed length. An int is typically 4 bytes, a float is 4 bytes, and it does not matter how you write the value in ASCII. For example, an integer may be written as -123456, or it may be written as 3. It might appear that 3 is just one character long while -123456 is seven characters long. Is that not variable-length information? It might be so as far as we are concerned when we read them as characters, but internally they all get converted into a fixed length. The advantage of fixed-length data is that a value can always be referred to by a single variable name, and the operations on it are perfectly defined. You do not have to worry about where the data starts and where it ends. So, for example, when you write term is equal to term plus x, or sum is equal to sum plus something, or term is equal to term into something, or k equal to k plus 1, you are very sure that the values participating are fixed-length values. There are not only instructions to operate on them; there are actually hardware circuits to operate on them.

Variable-length information, on the other hand, makes it necessary for you to find out how long a given piece of information is. Take a name, for example. If I write one full name here, it has a certain string length, with one blank here and one blank here; another name could have quite a different length. Now, these are all names, and there are multiple problems with character strings that we face. First of all, if this is an entity called name, I would like to have my familiar concept of first name and last name, or first name, middle name and last name. Not all societies in the world follow this dictum.
This essentially comes from a particular Western segment of society, which had first name, middle name and last name. Indian names typically do not follow this. Now, if Somayajulu were to write his full name, it would take perhaps 70 or 80 characters, and you could not figure out which part is the first name and which is the last name. So, these are semantic problems associated with names. But even if you take simple names, as first name, middle name and last name, there is no guarantee that the names will be of fixed length. Take any English sentence, as we saw there. If multiple sentences are given, each sentence will be of a different length. One sentence may contain 5 words, another may contain 20 words; one sentence may contain 3 clauses separated by commas, another may contain 12 clauses separated by commas and semicolons and whatever else. Handling such information is hard. But in many situations you invariably have to handle strings of characters, so it is essential to understand how to handle them.

The standard mechanism that we have for handling character string information starts with one basic unit, the char. If I declare char c, then I get one byte allocated to c. I can put any one character inside it, and that is it. This is a fixed-length value; there is no problem. But if I want a string, and I declare an array of 10 chars, I suddenly get 10 bytes. How many characters of the string can I put inside this? Nine, and not 10. Why? Because I would like to put a backslash 0 ('\0') at the end. Now, what is backslash 0? In the ordinary sentences that you and I write, do we ever put a backslash 0? Never. So, this is a very particular way in which C++ has decided to handle strings, which are of variable length. Notice that whether a string is 20 characters long, 30 characters long or 5 characters long cannot be determined otherwise.
The alternative would be to maintain two things for every string: a large array in which the actual characters are stored, and the length of the string as an integer. Instead, the designers decided that a backslash 0 at the end will indicate the end of a string. Is it necessary that the backslash 0 is always at the very end, that is, in the ninth position? It is not necessary. For example, suppose that when defining the string I assign it the value hello. What will this string look like? I will have h in the 0th byte, then e, l, l, o, and C++ will automatically put a backslash 0 after the o. Automatically. That means whenever a string is given a value, either through input or through initialization, C++ will insert a backslash 0 immediately after the last character of the string. On the other hand, if we are constructing words or strings on our own, we can copy any character into any position of the array, and after putting in all the characters of the word, we must put a backslash 0 at the end ourselves if we want it to be recognized as a valid string, particularly when C++ string functions are going to be used on it. That is our responsibility.

So, this was about handling variable-length records. Now, our approach to extraction. We know that the input is of variable length. A sentence can be of any length; we have no clue. We have to make some assumption, and most people do not even state that assumption: for the solution that I am presenting, I assume that the sentence contains at most 79 characters. They just write char sentence[80], and the reader has to work out that the person is not handling any sentence longer than 79 characters. Should you not just mention somewhere that you assume the sentence is at most 250 characters long, or 100, or 79, or whatever? It is a useful comment to write anyway.
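A small sketch may help fix the idea of the terminating backslash 0. The two function names below are hypothetical and exist only for illustration; std::strlen is the standard function that counts characters up to, but not including, the '\0'.

```cpp
#include <cstddef>
#include <cstring>

// Initialization: the compiler stores 'h','e','l','l','o' and then '\0'.
// The array has room for 10 bytes, but the string is only 5 characters long.
std::size_t helloLength() {
    char greeting[10] = "hello";
    return std::strlen(greeting);   // counts up to the '\0'
}

// Building a word by hand: we must place the '\0' ourselves,
// otherwise the array is not a valid string for the library functions.
// Assumption: the caller supplies a buffer with room for at least 4 chars.
std::size_t buildWord(char word[]) {
    int k = 0;
    word[k++] = 'c';
    word[k++] = 'a';
    word[k++] = 't';
    word[k] = '\0';                 // our responsibility
    return std::strlen(word);
}
```

Here helloLength() yields 5 even though 10 bytes were reserved, and buildWord leaves the three letters of cat followed by the terminator in the caller's buffer.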
So, after reading the sentence, what should I do? This is the approach to solving the problem. I have a sentence. I can use the gets function, get string, to read the entire sentence. Alternatively, I can use the getc or getchar function and read one character at a time. For any sentence that is input on the keyboard, what happens when I press Enter? A newline character goes in. So, if I am reading character by character, the moment a newline character comes I can assume that the sentence has ended and put a backslash 0 myself. Alternatively, I can read a string: a gets statement will get you the string up to the backslash n, that is the newline character, with the newline character replaced by a backslash 0. You will get a proper string in sentence. The point is that this string could be 50 characters long, 70 characters long, whatever. If I say at most 79 characters long, then I have to cater for a sentence anywhere between 0 and 79 characters. If somebody goofs up and just presses Enter, a zero-character string is formed.

Whatever the sentence is, what is my approach to solving the problem? To recapitulate what problem we are solving: given a sentence, I locate a comma, look at both sides of that comma, take the two words immediately preceding and succeeding the comma, concatenate them and print them. So, I look for a comma. If a comma is found, I note its position, which I will call cpos, and I do the following. From that position cpos, I scan backwards, skip any blanks, and collect the word up to the blank before it. I also scan forward, skip any consecutive blanks, and collect the first word up to the blank after it. It is important to form a mental picture of what will happen here. So, let us look at a sentence. There are some words like this; let us say there are three blanks before this comma, and there may be blanks here after it. Let us say this is one sentence.
It is useful to take a sample sentence and walk through your own algorithm in your mind. We have said that first we look for a comma. Where will we find this comma? Here. So, this is going to be the position of the comma, which I will call cpos. I have to concatenate came and sat; these are the two words on either side of the comma. How will I look for the word came? I start looking backwards from this point. First I will have to skip all the blanks. There are two blanks here; there may be none. Whatever blanks are there, I must skip. So, what is the skipping logic? Keep examining characters backwards; as long as they are blank, ignore them. The moment you get the first non-blank character, mark that point. This first non-blank character must be the end of the previous word. So, I have the end of the previous word. Now I set up another iteration, and in this iteration I look for a blank. Is the next character a blank? Any character which is not blank is part of the word. So, I keep moving backwards till I come to this point, because this is the first blank. The moment I discover the blank I come out, and I move the pointer one position back to the right, onto the first letter of the word. Now I have two pointers: one to the beginning of the word and one to its end.

Second, I now scan forward. Where do I scan forward from? From the comma onwards. Please note that scanning backward and scanning forward can begin from cpos minus 1 and cpos plus 1, because cpos itself is the comma and I should not look at it again. When I look forward, I again skip over the blanks. So, I look for the first non-blank character going forward; as long as the characters are blank I keep ignoring them. The moment I find a non-blank character, I set a pointer: this is the start of the next word. Now I start looking for a blank. I move forward like this, and I will come across this blank.
The moment I come across a blank, it means the next word has ended, and it ended just one position before this blank. So, I will have to reset the pointer to that position. Please note that things become easier when I write down an example and mark where the pointers will be. Then I know exactly what I need to determine. I need to determine a set of four pointers for every comma: the start of the previous word, the end of the previous word, the start of the next word and the end of the next word. Once I have these four, I can concatenate the two words by copying those characters into a word and outputting it.

Having done this for one comma, what should I do now? I look for another comma. Where should I start looking for it? I could start from cpos itself, but that is not essential, because I have already located a word beyond it. There is only one problem: what is the guarantee that what follows the comma, which is sat here, is actually a word and not another comma? If somebody gives you successive commas, you may have a problem with this logic. Therefore, you write one more assumption: we assume that the sentence is a well-formed English sentence. That means some idiot is not giving me an input such as he came comma comma comma sat comma comma. That is incorrect English, and we do not expect it to happen here. In real life, syntax errors of even this kind will have to be handled. In the second approach to this problem, I will show you how standard C++ functions help us locate these things, but it should be clear how you will analyze the sentence. I look for a comma; then I look backward and forward, and on either side I collect a word. I need two pointers to describe the beginning and end of the previous word, and two pointers to describe the beginning and end of the next one. Then I go to the next comma and keep on doing this. If I do not find a comma, my job has ended. Here is the first solution. I will be putting up these solutions, these slides, on Moodle.
So, you can read it there, but I will just point out what is important to note. First, I ask the user to type a line, I read the sentence, and I calculate the length of the sentence, nchar. Since I am going to scan for commas, I know from where to where: the characters of the sentence are at positions 0 to nchar minus 1. And I just print the sentence for the sake of checking. Now I say curpos is 0, while curpos is less than nchar. Notice that I am not using a for loop; I am using a while loop. I start with the current position as 0, and as long as the current position is less than nchar, I search for a comma. What do I do? If sentence[curpos++] is not equal to comma, continue. Please note what continue does. It does not break out of the loop; continue means go to the next iteration. So, it does not matter how many thousand statements I have written below it: if sentence[curpos++] is not equal to comma, I come back for the next iteration. Is this construction understood? curpos is here. I look at this character: is it a comma? No, it is not. Then what will I do? I go to the end of this while, wherever the while ends, and I come back. That is what continue does: go to the end of the iteration and come back again. Now it will again check whether curpos is less than nchar. curpos itself must change; if curpos does not change, I have a problem. Luckily, I am making curpos change within the if statement itself, by saying curpos++. This statement is equivalent to the following. Is this clear? If the character at position curpos of the sentence is not equal to comma, then increase curpos by 1 and go to the next iteration. Strictly, the increment happens whether or not the character is a comma.
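The while-with-continue construction can be tried out in isolation. The sketch below is an assumption-laden illustration rather than the slide's code: the function name findCommas and the positions array are hypothetical. One detail worth spelling out is that, because of the postfix ++, curpos has already moved one past the comma by the time the test succeeds, so the comma itself sits at curpos - 1.

```cpp
// Hypothetical helper: records the index of every comma in sentence[0..nchar-1].
// Assumption: positions[] is large enough to hold one entry per comma.
// Returns the number of commas found.
int findCommas(const char sentence[], int nchar, int positions[]) {
    int commaCount = 0;
    int curpos = 0;
    while (curpos < nchar) {
        // Look at the current character, then advance curpos in either case.
        if (sentence[curpos++] != ',') continue;  // skip the rest of this iteration

        // We only reach this point when a comma was found.
        int cpos = curpos - 1;   // curpos is already one past the comma
        positions[commaCount++] = cpos;
    }
    return commaCount;
}
```

For the sentence Hello , how are you (19 characters), this reports one comma, at index 6.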
So, this simple if statement inside the while ensures that when I come to this point, I have located a comma. I note the position of that comma in cpos, the comma position. This is actually not necessary, I could continue to use curpos, but I have extracted it into a separate variable cpos. I say that I found a comma at cpos and do comma count plus plus, because I want to count how many commas I have; that is how many words I should output at the end.

Now I try to get the word before the comma. Recall the logic: I have to scan backwards. That means any for loop that I set up will start with some high value and go towards lower values. So, the increment will now be a decrement. And what am I looking for? First I am looking for blanks; I want to skip all the blanks. Notice how it skips the blanks just before the comma: for i equal to cpos minus 1; i greater than or equal to 0 and sentence[i] equal to blank; i minus minus. There is no body in this for loop. Do you understand this construction? Ordinarily I would set up a for loop from i equal to cpos minus 1 down to i greater than or equal to 0, because I may have to go all the way backwards, and inside the body I would examine whether the character is blank, and get out the moment I find one that is not. I am doing that test inside the for statement itself: as long as sentence[i] is equal to blank, I go to the next character. I am skipping the blanks before my comma. So, this particular for statement will skip all the blanks before the comma and come to rest at a position where there is a non-blank character. Whatever the value of i is at this juncture, that is the first non-blank character before the comma. I came out of the loop either because sentence[i] was not equal to blank, or because I hit the beginning of the sentence, in which case i would have gone below 0.
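The body-less for loop is easy to get wrong on paper, so here it is as a small, self-contained sketch. The function name skipBlanksBackward is hypothetical; the loop header follows the description above.

```cpp
// Hypothetical helper: starting just left of the comma at cpos, skips blanks
// going backwards. Returns the index of the first non-blank character found,
// or -1 if only blanks (or nothing at all) lie before the comma.
int skipBlanksBackward(const char sentence[], int cpos) {
    int i;
    // All the work is done in the loop header; the body is deliberately empty.
    for (i = cpos - 1; i >= 0 && sentence[i] == ' '; i--)
        ;
    return i;
}
```

The order of the two tests matters: i >= 0 is checked first, so sentence[i] is never examined with a negative index. A result of -1 corresponds to the case where there is no word before the comma.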
So, i is either at the first non-blank character before the comma or just before the start of the sentence. The first non-blank character is the end position of the previous word. I test whether i is less than 0. If I had reached the beginning, then at i equal to 0 the loop would have executed i minus minus, i would have become minus 1, there would be nothing left to test, and I would have come out. That means somebody had put a comma right at the beginning, with nothing before it. That was wrong, of course, but I am checking for it, and I set the flag for the previous word to 0: there is no word before this comma. Else, I have found the word. So, notice what I have done. I have skipped 2, 3, 4, as many blanks as there are before the comma, and I have come to the first non-blank character. At that character I say that the previous word exists, so the flag is set to 1, and the end of the previous word is i; wherever my non-blank character is, that is the end of the previous word.

Now, for i equal to end of previous word, while sentence[i] is not equal to blank, I keep going backwards, taking care not to run past the beginning of the sentence. I am now looking for a blank, because the blank before the previous word marks the start of that word. It is exactly the same logic. When I come out, I have just gone past the word, so the start of the previous word is i plus 1. Recall the earlier example: when I am going backwards looking for the blank, I find the blank here, so the start of the word is actually one position to the right. That is what I am doing here: the start of the previous word is i plus 1, where i is the position at which I found the blank. So, the word runs from start of previous word to end of previous word.

I do the same thing to get the word after the comma. The logic is exactly the same, except that it works in the forward direction. So, I start from cpos plus 1 and test i less than nchar, where nchar is the length of the sentence.
I test against nchar because this may be the last word in the sentence. If I do not find any word, that is, if I hit the end of the sentence, I set the flag for the next word to 0; otherwise I set it to 1, and the start of the next word is i. Since I have found a non-blank character after the comma, I now search for more non-blank characters forming that word. It may be a single-character word, such as a. In that case the start pointer and the end pointer will be at the same position. Otherwise, if it is a longer word, I go through it and find the end of the next word as i minus 1. Last time also I had to push the pointer back by one position; here I push it backwards, i minus 1. So, the word runs from start of next word to end of next word.

Now I have to compose a single word. What do I do? I first have to check whether the words exist, because a word may be missing on either side. So, I say if the flag for the previous word is 1, there is a word, and I collect the letters from the first word: for i equal to start of previous word to end of previous word, sentence[i] goes to word[k++]. word is the array in which I am assembling the result; my characters are in sentence. When I pick characters out of the sentence and put them in word, I need a separate index for word, and that is what I am using k for. So, I set k equal to 0 and take the first word. Suppose the two words are came and sat. The word came must be copied first. This came might start at the fifth position in the sentence and go up to the eighth; I do not know what the positions are. So, from start of previous word to end of previous word, I will assign c here, a here, m here and e here, and that is why I say word[k++] equal to sentence[i]. Does it make sense? I start with k equal to 0. This is equivalent to saying word[k] equal to sentence[i]; k++.
So, I put c here and add 1 to k, a here and add 1 to k, m here and add 1 to k, e here and add 1 to k, and when I come out k is automatically pointing to the next free position, because of k++. Similarly, I go to the next word and copy it in the same fashion. In the process I get camesat. However, my job has not ended. Is this a proper word? It is not a proper string; I have merely copied the valid characters. If I print it by saying cout << word, C++ will get confused, because it does not find a backslash 0 at the end. Therefore, I put a backslash 0 at the end: word[k] equal to backslash 0. Remember that k has already been incremented, so the backslash 0 goes in just after the t. This is how I assemble one word, and at the end, of course, I output that word.

I then set curpos equal to end of next word plus 1. Remember that I had located one comma; I still have to go forward in the sentence to locate another. Ordinarily I could have started from just after cpos, but I have already located a word after the comma and I know where it ends, so I can start from that point. I am just reducing the amount of searching to be done. That is not important, however; I could have started from the position next to the comma.

Now, notice a couple of things that you were required to do in this solution. You had to copy words from portions of the string character by character, instead of using string copy or substring, which are built-in functions that let me extract a portion of a string and put it somewhere else. I can copy one string to another; I can concatenate two strings, and there is a concatenate function for that. If you had read about those functions and seen the example usages, you could have made this program much shorter. I wish to discuss an alternative solution and, in that context, also explain to you a much harder problem which often surfaces. This problem is called parsing sentences.
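Putting the pieces of this first solution together, the whole procedure can be sketched as below. This is a reconstruction under stated assumptions, not the code from the slides: the names joinAroundComma and joinedWords are hypothetical, the sentence is assumed to hold at most 79 characters, and the results are collected into a std::string (separated by single blanks) instead of being printed, purely so the behaviour is easy to check. As a small departure from the description, a comma is also treated as the end of a word, so that input such as x,y,z does not pull a comma into a word.

```cpp
#include <cstring>
#include <string>

// Hypothetical helper: for the comma at cpos, copies the word before it and
// the word after it into word[] and terminates the result with '\0'.
// Returns the position from which the search for the next comma can resume.
int joinAroundComma(const char sentence[], int nchar, int cpos, char word[]) {
    int i, k = 0;

    // Previous word: skip blanks going left, then walk left to the word start.
    for (i = cpos - 1; i >= 0 && sentence[i] == ' '; i--)
        ;
    if (i >= 0) {                          // a previous word exists
        int endPrev = i;
        for (; i >= 0 && sentence[i] != ' ' && sentence[i] != ','; i--)
            ;
        int startPrev = i + 1;             // one position right of the blank
        for (i = startPrev; i <= endPrev; i++) word[k++] = sentence[i];
    }

    // Next word: skip blanks going right, then walk right to the word end.
    for (i = cpos + 1; i < nchar && sentence[i] == ' '; i++)
        ;
    int resume = i;
    if (i < nchar) {                       // a next word exists
        int startNext = i;
        for (; i < nchar && sentence[i] != ' ' && sentence[i] != ','; i++)
            ;
        int endNext = i - 1;               // one position left of the blank
        for (i = startNext; i <= endNext; i++) word[k++] = sentence[i];
        resume = endNext + 1;
    }

    word[k] = '\0';                        // make it a proper string
    return resume;
}

// Hypothetical driver: returns all joined words, separated by single blanks.
std::string joinedWords(const char sentence[]) {
    int nchar = static_cast<int>(std::strlen(sentence));
    std::string result;
    char word[80];                         // assumes the sentence has at most 79 chars
    int curpos = 0;
    while (curpos < nchar) {
        if (sentence[curpos++] != ',') continue;
        curpos = joinAroundComma(sentence, nchar, curpos - 1, word);
        if (!result.empty()) result += ' ';
        result += word;
    }
    return result;
}
```

For the first sample sentence, joinedWords yields Hellohow; for the second it yields camesat, chairdrank and waterand, in that order.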
So, when we try to understand any sentence in English, all of you are familiar with the word parsing; in English grammar you would have come across it. Basically, you say there are phrases in a sentence. When you parse a sentence you say what is the subject, what is the object, what are the different components. It is always important to be able to extract the different components of a sentence in order to understand that sentence. Now, that is when you are trying to understand a natural language, but in programming this is also often required. Where is it required? For example, collecting words separated by blanks. You are analyzing a text sentence, or you are analyzing a document, a 20-page document, and you want to find out the keywords in that document. You understand what a keyword is. One document might be describing the Iraq war, another document might be describing a genetic sequence, a third document might be describing how pointers are handled in C++. If you are just given these three documents, unless you scan through them you cannot find out what they are about. So usually any document, book, paper, etcetera, will have some keywords. Keywords say, look, this is semantically important, so if you are searching for some material on this word, you will find it here. How do you extract keywords? You extract keywords by first analyzing all the sentences in that text and then finding words which have occurred again and again in some other context. So there is an ontology definition which defines the important keywords for a particular domain, and then you check whether they are there. Google searches and the like are all based on indexing like this, on locating words.
So Google, for example, is able to index practically every word that occurs in any document, for the simple reason that such a thing may be required. We should at least be able to separate out the words in a document; that is the purpose of collecting words separated by blanks. The second example is the spreadsheet example. You remember I had given a spreadsheet example: if you have data in a spreadsheet and you save that spreadsheet as comma-separated values, then you will get roll number, comma, name, comma, marks, comma, something. Please note that this is a variable-length record made up of characters. Some roll number is 8 characters, some is 9 characters; some name is 30 characters, some is 20 characters; some marks, 32.5, will be 4 characters, some marks, 7, will be 1 character. So you are actually talking about variable-length fields. Now, given this string, if you have to write a C++ program to extract all the information about roll number, name, marks in quiz, marks in midsem, marks in this, etcetera, how will you do that? You will have to extract fields which are separated by commas. In short, the requirement to separate out words which are delimited by commas or blanks or any other delimiter, semicolon, question mark, could be a common requirement. In the programming context, such a requirement is stated by calling the extracted words tokens. You know tokens: you get a token when you go to a bank or a hospital, and you are standing there with token number 5, token number 6, token number 7. A token is given to each person who comes, in order. In the context of a sentence, token number 1 is the first word, after which a delimiter comes; token number 2 is the second word, which is separated from it by a delimiter. A delimiter is one or more symbols which separate out these words. Since this is a very common requirement, C++ has special functions precisely to do this.
You give a sentence to that special function and iterate over that sentence: by making successive calls to that function you will be able to extract token 1, token 2, token 3, token 4, which are the separate words, which is precisely what you want to do here. This is called strtok. Oh, I am sorry, I have not written it there. This is that special function, strtok, or tokenize the string. Of course, a single function will return only one value. Many of the character string manipulation functions which are part of the cstring library return a character pointer, which means they return a pointer to a string, a single string. But a sentence may have 20 tokens, 30 tokens, 20 words, 30 words. How will a single call return them? A single call obviously cannot. It returns a pointer to the first token, but it has a special feature: if you make successive calls to that same function with a special dispensation, then you will get successive tokens from the same sentence. It is a very beautiful and very neat small function. So let us look at this. Here is a token extraction. This is the sample sentence: he, two blanks, came, two blanks, comma, blank, sat, comma, drank, several blanks, water, blank, comma, and, etcetera. This is a sentence, and I say this sentence is delimited by commas and blanks. I do not know how many commas, I do not know how many blanks; some words are separated by three blanks, some word is separated by only one comma. But I do not care. What I want to do is extract all the words which are separated by either a comma or a blank. So the delimiter string that is going to be used is a blank, which is not seen here, and a comma. There are two delimiters, blank and comma. So what I am saying is: look, Mr. C++, I will give you a sentence, and I am telling you that the delimiter is either a blank or a comma. Now give me all the tokens which are separated by either of these, independent of how many delimiters appear.
That means if there are ten blanks, skip all those ten; if there are three blanks, a comma and two blanks, skip all of them. Just give me the character strings within this sentence which lie between any of these two delimiters. Leading blanks, trailing blanks, leading commas, trailing commas, all are to be forgotten. Can you see that it would not be a very straightforward thing for us to do if we were to write a conventional program examining all the characters? So let us see what this does. If I say p1 = strtok(tstring, " ,"), this will give me a pointer to the first token. What will be the first token here? The word he. But that is all I have got. How do I get the subsequent words, which are came, sat, drank, water, and, read and so on? All these are successive words. To get these successive words, C++ says make these calls again and again, but with a difference: p1 = strtok(NULL, " ,"). Look at this word NULL. Remember, originally I gave tstring, so it was trying to find a token in tstring. But when I say NULL comma something, what am I specifying? Where should it look? strtok has a special convention. If the first parameter is a string, then strtok assumes that it is the first call being made. It takes that string, skips over all the leading delimiters, locates the beginning of the first word, finds the end of that word, writes a backslash 0 there in place of the delimiter, and returns a pointer to the beginning of the word. So if the first word is he, the string that is returned is h, e, backslash 0, and the pointer to h is returned to you. Note that it does this inside the string you passed, so that string gets modified. Now, strtok also remembers internally that he has been returned and that it stopped just after that backslash 0, so it is ready to start from this point. That is, if you call strtok again without giving it any new string, then it knows it has to start looking from this point forward.
So there is a memory built into that function. But how do you tell C++ whether it is my first call or a subsequent call? The convention is that you make the call without giving a valid string as the first parameter; you give NULL. NULL means I am not giving you any string. Whenever strtok notices such a call it says: ah, that fellow has given me a string earlier, and I would have extracted something from it before, so he is asking me to go on from wherever I was. The next call will get me the next token, and strtok will internally chew up all of that sentence up to the end of the second token and remember that it has to start from that point. So if I keep a while loop, it will automatically go through the sentence extracting the next token each time. When it has completed the entire sentence and there is nothing left, read the paper and after that there is no string to be extracted, at that time strtok will return NULL. So obviously, when it returns NULL, I have to stop. To explain this more clearly I have put an example: he came sat drank water and read the paper. This is the sentence. When I say p1 = strtok(tstring, " ,"), p1 points to he; this he is the string which is returned, and p1 points to h. Now come the subsequent calls with NULL replacing the string. When I say p1 equal to this again, it knows that its position in the string is somewhere here, so it will search for the token from here and will come out with came. Again I call it; it was here earlier, it will remember, it will start skipping all the delimiters, find the next token, sat, and return sat. So can you see, one simple call gets you the first token, and subsequent calls in a simple while loop get you all the tokens. Can you write a program better than this? It is not possible. Evaluating individual characters yourself and then extracting these tokens is always going to be more time-consuming programming.
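The first-call and subsequent-call convention just described can be sketched as below. The wrapper name tokenize and the use of std::string and std::vector are choices made for this illustration only; the calls to std::strtok themselves follow the standard library behaviour, including the fact that strtok overwrites delimiters in the buffer it is given, which is why a private copy is tokenized here.

```cpp
#include <cstring>
#include <string>
#include <vector>

// Illustrative wrapper (hypothetical name): returns every token of text that is
// separated by one or more of the characters in delims.
std::vector<std::string> tokenize(const std::string& text, const char* delims) {
    // strtok writes '\0' into the string it scans, so work on a copy.
    std::vector<char> buffer(text.begin(), text.end());
    buffer.push_back('\0');

    std::vector<std::string> tokens;
    char* p1 = std::strtok(buffer.data(), delims);  // first call: pass the string
    while (p1 != nullptr) {                         // a null pointer means no more tokens
        tokens.push_back(p1);
        p1 = std::strtok(nullptr, delims);          // later calls: pass a null pointer
    }
    return tokens;
}
```

Calling tokenize with a sentence such as "he  came  , sat, drank    water , and read the paper" and the delimiter string " ," should yield the nine words one by one, however many blanks or commas lie between them.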
Now, if you had read those functions, and I had said read all the cstring functions, they are there on the site in the C++ tutorial, and each one of these functions is explained: strcat for concatenation, strcpy for copying, replacing some characters by something else, all kinds of things. strtok is one of the important functions. Since it is not a very easy function to understand, I thought I would explain it, but for all the other functions, by looking at the examples you will be able to understand what they do. Using this I have constructed a solution here. Like the previous one, I read the given sentence, etcetera. Now, this is an example where I extract all the words from a sentence. This is not what I want to do in my solution; this is just an example. So I take the sentence and I say p1 = strtok of the string. I keep the number of tokens, which will be equal to i, by the way, because I am putting each token into word[i] and doing i++. Notice that I have declared word[40][80], so I am assuming that there could be at most 40 words in a sentence, and I am extracting all of them. I start with i equal to 0, and every time I get a pointer I copy the string it points to into word[i] and increase i by 1. This strcpy permits me to copy the string pointed to by p1 to the string pointed to by word[i]. Notice that in a two-dimensional array of characters, when I give only the row number, that happens to be the pointer to that entire row, the entire word. So by giving that I will be able to store each word. This one simply prints all the words in a sentence delimited by anything. You can try putting comma, blank, semicolon, exclamation mark, question mark, full stop and the newline character as delimiters, and try to run this program on a text file as input, redirecting the text file. You will find that for your text document it will print only the words which are separated by any one of these delimiters. That is the power of this.
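A rough sketch of the word-collecting example with a two-dimensional character array is given below. The function name collectWords and the constants MAX_WORDS and MAX_LEN are assumptions for this illustration. Also, std::strncpy with an explicit terminator is used here in place of the strcpy mentioned above, purely as a guard against a token longer than one row.

```cpp
#include <cstring>

const int MAX_WORDS = 40;   // at most 40 words, as assumed in the discussion above
const int MAX_LEN   = 80;   // each row holds one word of up to 79 characters

// Hypothetical helper: tokenizes sentence in place (strtok modifies it) and copies
// each token into the next row of word. Returns the number of words stored.
int collectWords(char sentence[], const char* delims, char word[][MAX_LEN]) {
    int i = 0;
    char* p1 = std::strtok(sentence, delims);
    while (p1 != nullptr && i < MAX_WORDS) {
        // word[i] is a pointer to row i, so it can be used as a copy destination.
        std::strncpy(word[i], p1, MAX_LEN - 1);
        word[i][MAX_LEN - 1] = '\0';
        i++;
        p1 = std::strtok(nullptr, delims);
    }
    return i;
}
```

The delimiter string can hold as many characters as needed, for instance " ,;!?.\n", and any run of them between two words is skipped.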
So, for example, this will extract he, came, sat, drank, water, and, read, the, paper. There is only one problem with this: we are not able to solve the question that we have. Remember, we have to separate based on the comma only. But if I separate only on the basis of the comma, what will be my first token? Will it be he? The delimiter is only the comma, so the first token will be he, blank, blank, came, blank, blank, because there are two blanks before the comma. The next token will be blank, blank, sat; the third token will be drank, some blanks, water, blank; and the fourth token will be and, blank, read, blank, the, and so on. So I still have a problem: I have not got individual words extracted. But I now see the possibility of a solution. Suppose I use this tokenizer twice. First I use it to extract these segments, or clauses, as I call them. So this is my segment 0, segment 1, segment 2, segment 3. Of course, segment will have to be a two-dimensional array, because each segment is itself a string: segment 0 so many characters, segment 1 so many characters, etcetera. But I now know that if I have 4 segments then there are 3 commas in between; if I get 12 segments there are 11 commas in between. I know the number of segments. Now I take one segment, or rather a pair of segments, at a time: the 0th segment and segment 1. In the 0th segment I find all the words which are delimited by blanks, and the last of the words that is returned is my last word. In the next segment I again use the tokenizer to find the first word. I need not iterate, because I want to find only the first word. I extract that, concatenate the two, and that solves my problem. So, while I will put up this solution on Moodle, I just wanted to show you very briefly what I have done here.
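The two-pass idea above, first splitting on commas and then on blanks, might look roughly like this. It is a sketch under assumptions and not the solution posted for the course: the names splitOn and joinAroundCommas are invented here, and std::string with std::vector stands in for the fixed two-dimensional arrays. One limitation to keep in mind is that strtok skips runs of delimiters, so a comma at the very start or end of the sentence, or two adjacent commas, does not produce an empty segment.

```cpp
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical helper: splits text on any character in delims using strtok on a copy.
// Each call runs its strtok loop to completion, so strtok's single internal
// position is never shared between two unfinished tokenizations.
static std::vector<std::string> splitOn(const std::string& text, const char* delims) {
    std::vector<char> buffer(text.begin(), text.end());
    buffer.push_back('\0');
    std::vector<std::string> parts;
    for (char* p = std::strtok(buffer.data(), delims); p != nullptr;
         p = std::strtok(nullptr, delims)) {
        parts.push_back(p);
    }
    return parts;
}

// For every comma, joins the last word before it with the first word after it.
std::vector<std::string> joinAroundCommas(const std::string& sentence) {
    std::vector<std::string> segments = splitOn(sentence, ",");   // pass 1: commas only
    std::vector<std::string> result;
    for (std::size_t s = 0; s + 1 < segments.size(); s++) {
        std::vector<std::string> before = splitOn(segments[s], " ");      // pass 2: blanks
        std::vector<std::string> after  = splitOn(segments[s + 1], " ");
        std::string outword;
        if (!before.empty()) outword += before.back();   // last word of earlier segment
        if (!after.empty())  outword += after.front();   // first word of next segment
        if (!outword.empty()) result.push_back(outword);
    }
    return result;
}
```

With n segments there are n minus 1 commas between them, so the loop visits each adjacent pair of segments exactly once.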
So I have defined something called segments, 80 by 80, at the most 80 segments, and then there are words. There are two pointers I have defined, and there is an outword which I will form. Anyway, after reading the sentence, look at what I do. I make the tokenizer calls to collect the comma-separated segments. So I say p1 = strtok(sentence, ","); the delimiter is only a comma. I will get segment 1, segment 2, segment 3, etcetera. The number of segments, which is initialized to 0, is incremented every time I get a segment. At the end of this loop I have got n segments separated. Now I have n segments of the sentence, and between each pair of segments is a comma. Each segment contains one or more words separated by one or more blanks. So I look at each segment, separate on the blanks, and extract the last word of the previous segment and the first word of the next segment using the tokenizer. So there is a program which does that. For example, this part extracts the words from the earlier segment. This is a repetition, and it copies the last word of this segment to outword; it does that using strcpy. So the last word, n words minus 1, is copied here. Now I copy the next segment to tstring and run the tokenizer only once, because the first word of this next segment is what I want, and concatenating these two will get me my solution. Anyway, you can see that unless you are familiar with the use of the C++ string manipulation functions, you will not be able to solve such a problem in a short time. You may be able to work out a complex logic, but you would not have the time in an exam situation to solve the problem like that. It is therefore important that you appreciate the use of these functions.
The third problem I had already solved; I do not think I spent much time in doing it. It was a simple problem, but I could see that people had goofed up in solving it. The problem was this: you are given an array which contains integer numbers, some of which may be repeated.
You are required to find the unique numbers in that array and put those unique numbers in another array. There are multiple ways of solving this problem. One is to sort the array in ascending or descending order. Now you start scanning the array; if any number occurs more than once, its copies will come together after sorting. So you can say, oh, this number has come five times, I will forget it; if a number has not come more than once, I will copy it. That is one approach. The other approach is that I take one element of the array and compare it with all the elements of that array. It is like sorting, but for each element I am doing the scan only once. I keep counting how many times that element occurs. If that number occurs many times, it is not unique. If that number occurs only once, and it will occur at least once, because when I compare any one of these elements with all the elements it will definitely match with itself, then I say it is unique and transfer it to the other array. This is not a difficult problem, but people have made mistakes in it. I have solved this problem and, as I have said, I will put it up on Moodle. I would like all of you to look at that solution. Maybe you would like to first solve it yourself under exam conditions, because I expect this problem three to be solved in about ten minutes' time, but if you take longer, please examine the solution and practise the approach yourself. You could use any other approach.
The bonus question was actually simple. It said: find the sum of the second largest and the second smallest numbers of this array B, which contains only unique elements.
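The counting approach to the third problem, described above, can be sketched as follows. The function name copyUnique and the array names a and b are assumptions for this illustration; here "unique" is taken, as in the explanation above, to mean a value that occurs exactly once.

```cpp
// Hypothetical helper: copies into b every element of a that occurs exactly once
// in a, keeping the original order. Returns how many elements were copied.
// b must have room for n elements.
int copyUnique(const int a[], int n, int b[]) {
    int m = 0;
    for (int i = 0; i < n; i++) {
        int count = 0;
        for (int j = 0; j < n; j++) {
            if (a[j] == a[i]) count++;   // a[i] always matches itself once
        }
        if (count == 1) b[m++] = a[i];   // matched only itself: unique
    }
    return m;
}
```

For an input such as 4, 7, 4, 9, 2, 7, 5 this should copy 9, 2 and 5, since 4 and 7 each occur twice.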
The condition was that you should do it with a single scan of all the elements. 90 percent of the people have sorted the array B, taken the second element from the top and the second element from the bottom, and found the sum. Of course the sum is correct, but sorting means multiple scans: for i equal to something to something, for j equal to something to something. That is a multiple scan, so that is not the correct solution. The correct solution is to maintain two variables, largest and second largest, and two additional variables, smallest and second smallest. Start with the 0th and 1st elements: the bigger of the two is largest and also second smallest, and the smaller of the two is smallest and also second largest. Now you scan the subsequent elements, changing second largest and largest, and second smallest and smallest, as appropriate. In a single scan you will get this. That is the only way to solve it in a single scan; you cannot sort. It must occur to you that sorting means multiple scans. So this need not have been a bonus question; it could have been a regular question. You lose marks because you are solving it correctly but not as per the condition. The condition says that the program must use only a single loop to scan all the elements. So, I will stop here.
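A minimal sketch of the single-scan idea for the bonus question is given below. The function name is made up for this illustration, and it assumes that the array holds at least two elements and that all elements are distinct, as stated for array B.

```cpp
#include <cassert>

// Hypothetical helper: returns (second largest + second smallest) of b[0..n-1]
// using one loop over the elements. Assumes n >= 2 and all elements distinct.
int sumSecondLargestAndSecondSmallest(const int b[], int n) {
    assert(n >= 2);
    int largest, secondLargest;
    if (b[0] > b[1]) { largest = b[0]; secondLargest = b[1]; }
    else             { largest = b[1]; secondLargest = b[0]; }
    // With only two elements seen, the roles simply swap for the small side.
    int smallest = secondLargest;
    int secondSmallest = largest;

    for (int i = 2; i < n; i++) {               // the single scan
        if (b[i] > largest) {
            secondLargest = largest;            // old largest is demoted
            largest = b[i];
        } else if (b[i] > secondLargest) {
            secondLargest = b[i];
        }
        if (b[i] < smallest) {
            secondSmallest = smallest;          // old smallest is demoted
            smallest = b[i];
        } else if (b[i] < secondSmallest) {
            secondSmallest = b[i];
        }
    }
    return secondLargest + secondSmallest;
}
```

For the array 7, 3, 9, 1, 5 the second largest is 7 and the second smallest is 3, so the function should return 10 without any sorting.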