Here is another test question. This test question is more complex. It deals with string processing, although we have not discussed string processing specifically and it was intended to be a session in itself. I would like to repeat, by the way, that this is not a course on programming. I expect all of you to know programming already. We are discussing how to teach programming, and today we are discussing how to evaluate programs, among other things. But this is an interesting, complex problem. I had set it when I taught this course in 2011. So here the question is: a text sentence is given as input. The sentence contains one or more commas. Write a program which combines any two words which are on either side of a comma and outputs these two as a single word. If there are any blank spaces present in the sentence, these are ignored. Please note that what is shown here is actually the answer to the test question; the original test question appears in a proper question form. Let me illustrate what the program is required to do. If you read the original test question, it actually gives you an example of this kind: some arbitrary sentence. What the question demands is, examine how many commas there are. Here is one comma and here is another comma. Now look at the word just before the comma and the word just after the comma. Combine these two and produce a single word: 'cametook'. Not that it makes any sense, but it examines whether the student is capable of looking at all the characters in a sentence, finding out where the comma is, forming the words, which are proper words separated by blanks, on either side of the comma, and then concatenating them to get this. So the output will be one word, 'cametook', and another word, 'camewrote'. If there was another comma here followed by something else, I would include that word also.
Suppose I have a comma but nothing after that; then I am expected to combine 'later' with nothing and produce the output 'later'. There are multiple possibilities, but effectively it means the following: analyze this sentence. Go through each and every character. Identify where the comma is and first separate out the phrases of the sentence around this comma. By the way, this process is called tokenizing or token extraction. In fact, in C there is a special string function called strtok, the string tokenizer. It takes the string and a set of delimiter characters as parameters, so it is able to isolate phrases which are separated by commas or white space; it need not be a comma, it could be any delimiter, which has to be prescribed as one of the parameters. We are assuming that the students are not familiar with the strtok function and that they will solve this problem by doing the actual donkey work of writing a program to read a string, analyze each character in it, and then separate out the words. I will not spend too much time in discussing this program, but I will just go over the program that I have written. I would expect you to look at this program and run it in the afternoon session. I will give you another interesting exercise immediately related to this particular program, but we will talk about it in a minute. So here, these are of course the standard things. The program defines a sentence, defines some strings for the word and the previous word, and defines positions: start of previous word, end of previous word, start of next word, end of next word. Please note that I need to identify the words on either side of a comma. Each word will begin at some position and end at some position, and I need to know these positions so that I can concatenate those characters and produce a single word.
Obviously, this is not a very straightforward or simple problem. I start with a cursor and I say the current position is zero. I will set these positions there. I have multiple positions here. As I said, I will not spend time in analyzing this answer. I will only say the answer works correctly. However, it is a task that you should do individually and collectively. Now, by the way, you have formed teams. So, I would expect teams to work together in the lab session. Working together means someone should examine and explore one set of programs, another colleague should explore another set of programs, do that for 15-20 minutes, and then discuss among yourselves what each one has found out, demonstrating the results of execution and so on, and perhaps coming up with some interesting alternatives or modifications. Suffice it to say that, this being a fairly involved solution, you are required to write a lot of code. You first search for a comma. So, in that sentence, from the current position, which you keep incrementing, you examine the character; if it is not a comma, you continue. That means you just skip character positions till you come across a comma. When you come across a comma, you identify that position and store it here. Please note the trick: cpos is equal to curpos minus 1. Why? Because while you are evaluating this, curpos was being automatically updated after every examination. So, when you actually found a comma, you come out of this statement only after curpos has been incremented to the next value. So, curpos as you come out is not actually pointing to the comma; it is pointing to the next position. That is why you subtract one from it and assign it to cpos, which represents the position of the comma. This is one quirk I am explaining. You will have to do something very similar at multiple points while writing this program.
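The off-by-one adjustment described above can be sketched in a few lines. This is only an illustration in Python, not the program from the lecture: the function name find_comma and the names c_pos and cur_pos are assumptions chosen to echo the positions mentioned, and the loop deliberately advances the cursor after every examination so that the cursor has already moved past the comma when the loop ends.

```python
def find_comma(sentence, cur_pos=0):
    """Scan forward from cur_pos for a comma.

    Returns (c_pos, cur_pos): the comma position and the cursor after the scan.
    Returns (-1, cur_pos) if the rest of the sentence has no comma.
    """
    found = False
    while cur_pos < len(sentence) and not found:
        found = sentence[cur_pos] == ","
        cur_pos += 1  # the cursor advances after every examination
    if not found:
        return -1, cur_pos
    # The cursor has already moved one step past the comma, so step back by one.
    c_pos = cur_pos - 1
    return c_pos, cur_pos


print(find_comma("He came, he sat on a chair"))  # (7, 8)
```

The second value can be passed back in as the starting position to look for the next comma.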
So, here is the segment which will get the word before the comma. There is a situation where the sentence itself starts with a comma. Of course, it is a stupid sentence; no sentence starts with a comma, but somebody deliberately wrote such a sentence to confuse your program. This program takes care of that: if there is no word before this comma, I set a flag saying previous word is 0. I will use this flag while concatenating the two words which I would have figured out from the given string. If the word is found, I will now look for its start point. So, I will do exactly similar processing and keep moving backwards. The first blank I notice is actually an indicator that I have gone beyond that word. So, I will come forward one position and say this is the starting point of that word. The end point I have already noticed, and it is followed by a comma. I will now have to do the same thing for the next word. So, here is the code to do that: get the word after the comma. Again, somebody might try to fool me by giving an input where the sentence itself ends in a comma. So, I determine whether there is any word after the comma or not. If there is, I will proceed further and find the end of that word, where I am now examining in the forward direction. I have located the first blank character after the next word. So, I have gone just beyond its end; I come back and examine it. Having done this, I will compose a single word from the two words found. How long would it take to write this program? To actually write this program, it may not take more than 15 minutes. But how long will it take to figure out the correct logic, including the special conditions that there may not be any word after the last comma and there may not be any word before the first comma? These extreme cases are often forgotten by the students, and we therefore award fewer marks to them when they miss out on such conditions.
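As a rough sketch of the index-based logic just described, the following Python function walks left and right from each comma to find the word boundaries. It is not the lecture's C++ program: the names combine_around_commas, start_prev, end_prev, start_next and end_next are assumptions modelled on the positions mentioned, and an empty string stands in for the "no previous word" flag.

```python
def combine_around_commas(sentence):
    """For every comma, join the word just before it with the word just after it."""
    results = []
    n = len(sentence)
    for c_pos, ch in enumerate(sentence):
        if ch != ",":
            continue
        # Word before the comma: skip blanks moving left, then walk to the word start.
        end_prev = c_pos - 1
        while end_prev >= 0 and sentence[end_prev] == " ":
            end_prev -= 1
        start_prev = end_prev
        while start_prev >= 0 and sentence[start_prev] not in " ,":
            start_prev -= 1
        # start_prev is now one position before the word, so add 1 to it.
        prev_word = sentence[start_prev + 1:end_prev + 1]  # empty if no word before
        # Word after the comma: skip blanks moving right, then walk to the word end.
        start_next = c_pos + 1
        while start_next < n and sentence[start_next] == " ":
            start_next += 1
        end_next = start_next
        while end_next < n and sentence[end_next] not in " ,":
            end_next += 1
        next_word = sentence[start_next:end_next]  # empty if no word after
        results.append(prev_word + next_word)
    return results


print(combine_around_commas("He came, he sat on a chair, he drank water and, read newspaper."))
# ['camehe', 'chairhe', 'andread']
```

The two extreme cases mentioned above fall out of the empty slices: a sentence that starts with a comma yields only the following word, and a sentence that ends with a comma yields only the preceding word.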
Such a test question examines the ability of the student both to think through a complex maze of, I would say, index computation, and to correctly keep in mind what the pattern of the sentence is going to be: what a comma here and a comma there is going to look like, how there may be spaces before and after the comma, and so on. Incidentally, such spaces in a text sentence amount to what is known as white space. That means whether there is one space or five spaces, semantically they are just a space; they are called white space. In fact, people who write compilers often ensure that the compiler, when it reads your program for compilation, simply removes the white space. Such programs, by the way, are handled much better if you have a powerful object oriented paradigm to use which permits handling of lists. A list could be a phrase or a sentence, and the list could be handled as a unit. It is almost like handling vectors or handling an array of arrays, except that you do not have to do complex programming. In what I have shown here, which as I said might be of interest to you, notice an interesting thing: the entire sentence is full of characters and there is a '\0' only at the end of that sentence. But a word is properly formed only if it terminates with a '\0', so for the word I have formed I have to supply it artificially. This was not an original word; I have combined two words to form a word myself. So, I put a '\0' here to terminate the string properly and then I output it. So, this answer is understood: slightly contrived, slightly complex, but given enough time I can write it. Instead of C++, I have also got a program written in a programming language called Python. So, here is the Python solution. The Python solution does not read any string as input, because it is just a sample that we got written; the sentence is fixed in the program: He came, he sat on a chair, he drank water and, read newspaper.
This is the sentence. What are we supposed to do with this sentence? We have to isolate phrases which are separated by commas, take the word on either side of each comma and concatenate. So, I should get 'came he', I should get 'chair he', and I should get 'and read' as the adjacent words which are concatenated to give me the result. The way this program is written is completely different. In C, or in what I have done even in C++, a character string is actually an array of characters, and I have to examine each and every element, which means each and every character of the string, in order to reach some conclusion. However, if I can treat such phrases as a list, a set of words as a list, then I can use the list processing concept. So, here is the way he starts it. First he creates an output which is set to empty. Then he gets a list created by saying string dot split on comma. This split is an extremely powerful member function, and it can split this entire sentence into multiple elements or multiple lists. So, this particular split will split this string into one, two, three and four pieces. Each piece contains several words. Each piece is identified by an index 0, 1, 2, 3, just like any C array index. Now these split strings go into lc. In Python, when you say for i in lc, it automatically determines how many elements lc has, in this case four, and therefore the loop will execute four times, with i taking each element of lc in turn: the pieces at index 0, 1, 2 and 3. And when I say i.split(), it means I want to apply the Python split function on that piece of what has been collected here. Once I do that, I append that to the output and print the output. We will see what the output is. So, that is what happens in this iteration, for i in lc, which executes these statements. Incidentally, in Python the scope of an iteration is determined only by the indentation.
So, if I have indented this, these three things form a block to be executed for this for statement. It is a very funny language, but an extremely simple language to use. The number of lines of code that you write for achieving the same function is very small, and therefore it has become a darling of programmers. My objective in showing you this particular example was twofold. One, to say that there are modern programming paradigms, such as object oriented programming or list programming, in which a more complex task, which takes more effort to program in conventional programming languages such as C, Pascal, what have you, will take much less time to write. Second, while we might be teaching C/C++ to our students in the first semester, it is very desirable that occasionally we give them examples of code like this from some other language. That is how the first year students should get introduced. They should become good programmers using the language which is used as the medium of instruction, but because they should become good programmers, they should be exposed to a multitude of such tools and components. Here is the last part, where the programmer starts with an empty olist; this notation means an empty list. Then there is for i in range(len(output) - 1); I will explain what len(output) minus 1 is. He simply appends the first word and the second word together to this growing list and just prints the output list. Here is what you would get as an output. The input string is this. Because I am printing during the successive iterations set up by Python, first I will get one element of the list: He came. Second, I will get he came, he sat on a chair. Third, I will get he came, he sat on a chair, he drank water and; and finally read newspaper. This is not written by hand. This is produced by the Python interpreter. So, these lists, 1, 2, 3 and 4, would be created automatically.
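The split-based approach described above can be reconstructed roughly as follows. This is a sketch based only on the spoken description, not the student's actual program: the function name combine_with_split is an assumption, output and olist echo the names mentioned, and the guards for empty pieces are an addition to cover a sentence that starts or ends with a comma.

```python
def combine_with_split(sentence):
    """Join the last word before each comma with the first word after it."""
    pieces = sentence.split(",")       # one piece per comma-separated phrase
    output = []
    for piece in pieces:
        # split() with no argument breaks on any run of white space.
        output.append(piece.split())
    olist = []
    for i in range(len(output) - 1):   # one result per comma, hence len(output) - 1
        before = output[i][-1] if output[i] else ""         # last word of this piece
        after = output[i + 1][0] if output[i + 1] else ""   # first word of the next
        olist.append(before + after)
    return olist


sentence = "He came, he sat on a chair, he drank water and, read newspaper."
print(combine_with_split(sentence))
# ['camehe', 'chairhe', 'andread']
```

There are four pieces and three commas, which is why the second loop runs over len(output) - 1 positions: each iteration pairs a piece with the piece that follows it.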
The second part, which produces this, is complex code, so I will not read it anyway. I do not pretend that we know Python, but I am mentioning this for those who are interested in extending their horizons; Python is not at all difficult to understand. So, what he is actually doing here is the concatenation of the word previous to the comma and the word after the comma, and it is this which, when correctly implemented, results in 'came he', 'chair he' and 'and read'. In fact, the problem is solved at this point, but he is trying to use this program to produce these three as part of a list. So, he puts them into a list and these three are readable. Before going to the tea break, I would like to tell you that the Code::Blocks environment that you have is also capable of executing Python code. In fact, there is a utility called Codepad which is available on the internet. It is online. You invoke Codepad, you type in Python code on a canvas that they give you, and they will execute it and give you the result. In fact, this is how my student did it; he is actually my staff member and also an M.Tech student. A student from VRC, Nagpur, I remember, and now VNIT of course, because VNIT is one of our remote centers. So, this Pushpukh Borange, I asked him to write a short program to illustrate object oriented features of C++. Instead he wrote it in Python, and later I discovered that the C++ library, the object library, rich as it is, does not have a built-in function equivalent to split. You have to do some donkey work to get the phrases split. In fact, on one of the forums, there is an open source implementation available of a class with a member function split to give precisely the Python-like facility for doing this.
I notice a hand raised at Vivekanan College of Engineering. Before starting, I would like to quickly go over and find out whether someone has a comment to make.
I am Malathi from Vivekanan College of Engineering for Manamakal.
So, in the morning session you described the evaluation of student projects and students' questions. In that, can I use a negative marking system for evaluating our questions?
Yes, very good question. Unfortunately, the auto graders are not yet intelligent enough to award negative marks, but in manual evaluation we often use negative marking, and in fact the marking scheme that we stipulate to our teaching associates often includes negative marks for some specific mistakes that people may make. You are very right. Negative marks can be included. Thanks for that question. Silliguri Institute of Technology, over to you.
Sir, I have a question. Last day you gave some assignments where we have to execute the sample program and take the input from a file, but we could not do that. We went to the Project option, Set programs' arguments, but after that the program did not execute.
In that Set...
We could do it using argc and argv, the argument count and argument vector, but...
No, in the Set programs' arguments option, there is a box below. Inside that box, you have to type less-than followed by the file name, and you have to ensure that that file is in the same directory. It works. It works in most cases. Just try it out, discuss it with a couple of colleagues and try it out. Otherwise, I will get some screenshots captured later and put them on Moodle so that you can try it at your place, but it works. It works all right. You can do both input redirection and output redirection.
Okay, sir. And do we need to change anything in the program code for that?
No, no, no. The program is exactly the same. Instead of your cin statements taking input from the keyboard, they will collect input from that file. It is exactly the standard input or output redirection that you see on the command line. Please try it again. It will definitely work. Thank you.
Xavier College.
My question is: in the morning we were discussing the automatic evaluation system. In this, the students will be writing the missing code. Is it possible to analyze the efficiency of the program there?
Very good question. In fact, a past student of the institute, one Mr. Vinay Kabra, a computer science student who passed out of IIT several years ago, has set up a company in Pune which has created a platform that evaluates not just the correctness of a program, but also how well the program is written. The objective of that system is of course very peculiar. You see, when companies like TCS and Infosys and others try to engage employees, suppose our students apply to them, and, as I said, 100 students or 1000 students apply, and maybe 100 people are able to write a correct program. Now, Infosys or TCS does not have time to interview 100 people. What they want to know is, out of these 100 correct programs, which are the better programs? And how do they define better? They should either be more efficient, or they should encompass all possible cases of different input values and handle them properly, so that for wrong input, proper error messages are given and so on. So, what this company's platform does is, it first gives the specification to the participants, saying, you write a program which will do this, this, this. Then they give one or two test cases so that the participants can test their program before submitting. After they submit that program, the people behind the platform have written something like 30 or 40 test cases, and these additional test cases evaluate whether the program is taking care of all boundary conditions and whether it behaves properly even if erroneous input is given.
They also have test cases which, for example, will test some algorithm that is implemented first with, let us say, 100 input values, then with 1000 input values, then with 10000 input values, thereby evaluating the efficiency of the program. What they do is give a gradation report saying that all these 100 programs work correctly, but out of these 100, these 5 are excellent, these 30 are good and the rest are ordinary. Then a company like Infosys or TCS will invite only those 5 jokers for interview; the others will be rejected. So, it is indeed possible to build such a system. In fact, there are a couple of our M.Tech students who are trying to build such a system internally. Very good question, and in fact such systems will be essential; they should be available at each and every college so that students can themselves submit their programs to such a system and get to know how good they are at programming. Thank you very much for this question.
Let us now go over to the next example here. I will go through this example very quickly because the problem is actually very simple, but in framing this question for a test paper I have tried to give a realistic, real life touch to this basic computational problem. The problem I have stated is balancing the load in two trucks. So, there are two trucks, each of which contains packages. Packages have different weights, and the two trucks have different numbers of packages also. We assume that all weights are integers: 23 kg, 17 kg, 94 kg, whatever. Now, obviously in most cases these two trucks will have different loads. This one may have so many kg in total; the other truck might have fewer or more kg. And the problem we pose is that we wish to balance the load in both the trucks.
More importantly, we want to balance the load in both the trucks by finding out whether it is possible to swap exactly one pair of packages between these two trucks to achieve this balance. That means I should be able to locate one package in one truck and some corresponding package in the other truck. By corresponding I do not mean of the same weight, because that will not change anything. But the weights are such that if I swap these two packages, then the trucks will contain exactly equal loads. So, the question is to write a program to (a) determine whether such exchangeable packages exist at all or not; it might not be possible at all to do so. Second, if it is possible, identify the weights of the packages which need to be exchanged. Now, this is the question given. So, notice that first of all the student will have to translate this question into some kind of a programming or algorithmic structure. What data structure should the student use? How should the student represent the weights in a truck? How should he frame the requirement that exactly one pair of packages has to be exchanged in order to balance the truck load? How exactly is the so-called balancing of the truck load to be represented in programming terms? Many people believe that these are not programming questions. I differ, and many of my colleagues differ. A very, very important part of the knowledge of programming is the ability to take a real life situation and map that situation into programming concepts. When we solve real life problems, people will not specify them in algorithmic terms: find the sum of n terms such that each term is greater than delta. This is not how a practical problem is posed. So, one of the important objectives of teaching programming is to train our students in the ability to take a real life wording of a problem, translate it into programming terms and write a correct program for it. So, this was one attempt.
Again, this is a question from one of my past offerings. This is how we should teach students to proceed with it. Initial thoughts on the program design. First, as I said, how do we represent the problem data? Now, since we require comparison of the weights of packages from each truck with the weights of packages in the other truck, the best way would be to read the respective load values into two arrays. So, I have one array representing package weights in one truck and another array representing package weights in the other truck. Now, we have to make an assumption that the maximum number of packages can be so many, because the moment we say array, the array has to be defined with a physical upper limit. Since we are talking about trucks and packages, probably 100 is a good number, but this choice is as good as any. We will use integer arrays A and B, because we have been given that the weights are integers, and they have m and n elements respectively. So far the program design has taken care of representing the data. Now, the representation of the problem in terms of an algorithm. Here we are saying that truck 1 has the total sum of the load as sum1 and truck 2 has the total sum of the load as sum2. What we are looking for is one package with some weight x in the first truck and a corresponding package with some weight y in the other truck. If y comes here and x goes there, then the total sums should match. Now, if x goes there and y comes here, the total weight in the first truck will be sum1 minus x plus y, and the total weight in the second truck will be sum2 minus y plus x, because y goes away and x comes in. It is actually simple once you write it, but I have seen many students goofing up in coming to this simple conclusion. This is absolutely essential in order to translate our real life problem into a program, but once you do that the rest of the program is fairly straightforward. Again I have tried to give more logic for the program.
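Written out as equations, the swap condition above can be rearranged to give the weight y that must be found in the second truck for a given x in the first. This is a worked restatement using the same symbols (sum1, sum2, x, y); nothing beyond the condition stated above is assumed.

```latex
\begin{align*}
\mathit{sum}_1 - x + y &= \mathit{sum}_2 - y + x \\
2y &= \mathit{sum}_2 - \mathit{sum}_1 + 2x \\
y &= \frac{\mathit{sum}_2 - \mathit{sum}_1}{2} + x
\end{align*}
```

Since all weights are integers, y can be an integer only when sum2 minus sum1 is even.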
For example, what should our program find? That there may exist pairs of elements of the type (x, y) such that x is an element of a and y is an element of b, and if these are swapped across the arrays, then the resultant arrays have the property that the sum of elements of the new array a is equal to the sum of elements of the new array b. Notice that this is a very precise and rigorous definition of what originally exists and what will exist after swapping of the packages. Now we look at the algorithm. One possible algorithm is to examine every element in array a, and for each such element, which represents some weight, let us say x, find if there is any permissible weight y in the other truck. That means you go through one array and take one element. Now go through all the elements of the second array and find out if there is a corresponding element y. So, iterate on array b to find whether b has an element whose value is y. If there is no y for this x, I ditch this element and go to the next element. This way, when I complete the entire iteration over array a, and for each element I have examined all the elements of b, if I still do not find a match I say sorry, the truck load cannot be balanced. But if I find one, well, I have found the solution. Of course there might be more than one correct solution, but the problem requires you to find only one solution. This is the part, by the way, on which students should be trained to spend a significant amount of time. Out of, let us say, 30 minutes allocated to solve this problem, we should train our students to spend as much as 5 to 6 minutes working only on this logic. Many students unfortunately simply rush into writing programs. Here is the program, which is actually straightforward. I will not go through the details; you can read it for yourself. As I said, again this has been loaded on Moodle. So this is of course a description.
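The search described above can be sketched directly from the definition: try every x from the first array against every y from the second, and stop at the first pair that equalizes the sums. This is an illustrative Python sketch, not the program on the slides; the function name find_first_swap and the sample weights are assumptions.

```python
def find_first_swap(a, b):
    """Return the first pair (x, y), x from a and y from b, whose exchange
    makes the two totals equal, or None if no such pair exists."""
    sum_a, sum_b = sum(a), sum(b)
    for x in a:                 # every package weight in the first truck
        for y in b:             # compared against every weight in the second truck
            if sum_a - x + y == sum_b - y + x:
                return x, y     # the problem asks for only one solution
    return None                 # the truck load cannot be balanced


print(find_first_swap([10, 20, 30], [5, 25, 40]))  # (20, 25)
print(find_first_swap([1, 2], [10, 20]))           # None
```

In the first call the totals are 60 and 70; exchanging 20 and 25 leaves 65 in each truck.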
Once again it is useful to emphasize to students that, although in an exam setting with a fixed time they may not be able to spend so much time in writing comments, it is a good practice to write in-line documentation for every program, including the problem description, major steps and so on. So I declare these arrays and define the necessary variables like sum1, sum2, x, y and so on. Observe that first I will have to read the input, of course. So I collect the number of packages in each truck and read the weights. This is just the entire input section. After that I find the sum of elements of each array, because that is what I have to balance. So sum1 is found by one iteration and sum2 is found by another iteration. I just output those values. Now I do a curious thing here. I find the difference as the absolute value of sum2 minus sum1. If this difference modulo 2 is not equal to 0, I conclude that the difference of the sums is not even, and I therefore say the desired pair of packages does not exist. This is an interesting observation, and anybody who makes this observation is actually a better programmer, because the student is able to figure out that no solution is possible unless the difference is even. Why? Because it has to be divided by 2, and the divided value has to be an integer because all weights are integers. So this is an interesting observation. A colleague just mentioned negative marking. Now, it is unfair to give negative marks if a student does not include this check, but we treat such additional insight as worthy of bonus points. So we give plus 1 or plus 2 points on occasion. In our marking system it should be possible for some student to get 12 out of 10 marks for a question because the answer has been written exceedingly well. So this was to illustrate that point. The rest of it is technique. The basic algorithm is stated here; these are the equations.
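The parity check above can be combined with a search that looks for one specific target weight instead of testing sums pair by pair. Equating sum1 - x + y with sum2 - y + x gives y = x + (sum2 - sum1) / 2, which is an integer only when the difference is even. The sketch below is an assumption-laden illustration rather than the lecture's program: the name find_all_swaps, the list names t1 and t2, and the choice to return index pairs are all hypothetical.

```python
def find_all_swaps(t1, t2):
    """Return every index pair (i, j) such that exchanging t1[i] with t2[j]
    balances the two trucks. An empty list means balancing is not possible."""
    sum1, sum2 = sum(t1), sum(t2)
    diff = abs(sum2 - sum1)
    if diff % 2 != 0:
        # An odd difference cannot be split into two equal integer halves.
        return []
    half = (sum2 - sum1) // 2   # exact, because the difference is even
    pairs = []
    for i, x in enumerate(t1):
        y = x + half            # the only weight in t2 that can pair with x
        for j, w in enumerate(t2):
            if w == y:
                pairs.append((i, j))
    return pairs


print(find_all_swaps([10, 20, 30], [5, 25, 40]))  # [(1, 1)]
print(find_all_swaps([1, 2, 3], [4, 5, 6]))       # []
```

The second call returns early: the totals are 6 and 15, the difference 9 is odd, so no single exchange of integer weights can work.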
So 2y is equal to this, and therefore y is equal to (sum2 minus sum1) by 2, plus x. This formula, incidentally, is important to derive, because we are going to look at every element of array a, which we represent by x, and for each x we are going to look for y. So you must have a formula: if there is some x, how do we calculate y? Once we calculate y, we will look for y in the second array. So, as I said, the rest of it is fairly straightforward. I run an iteration from i equal to 0 to m minus 1. I find the value of y which is a possible replacement for t1[i]; y is equal to this. Now I search for y in array b. I search like this. If I find something, I say t1[i] is to be exchanged with t2[j]. Please notice that this loop still keeps running, and it actually finds all possible pairs, although that is not exactly what is asked for. However, if nothing is found, my found count, which increases every time a pair is found, will remain 0 and I say sorry, balancing not possible. So notice that the solution is not very complex, but it needs the ability to think through a physical real world problem. The problem may be slightly artificial, because you do not balance trucks by matching exactly one pair of packages. I mean, you would balance them by taking any number of packages from here and exchanging them with any number of packages there. Indeed, that could be an extended problem, and it is a harder mathematical problem to solve. However, this is a good test problem even so. There is sample load data given. This is in the slides. The slides are also uploaded. You can run this program and try any variations thereof in the afternoon lab. Here is another example, which is a slightly harder problem. If x_1, x_2, ..., x_n is a sequence of integers, possibly negative, then for each possible contiguous subsequence x_i, x_{i+1}, ..., x_j consider its sum, called S_ij. Now write a program that reads in the sequence in order, with n given at the beginning, and prints out the minimum sum S_ij over all possible such subsequences. It is a well known mathematical problem.
I will not spend time on this, but to write a program you can go the donkey-work way by using two iterations, or there is a very neat mathematical technique to find the solution. A particular solution was given to me by professor Divaan. Of course it is a well known mathematical problem, and he is a wizard of algorithms at IIT Bombay, and therefore he just told me off the cuff that this is the solution. Namely, we calculate s_i simply using these rules: s_i is a_i if s_{i-1} is greater than or equal to 0, or it is a_i plus s_{i-1} otherwise. Initially s_0 is set to a_0. Here a is obviously the array of the given elements, and s is the array of partial sums that we calculate up to any point i; the smallest value in s is the required minimum. So, as I said, I will not dwell on this.
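Both routes mentioned above can be sketched briefly. The first function applies the recurrence as stated, keeping only the latest s value and the smallest one seen so far; the second is the two-iteration version, useful as a cross-check. The function names and the sample sequence are assumptions made for illustration.

```python
def min_subsequence_sum(a):
    """Minimum sum over all contiguous runs of a, using the recurrence
    s_i = a_i if s_(i-1) >= 0, else a_i + s_(i-1), with s_0 = a_0."""
    s = a[0]
    best = s
    for x in a[1:]:
        # A non-negative running sum cannot help a minimum, so restart at x.
        s = x if s >= 0 else x + s
        best = min(best, s)     # the answer is the smallest s_i
    return best


def min_subsequence_sum_brute(a):
    """The same result with two nested iterations over start and end points."""
    best = a[0]
    for i in range(len(a)):
        total = 0
        for j in range(i, len(a)):
            total += a[j]       # total is now the sum of a[i..j]
            best = min(best, total)
    return best


data = [3, -4, 2, -3, -1, 7, -5]
print(min_subsequence_sum(data), min_subsequence_sum_brute(data))  # -6 -6
```

For the sample data the minimum comes from the run -4, 2, -3, -1. The recurrence needs a single pass, while the nested version examines every start and end pair.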