I am going to discuss the current lab assignment, which I have noticed many people are finding very difficult. I am of course not giving you the solution; all of you will have to figure out how to solve it in C++. I am just going to show you an alternative, simple solution. We will then discuss pointers and, with respect to pointers, revisit function calls. Last time we briefly discussed calling a function by passing parameters by value; the alternative is to pass parameters by reference, and we shall discuss that. If time permits, we shall take forward last time's discussion of character arrays and some of their features. There are several important announcements to be made, so I will close the lecture at least 5 minutes early.
The lab assignment for this week was that mid-semester exam marks were entered in a text file. These are some sample records. You will be familiar with them, since they have been put on the course Moodle, unless you have not yet seen the lab assignment. The idea here is that when you have large data, the people who enter that data into the computer are completely different people, typically data entry operators. In this case I got the data entered by my staff. They don't know the details of what a roll number means, what a batch number means, and so on. They entered the data in a spreadsheet as diligently as possible, but they made several mistakes. From that spreadsheet I extracted the data into a comma-separated record structure, which is what you see here. For example, if you take this record, there is a serial number followed by a comma, a roll number followed by a comma, a name followed by a comma, a batch number followed by a comma, and the mid-sem marks. This is the general structure, but you will notice that in some cases a field value is missing. For example, the name is missing here.
The batch is missing here. People who were absent were given negative marks; since different people entered the data, some entered -2 and some entered -1. The majority of the data is correct, however. The problem for the lab assignment was: given this data, find the batch-wise average marks and the class average. This is one of the assignments. Your first target is to read this data correctly. The second target is to separate out the components corresponding to roll number, batch, and so on, and then to total the marks by batch.
What I am going to discuss today is a simple method to find the mid-semester average marks for lab batches using a programming language called AWK. The name stands for the initials of the three people who invented it at Bell Labs in the 1970s: Alfred Aho, Peter Weinberger and Brian Kernighan. Has anybody heard at least one of these names, Kernighan? Yes, Kernighan and Ritchie were the creators of the C programming language, from which C++ subsequently came. All of them are great computer scientists. There is also a beautiful book written in the 70s, The Psychology of Computer Programming by Gerald Weinberg, and the observations he makes about the human aspect of programmers are still valid today. If you get a copy, you should read it.
However, we are talking about this programming language AWK. It was designed to be a very simple interpreted language which makes heavy use of the string data type and of associative arrays. You remember I mentioned associative arrays when we were discussing the image: we were looking at the pixel values and at the spread of the pixel values that we calculate. We used associative arrays there. These are similar associative arrays, but this time not with an integer index; we shall see what that means. AWK also makes heavy use of regular expressions.
We have not yet seen what regular expressions are; for the time being, consider them to be conditions on the data. There are some inadequacies in this language, which led other people to develop a new language called Perl. However, we are looking at AWK. Let me quickly teach you AWK in 5 minutes; this language really can be learnt in 5 minutes, given the syntax and semantics you already know from C and C++.
This is a language for processing text files. A text file is considered to be composed of lines: each line is a record, and a text file is record after record after record. Observe that the data file you have to handle is exactly like that. There is one line per student, and there are 850 student records. AWK treats the file as a sequence of records, and by default each line is a record. It then breaks each line into what it calls fields. As you can notice, in our data there are 5 fields; we shall see the details. What AWK is capable of is to read line after line, record after record, apply some pattern, that is, some condition, to each record, and, if that condition is satisfied, execute an action that we have specified. If the condition is not satisfied, nothing is done with that record. It simply keeps applying itself to each record as it comes. You can specify a series of conditions, which are called patterns, and for each pattern you can specify an action. The way in which you specify patterns and actions is rather simple.
So let us look at some details of how AWK analyzes each record as it reads it. In our case each record is something like this. The fields are separated by commas here, but they could have been separated by something else. They could have been separated by a vertical bar.
They could have been separated by any symbol which does not normally occur within the value of any field. After all, you want the separation to be indicated to AWK, or for that matter to any programming language; without such separation you cannot deal with that data anyway. AWK refers to each field by a dollar symbol followed by a number: $1 means the first field, $2 the second, $3 the third, $4 the fourth and $5 the fifth. If there are six, seven or eight fields, there will be $6, $7 and $8. Whenever AWK reads a record, it automatically breaks it into these fields. The ordinary field separator, by the way, is a blank. We cannot use that, because there will be a blank within the name, and we want the name to be considered a single field, not two separate fields. So we introduce an artificial separator, and we have to tell AWK that the field separator in our file is a comma and not the ordinary blank. With this notion, AWK separates out the fields as it reads records and assigns values to $1, $2, $3 and so on. So $1 will be 13, $2 will be 09D07010, $3 will be Guru Raj Sureshwar, $4 will be 7A and $5 will be 44.5, which is the highest score in the mid-sem. Is Sureshwar here? Guru Raj Sureshwar? Yeah, let's give him a big hand. Well done. I promise that I'll compile a list of all the top scorers. I have got lists from the TAs, but some TAs have said that the crib sessions are still continuing.
Anyway, what do we want to do with such records? Remember the specification: people who were absent have been given negative marks. Those people are to be ignored in our analysis, but everybody else has to be considered. So we figure out a simple pattern: $5 < 0. If the value here is less than 0, we have to neglect that student. Otherwise, we have to consider that person.
One action we have to take if this pattern is found: increment the count variable for absent students, so that we can count the number of absent students. On the other hand, if this condition is not satisfied, that is, if the pattern $5 >= 0 holds, then we should take some other action. What is that action? Increment the batch counts, increment the mark totals and, at the end, print the accumulated results. This is roughly the statement of the problem.
Let's look at how AWK solves it. This is the AWK script, followed by the execution results. This is half the program; the remaining half is on the second slide. And that's it. This program does everything. Let's look at what it does. It has no declarations, nothing. In AWK, whenever you use a variable, it is assumed to be of whatever type is required when data gets into it. By default it is a text or string variable, but you can put numerical values in it and AWK will treat them as numerical values. You don't have to declare arrays either. The first use of an array creates it internally, and an element is initialized to numeric 0 if numeric values are going into it, or to a null (empty) string if a string value is going into it. So you see simplicity at its best.
Look at the first slide. When the AWK interpreter reads my program, it will see $5 < 0, and then absent_count++. This is like C syntax. Notice that we did not declare absent_count as an int or anything else. AWK automatically creates space for it and increments it whenever a record matches this pattern. There will be a few records which match this pattern. Most, however, will match the second pattern. And what does AWK do when it encounters a record matching the second pattern? First, it updates one count variable. Obviously, this count is the total number of students who wrote the mid-sem exam, because the absentees have been separated out.
A record with negative marks will never come here. So count++ will, at the very end when all records have been read, give the total count. Then count_marks += $5. Remember this syntax? It means count_marks = count_marks + $5, where $5 is the fifth field, the marks. So for every record, the marks which have been extracted here are added to a variable called count_marks, which AWK will have created automatically.
The key is here: batch_total[$4] += $5. Remember, I don't need only the total for all the students together; I need batch-wise totals. Now, how many batches are there? I know there are 40 batches, but AWK doesn't know that. AWK will determine the batches from the records it reads. Suppose the first record read is from batch 0A. AWK will create an array called batch_total and index it with 0A, not with 1, 2, 3, 4, 5. Suppose the second record read is from batch 5D. It will then create a second element of that array, call it 5D, and add the marks to it; all array elements are automatically set to 0 when they are created. Suppose the fifth record again has batch 5D, which is already there. AWK will remember that the 5D element has been created, go to that element and add to its value. Consequently, when all the records have been scanned, AWK will have the correct batch-wise totals in this array. It will also have the number of students per batch in another array called batch_count. This is an array which you are obviously incrementing by one each time: 1, 2, 3, 4. Why? Because you want to find the average. For every batch, the average is found by taking the batch total and dividing it by the number of students in that batch. What better way than to use an associative array where the index is the batch code itself? Notice that if there are 500 different batch codes, 500 elements will be created.
If there are 5 batch codes, 5 elements will be created. How many batches do we have? 40. Okay: 0A, 0B, 0C, 0D and so on. But wait, we just saw a goof-up. Let's go back to the previous slide, and one more. Here we had a person, Shalind Sabha. Is he there? Shalind Sabha, what is your batch? Okay, fine. This is not his mistake, by the way. He had correctly written the batch on his paper, but my staff did not enter it properly. So, for no mistake of his, he gets eliminated from the statistics calculation. Since such data may exist, this array will have not 40 elements but a 41st, because AWK will think that blank is a batch code; it will allocate an element of the array to the blank batch code and count into it. If there are 2 or 3 blanks, all of them will be grouped together, and their average will be calculated. Obviously, this is not the exact result, but it is very close to the exact result.
So the computations are done by just this one line for absent students and these 4 statements for all students who attended. Then there is a new pattern called END. The END pattern matches the end of the file. What it effectively means is that when the file has been completely read and there are no more records, AWK does whatever is stated as the action for the END pattern. So AWK will now come here and do all these activities. What is it doing? for (i in batch_count). Notice that we write for (i = 0; i <= something; i++) when the index is an integer. Is the index an integer for batch_count and batch_total? No. The index is the batch code itself. We do not in fact know whether it is an integer or text or whatever. We do not know how many elements there are, but AWK knows. So it provides a simple syntax.
for (i in batch_count) means: for as many different elements as you have created in the array, that is, for as many unique indices as you have, go over all those indices one by one. So i will be equal to 0A, 0B, 0C, 5D, 6C, 1B, et cetera. Not necessarily in sorted order; AWK does not promise any particular order. For each value, it does the following: print i. Notice that i is not an integer, so it will actually print the batch code. Then it prints the batch count, that is, how many students are in that batch, and then the batch total divided by the batch count, which is the average. It does that for every element in the array. At the end, for good measure, we ask AWK to print the total number of students, which is count + absent_count, the number absent, and the class average. That's it; the program ends.
To execute this program, I give a simple command like this: nawk, or new AWK, which is the current version of AWK, then -F",". The capital F indicates the field separator. Remember, AWK takes a blank as the default separator, so if I have blank-separated values I don't have to give -F at all; but if I have any other symbol, I can state it here. So -F"," means comma-separated fields. Then -f with a small f means that the entire AWK program you just saw is available in this file, AnalyzeMidSem.awk; it could be any file name you choose. This will ordinarily work on the standard input, but when I write < MidsemMarks.txt, which is your marks file, the entire AWK script executes on that file. When it executes, it produces this result. Observe that the first record of the output corresponds to the blank batch, and our friend is in that batch. Because there is only one student there, the average is just that student's marks.
Had there been three or five such students, it would have been their average. All subsequent records, however, give you exactly the batch counts and batch averages. So batch 0D has 20 people with a batch average of 24, batch 0B has 15 people with an average of 30, batch 2C has 21 people, and so on: 3A, 2D, whatever. Notice that they are not necessarily in sorted order. And finally, after batch 0C, which happens to come last, the total number of students is so many, the number absent is so many, and the class average is so much.
Do you like this programming language? AWK is so simple and works so well, so why don't we use it? Why do we have to learn complicated languages like C and C++? Why do we spend an entire semester learning a complex thing when in five minutes we can learn a language and in 20 minutes we can start using it? Remember the quiz I asked you in one of the early lectures: how many programming languages are there? Most of you could not guess correctly. As I had said, there are more than 600 programming languages. Some 100 of them are used by a fairly good number of people, about 20 by a large number of people, and about 2 or 3 or 5 by a very, very large number of people. And there are reasons for that. One reason for the lesser usage of AWK is that, while it is superb for problems like this one, it has limited capability to handle data of all kinds and to implement all the functionality needed to solve general computational problems. Try to implement multi-precision arithmetic, say 100-digit multiplication or addition. Try to handle complex decimal numbers. Try to handle abstract data types such as complex numbers and geometric figures, which object-oriented languages permit, as we shall see. So it is good for a certain type of activity, but it is not good enough in general, and that is why we generally don't prefer it for generic purposes. The second thing is that it is an interpreted language.
We shall see the difference between an interpreter and a compiler later in the course. Suffice it to say that when you invoke a compiler, as when you say c++ something.cpp, it produces machine instructions in the native form of the machine, and those machine instructions execute very rapidly. An interpreter does not create object code or machine code. Observe that for your C++ program, after saying c++ something.cpp, you are required to execute the resultant machine code by saying ./a.out. That a.out is nothing but executable code. No such code is produced here. Now, whether you compile beforehand and then execute the code, or you interpret, which means you translate on the fly, why should it make a difference? All the statements have to be executed anyway. The simple answer is this: suppose an iteration has to execute a million times and there are four instructions within it. In a compiled language, all the instructions are compiled once. But AWK will interpret each statement every time, a million times over. This is a simplified picture, but it is one of the reasons why AWK is not very popular: operationally it is less efficient. I spent this time just to make you appreciate that data analysis problems can typically be solved easily with such programming languages. However, since you and I are destined to discuss C++ and its features, we'll go back to our C/C++.
I will continue expanding on the notion of pointers, which I introduced briefly last time. Assume that the machine is byte addressable. Then the addresses of consecutive data objects in memory will differ by four if an integer or a floating-point number occupies four bytes. If something occupies eight bytes, they will differ by eight. If something occupies one byte, like a character, they will differ by one.
However, we will generally not be required to know how many bytes a particular item occupies, as we shall see. This is merely a sample layout, assuming an arbitrary value of 10000 for the address of m, which contains 573. n will contain -1234567, and if n has been allocated memory in the location immediately after m, its starting address will be 10004. Suppose there is an array a of three elements. If the allocation is consecutive, a[0] will go to 10008, and a[1] will then necessarily go to 10012. For separate variables these addresses need not be consecutive; the C++ compiler can place variables anywhere in memory. For an array, once the base address of the first element a[0] is decided, all subsequent elements necessarily occupy consecutive storage, and that is why pointers become extremely important and fruitful when dealing with arrays. So just remember this: m has the value 573 and n has the value -1234567.
We can declare pointers, which store the addresses of locations so that they indirectly point to the data values. The declaration is done like this: int *ptr1, *ptr2. ptr1 is a pointer, but one which will point only to integer objects. It will not point to a character, double or float object. If you want that, you will have to say float *ptr3, string *ptr4, and so on. The way you assign a value to a pointer is by using what is known as the address-of operator. If m is a variable, m itself cannot be assigned to a pointer, because a pointer contains an address while m contains a value. But when you say &m, that is, when you prefix any variable with &, you mean the address of that entity, not its value. So with ptr1 = &m the address of m is calculated and stored in ptr1. Similarly, with ptr2 = &n the address of n is calculated and stored in ptr2.
If we go by the storage allocation you have just seen in the sample layout (remember, these are arbitrary allocations), then with the value 573 at 10000 and -1234567 at 10004, I can access these values not only by referring to them as m and n; because I now have pointers in ptr1 and ptr2, I can also access them through the pointers. The way I refer to them is to say *ptr1. If I just say ptr1, it means the value 10000. If I say *ptr1, it means the value inside the location pointed to by this pointer. So it is an indirect access. *ptr1 effectively means m, if that is the address I assigned earlier. So I can cout *ptr1 and *ptr2. I can even print ptr1 and ptr2 themselves, but they won't make much sense, because the actual memory allocations depend on the memory addressing scheme of the particular machine and so on.
Still, let us look at an example. I have here m, n and a three-element array a. I have assigned m and n exactly as in the example. I now take two pointers. I print m and n just to know what I will get. Then I assign the address of m to ptr1, and I get ptr2 by simply incrementing this pointer by 1. I hope that ptr2 will point to n, because I have declared m and n consecutively, but I do not know that. Just to verify, I print the pointer values and the corresponding data values, which are *ptr1 and *ptr2. Let us see what happens when I execute this program. Are you with me on this? This is very simple stuff.
What I get, however, is very interesting. This is the name of the file, pointer example 1; I will put all these files on the website. When I execute this program, the values of m and n are printed as we asked. The pointers have these values. It is interesting to note that address values are usually maintained and printed in the hexadecimal number system. What does hexadecimal mean? The base is 16. The 0x is not part of the number.
0x stands for hexadecimal; it is a qualifier. If you just wrote bfcaf8ac, it could also be a character string; 0x means that what follows is a hexadecimal number. We need not convert it into decimal to understand the value. Suffice it to say that this is ptr1 and this is ptr2. By how much do these two pointers differ? This one ends in ac. What are the next values? ad, ae, af and b0. Observe that I added 1 to the pointer, but the difference I find in the address is 4 bytes. Why? Because the pointer was an integer pointer and an integer occupies 4 bytes. If it had pointed to a data item occupying 8 bytes, I would have seen a difference of 8. So I do indeed have consecutive locations printed here as pointers.
However, when I look at what is printed for *ptr1 and *ptr2, I am disappointed. *ptr1 has 573 all right, but *ptr2 has -1077217072. Does this look anything like -1234567? No. What happened? Quiz. Did it add some number? Subtract some number? Multiply by some number? Should we have added not 1 but 4? What has gone wrong here? Why am I not getting the value of n? Any idea? The correct answer is that I am not getting the value of n because ptr2 is not pointing to n. Let us go back to the previous program. When we added 1 to ptr1, we assumed that m and n had been allocated consecutive locations. That is an assumption. Remember what I said: array elements will always be allocated consecutive locations, but separate variables the compiler may keep anywhere. It so happens that this C++ compiler does not place n next to m. It puts n somewhere else, and that is why we are getting something arbitrary.
I would also like to point out that if this location had, incidentally, contained the value -1234567 at that moment, you would have got that printout and you would have believed that yes, by incrementing a pointer I always get the next of my variables. That is incorrect. Luckily, this output shows that the assumption is not correct.
So we do a variation. What do we do this time? Same program, but just as we assigned &m to ptr1, instead of adding or subtracting I assign &n explicitly to ptr2. Now there is no doubt: ptr1 should point to m and ptr2 should point to n. When I execute this program, I indeed get 573 and -1234567. However, observe the values of the pointers, forgetting the earlier part of the address. This one ends in 2C all right, but this one ends in 2A. It is a completely different location, and it comes before the other location, not after it. So we have to remember that compilers are at liberty to allocate memory as they like, and therefore we should never insist on knowing, or doing anything with, the absolute values which pointers contain. All that we can be sure of is that pointers will contain consecutive values provided they point to elements of an array.
Here is an example. Continuing with the same problem, this time I assign the values 573, -1234567 and 94 to my array elements a[0], a[1] and a[2]. This time I have 3 pointers: ptr1, ptr2 and ptr3. I do what I wrongly did in the first example. I assign the address of a[0] to ptr1, then I add 1 to ptr1 to get ptr2 and add 1 to ptr2 to get ptr3. This time I am confident that I will get it right, because I am dealing with array elements and I know that array elements always occupy consecutive locations. Just to make sure, I print the pointers and the corresponding data values. I expect to get the same data values: 573, -1234567 and 94. Let's confirm. Indeed, these are the original values.
These are the pointer values, and the corresponding data values are this, this and this. Correct values. Just to show you that different memory locations may be assigned every time I run my program, I am running this program once again. I forgot to put a newline character, but notice that the pointers are different when I run it a second time. Last time a[0] was at 5C; this time it is at 1CC. This was B60; it is now 1D0. Every time I execute the program, the operating system loads it into whatever memory is available at that time. It assigns some base address to the whole program, and the compiled program automatically works out where the different data elements are to be kept with reference to that base. This is an important understanding that we should develop. Is this clear?
Now we come back to the passing of parameters to functions. So far we have seen parameters which are passed as values, and we know that the only thing we get back is a single value. If I had an arbitrary function like 5x + y modulo 10, I could write int f with the parameters int x and int y; inside, the result is calculated and returned. To use this function with integers a and b and get the function value in c, I read a and b, say c = f(a, b) and output c. This is well known, standard stuff. The point is that you are giving two values, a and b, to the function, and you are basing your further calculations on the resultant value which the function gives you. The type of the resultant value is defined here; it could be int, it could be double, whatever.
What do we do if we need to get back more than one value upon return? Consider this. I want to swap the values of two variables x and y. If I want to swap these values, I will ordinarily write a program like this: int x, y, and I will have a temporary variable.
I put the value of x into that temporary variable, the value of y into x, and finally the value of the temporary variable into y. This is the standard tactic for swapping numbers. What is the result? There is no separate result. x becomes y and y becomes x. That's all.
However, if I wish to write a swap function, I might try writing it like this: int swap(int p, int q), with p and q corresponding to x and y. I require a temporary variable; I call it int t, declared inside the function. Now I do the same thing: t = p, p = q, q = t. What should I return? Well, there is nothing to be returned. I might as well return 0; it doesn't matter what value I return, because no return value is expected. I expect this function to make p become q and q become p, because the way I will call it is to say int x, y, read x and y, then swap(x, y). Will this swap x and y? The answer is no. That is because when we invoke swap, the value of x is copied and given to the function, and the value of y is copied and given to the function. The function operates on copies of the values; the original values of x and y remain intact. We don't want this. Even though the function swaps the copies, the originals remain as they are, because the resultant values are not copied back. So whenever we want multiple values, or even a single value, to be brought back through a parameter, there is ordinarily no mechanism to do that. That is where we use pointers.
This is a program which does not work: try to swap. I have written exactly the same thing, one part after another. This is the main program and this is the function. This is call by value, because x and y are transferred to the try-to-swap function as copies of their values. The values of x and y are passed to the function, which initializes its local variables a and b with them. Updated values do not get back.
Aside from the function's computation and parameter passing, notice that I have used the word void here. void means that the function does not return any value. Obviously, if I want to swap variables, there is no computed result to be returned. You will invariably use the word void to indicate that no value is returned. That is why a bare return is used, not return 0, return 1 or return something; nothing is to be returned here. This is, incidentally, an additional piece of information.
Now, pass by reference. The motivating question is: how do arrays passed to a function come back with modified values? Remember the function we wrote to read data from a file? It was a separate function. We sent an array to that function, it read the data from the file into that array, and when the function came back to us, we had the array containing the correct values. How did the array bring back the values? Why can't simple variables bring back values? It is because the array is not copied. Instead, the array name is automatically treated as a pointer to the first element of the array and passed on as the parameter. So what is passed is a pointer. Whenever you are dealing with an array inside a function and you modify any elements, please remember that the original elements of the array are being modified, because what the function works with is only addresses. For the simple reason that it is very costly to copy entire large arrays, people said: let's pass them by pointers. However, you can pass other parameters by pointers as well, if you want to get values back into your calling program.
This is a program that works. I say void swap all right, but the parameters are not int a, int b. The parameters are int &a, int &b. Remember what &a and &b mean? Address of a, address of b. And then of course I work as usual: temp = a, a = b, b = temp, and so on. Calling-wise, when I say swap(x, y), is this correct?
Okay, spot the error; that is homework. Yes, you should not pass the values; you should pass pointers to them.

I am saving time because I have several important announcements to make. First, once again, the last call. Most of you have submitted your four-member teams to your lab TA. If you have not done that, this is your last chance to do so, because in this format I must get all the information. Remember that a fifth person is allowed only for some batches, and that too only when all students of that batch are accommodated. I understand that there are a few students who are moving around trying to figure out which batch they can join; since they don't know anybody in that batch well, nobody wants to keep them in their team. What I will do is this: at the end of this Saturday, on Sunday when we meet as per the scheduled meeting, all such people will be force-allocated to some arbitrary batches. So it is better that you figure it out and attach yourselves to the batches so that there is some coherence, if possible. The schedule for this meeting, once again I repeat: 4th October, 10 to 12 noon. This is in the case it auditorium. The case it auditorium has a capacity of 210. There are going to be exactly 200 teams: 4 persons per team, 5 teams per batch. So there are 40 batches and therefore there will be 200 teams. I want exactly one person from every team. It is possible that the team leaders may not be free tomorrow. I have come to know that some people have NCC or something like that. In that case, please send anybody from your team of 4 or 5. In case all 4 or 5 are going to do some parade on this Sunday, please send some other friend with a placard saying, I belong to this batch for today. But I want every batch, every team, to understand what projects they are going to do. There will be 4 projects, which will be described briefly between 10 and 12, and between 12 and 12.30 each team will have to give a choice through the batch coordinator. 
So for every batch, the batch will give its choice of projects 1, 2, 3, 4 in whatever order you prefer. And you will of course get one of those. The idea is that on any one day between 6.30 and 8.30, all students assembled in that lab will be working on these 4 projects. The next evening the same 4 projects will be done by other batches, and so on. I will explain the project concept later. But this is important and there must not be a single failure. This attendance is, for once, compulsory. But do not at all have 2 people from the same team coming into that hall, because apart from these 200 people, about 40 lab TAs, who are your TAs, are also going to attend. Several of you will have to sit on the staircase, but we can accommodate up to 240 people, so that's not a problem.

This is another important notification. The Department of Computer Science is collecting feedback on all CS courses as to how they have been conducted so far, from the beginning up to midterm. This is anonymous feedback. So your name, roll number, etc. never get recorded in that feedback. However, at the time of logging in to give your feedback, it is checked whether you have already given feedback, whether you are a valid registered student, etc. Please note this site: http://10.105.12.100/feedback. So just open a browser, go to this site, and you can give your feedback. This is extremely important for us teachers. For example, I do know several shortcomings which I get to know through interaction with some of you and with my TAs, but there are several other shortcomings which I may not know. There may be a few good things happening in the course which I may not know. The feedback is vital, and rather than wait till the entire course is over, the department thought it fit, and very rightly so, that we should collect feedback for every course that the department conducts so that teachers have a chance to correct the direction in which the course is moving. 
So please do that. I said before Monday; Monday is the extended deadline. I did not know that you people had not been made aware of this feedback. What happened is that the mail was circulated within the department. So the second year students, third year students, M.Tech students, everybody knows, but since first year students do not belong to one department, in fact they belong to many departments, they did not know about it. So please do fill in this feedback. It is absolutely vital and important for us.

The last announcement. This will take about 3 or 4 minutes because it is a very sad announcement that I am making. I had thought that I would never have to make this announcement. You remember that at the beginning of the course I said that people will come with varied backgrounds, people will come with varied skills and talent, and there would necessarily be people who will understand programming very quickly and very well and some others who may not understand it very quickly and very well. Unfortunately that will reflect in marks. I had also assured you that as long as you work diligently, nobody in this course will fail, and in fact there will be enough chances: with better work you can better your grades. The only expectation I had stated is that you work on your own, except when you are asked and advised to work in groups, as in group projects. All assignments, quizzes, mid-sem exams, etc. are supposed to be individual work. Unfortunately it has been noticed that some assignments have been copied. There have been some tricks, and such tricks have percolated very quickly. What we did not realize is that in your home directory, if you just say cd .. you get to the top-level directory and you can see all students' home directories. Then you can go to any directory and look at whichever assignment you like. You may be tempted, because you have struggled, you have not got it right, and you don't have much time, so you may copy it, test it and submit it. 
The moment you do that, you immediately get a fail grade. Remember, I had very specifically stated that this rule shall have no exception. Unfortunately, if I had to fail 10 percent or 20 percent of this class in the middle of the semester, I would hate it. I will feel very sad, and of course those who fail because of such a thing will feel sadder. We are setting up an effort to automatically compare all the submissions that you have given. As we speak, our TAs are picking up open source software, some tools, and writing some programs to do that. We will complete this work in about 10 days' time. Once we find all the instances of identical code portions (these need not be complete assignments; significant portions count, and I shall be the decider), I shall notify all the concerned students and the academic office. Of course, if there are five identical copies, it is quite likely that at least one is original, because without an original there could not have been copies, and naturally our attempt will not be able to discover which is the original. So we will allocate fail grades to all five. We expect the person who wrote that assignment originally to come back immediately, shouting at us, saying, no, I have done this. In such a case that person will have to provide convincing arguments of his or her authorship of the program, because it is not unlikely that the original author picked up that program from some other source on the internet. I need to be convinced that the effort is original engineering. If this does not happen, people will get a fail grade. Now, I thought a lot about it. The last two nights I have practically not slept. This night I have not slept at all. Last evening I discussed it with my TAs. They did say that they agree with this philosophy of punishing malpractices, but they thought that because these are first year students, they could have yielded to a temporary temptation and may not really have meant anything bad. 
I am not convinced of that, but I bought that argument and I am providing the following concession. Since this is likely to be your first exposure to such an expectation, that the work must be strictly your own, some may have been tempted to take recourse to copying. Such students may, on their own, submit an apology along with the following information: your roll number, name and lab batch, the lab assignment number for which program code was copied, whether partially or completely, and the source. This information should be written on plain paper, and the paper should be kept in a sealed envelope and submitted to any senior TA on duty in the lab, latest by 20:30 hours on Tuesday 13th October. Tuesday 13th October is roughly when I expect our detection effort to be complete. All such students who render an apology giving details will not get a fail grade, provided they do something more. What is that something more? They will be given an additional 2 weeks to submit these assignments afresh. Please understand that the purpose of the assignments is that when you try them you learn something. When you copy, you at most read that assignment; you learn only partially. It is my endeavor, and it should be the combined endeavor of all of us, that each one of us learns as much as possible. That is the objective of this course. So therefore all such students will be given an additional 2 weeks to complete these assignments. The apology letter is to be followed by the completed assignment submission. The assignments need not work correctly; that is fine. I have always said you can submit a partial assignment. You may not get one mark for that assignment; you may get half a mark. In this particular case, because these have been copied, the marks allocated to that assignment will be 0. The total marks are not very many, by the way. 
I have already removed 5 marks out of the 10 marks for assignments, so there are 5 marks in total. For those 5 marks the best 5 assignments will be counted, and this is already the 7th assignment or something, so I don't see any reason why all of you should not be getting 5 marks. But if any one of you has got 5 marks out of this kind of mischief, then you will get 0, at least in those assignments which you have copied, provided you apologize, inform us what you have copied, and do the assignments. But anybody who either does not submit this apology, or submits the apology but does not complete the assignments in the additional 2 weeks, is guaranteed to get a fail grade. We will find it out ourselves by Tuesday night; we will be working past midnight to find out. By Wednesday morning all such people whose copies are caught should expect an email from me. Either there should be an apology letter with me, with an undertaking that you will complete the assignment, or on Wednesday morning you get a fail grade. Because I am giving this concession, this time I will not stop with a fail grade. I will make a recommendation to the director of the institute, because giving a fail grade is my prerogative but doing anything beyond that is the job of the senate. I will inform the senate chairman that, in spite of this warning, people have not been forthcoming and have not apologized, and that stricter action, such as expelling such people from the institute, should be considered. I am very sorry to say this, but I expect all of you to be great leaders in your own fields when you pass out, and one of the most important aspects of leadership to be learned now is basic ethics. If you can't follow basic ethics, you are unlikely to do great things in life. Whether you want to do great things in life is your prerogative, but in my course you will not do this nonsense. Everybody who has done that will get a fail grade, now or even later, for anything else. 
This is one thing I cannot tolerate. I am sorry to have taken a lot of your time. That's it. Thank you.