Welcome back. There are several questions in the chat session, so I will answer those first and then we will get back to the material.

One of the questions is about join. I have just switched to the IPython interpreter. Say I have ' '.join('hello'); notice that there is a space between the first pair of quotes. Without giving it a list, that will give you a space in between every single character of the string. The reason is that the sequence given to join is not a list. Instead, the string is a sequence itself, so join puts the space that is supplied here between each of its characters. Otherwise, I do not think there is any issue in what you are facing. I mean, if you call join with a bracketed list containing hello world, yes, you are correct, but if you join a bare single string, it is not going to behave the way you explained here. So I suspect you are typing something erroneous. There is no way for you to get hello world with those spaces unless you do something like this. Maybe you forgot to put a quotation mark somewhere or a comma somewhere. If not, there is definitely a mistake in the output you reported, and everything counts. In any programming language, if you miss a comma somewhere or miss a quote somewhere, everything is going to change. So please be careful when you are reporting any errors.

The next question is that the answer for range(5, 2) is empty. Obviously. What is the range syntax? Typing range? clearly says it takes start, which is optional, stop, which is not optional, and step. If we give it just two arguments, you are saying start at 5 and end at 2. How will it produce anything? It does not produce anything. No, you cannot assume Python will swap the two values for you. There has to be a unique mechanism by which you are able to identify which value is which argument.
When you give arguments without naming them, as in range(5, 2), then 5 and 2 are what are called positional arguments, and order matters there. You cannot assume that one will be stop and the other will be start. If you supply only range(5), then stop will be 5. But if you supply range(5, 2), it means start is 5 and stop is 2. If not, there would be no unique mechanism by which I can identify what is going on. In Python you will almost always find that there is one unambiguous mechanism by which I can identify what the code is going to do. Therefore, when you do range(5, 2), it is taking 5 as the start and 2 as the stop.

I think there is one more question: is there any IDE for Python? There are plenty of IDEs for Python. You can use Vim, you can use Emacs, you can use Eclipse, you can use something called Spyder, you can use SciTE; there are many, many editors. There is Wing IDE, there are commercial IDEs, there are free IDEs; all kinds of editors are available. I typically use Vim, and I used to use Emacs.

If that is all the questions, back to the participant from SDCEK. I think you should ask somebody locally there, because it is not clear to me that you are typing exactly this code. If you give ' '.join a list that contains hello world as a single element, then what you are getting is correct. When you join a list with one element, yes, that is the result. So there is obviously some mistake in what you are doing. See, for example, if you look at the shell, I have a space here, then join, then hello and world as two words in the list, and I get this. But if, for example, I forget this comma, that will not give the expected result either. If I put it back, I get this. So maybe you missed a bracket, or a space, or a comma, or something like that.

Okay, there are some in-person questions; I will take them now.
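As a hedged illustration of the two answers above, here is a small sketch. It is written for Python 3 (so print is called as a function and range is wrapped in list() to show its contents), whereas the session itself uses Python 2. The variable names are made up for this example.

```python
# Joining a list of words puts the separator between the words.
words_joined = " ".join(["hello", "world"])
print(words_joined)                      # hello world

# A bare string is itself a sequence of characters, so the separator
# goes between every character.
chars_joined = " ".join("hello")
print(chars_joined)                      # h e l l o

# Without the comma, Python glues the two adjacent string literals
# together, so the list has a single element and nothing is inserted.
one_item = " ".join(["hello" "world"])
print(one_item)                          # helloworld

# range(start, stop): positional arguments are matched in order.
empty = list(range(5, 2))                # start=5, stop=2: nothing to produce
counting_down = list(range(5, 2, -1))    # a negative step counts down
print(empty, counting_down)              # [] [5, 4, 3]
```

If the goal really is to count down from 5, the step has to be stated explicitly, as in the last call.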
PSG College, Coimbatore, you have a question. The question is: with print we are able to print one output per line; is it possible to print several outputs on one line, and what is the method? Okay, let me ask you a question. What do you mean by several outputs on one line? You want to put several quantities on one line, is that it? So, for example, I can do print 'hello', 'world'. Or let us say a = 1, b = 2, c = [1, 2, 3]. I can say print a, b, c. I can do that.

But if you want it formatted in a particular fashion, you will have to construct a string. I do not know if we explicitly have slides for that, but I will show it now. If you are familiar with C, you can use %s and %d; those are all pulled from C's syntax. If you use %s, it will try to represent the quantity as a string. So I write the format string, then I put a % after it and give it whatever arguments, and it will format those quantities into the string. For example, I am giving it a, b, q, where q is, say, 'hello world'. Now if I do '%s, %s, %s' % (a, b, q), it will represent each value as a string and put it in place.

The other option is another built-in called str, which will convert any argument you give it to a string. So if I give it q, it will give a stringified representation of that quantity q. For example, I can say print a, b, str(q). That will also work. So there are many ways to do it. You can either reduce everything to one string and print that string, or you can give print whatever arguments and it will print them the best it can. Does that answer your question?
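The options just described can be sketched as follows. This is written for Python 3, where print is a function; in the Python 2 used in the session the same calls are written without parentheses. The values of a, b and q are only illustrative.

```python
a, b, q = 1, 2, "hello world"

# Option 1: hand print several values; it separates them with spaces.
print(a, b, q)                       # 1 2 hello world

# Option 2: build one string with C-style %s placeholders, then print it.
line = "%s, %s, %s" % (a, b, q)
print(line)                          # 1, 2, hello world

# Option 3: convert values explicitly with str() and combine them.
combined = str(a) + " and " + str(b)
print(combined)                      # 1 and 2
```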
I will ask the second question in continuation of the first question itself. Suppose we have the syntax for i in range(1, 10) and we have to print the numbers 1 to 10 continuously on one line. Then what you do is say for i in range(5): print i, with a comma at the end. If you give it a trailing comma, it will not put the newline. Otherwise you have to construct a string. So if you want to do something fancier than this, you construct a string, or you save the output: you put i into some other data structure and then you generate a string. For example, you can add these i elements to a list, then join them and print that string. So there are many strategies that you can employ.

Srinagar has a question. The question is: is there room for post-increment while you are inside the loop? Can we post-increment a variable? Are you saying a++ or something like that? There is no ++ operator, but there is +=. So you can say i += 1 or i += 2. Oh, you want to print the value and then increment in the same expression? No, you do not do that. Firstly, doing this post-increment and pre-increment is bad programming in my opinion. In C there are many things you can do which are sometimes not very good programming practice. For example, take this whole notion of post-increment and pre-increment: suppose you catch a beginner programmer and ask him what i++ or ++i does. There are lots of problems with that. Even if you are looking at it in terms of performance, there will always be the question, should you do this or should you do that? So, to answer your question directly, there is no post-increment operator in Python. The way you do it is to say explicitly what you want: you print i and then you do i += 2. So, the idea is, you do not have a post-increment operator.
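Both answers above can be sketched in a few lines. This assumes Python 3, where the trailing-comma form print i, is spelled print(i, end=" "); the names pieces, numbers_line and seen are invented for the example.

```python
# Print 1 to 10 on one line by suppressing the newline after each value.
for i in range(1, 11):
    print(i, end=" ")
print()                               # finish the line

# Alternative: collect the values as strings, join them, print once.
pieces = []
for i in range(1, 11):
    pieces.append(str(i))
numbers_line = " ".join(pieces)
print(numbers_line)                   # 1 2 3 4 5 6 7 8 9 10

# There is no i++; use the value first, then increment explicitly.
i = 0
seen = []
for _ in range(3):
    seen.append(i)                    # use the current value ...
    i += 2                            # ... then bump it in a separate step
print(seen, i)                        # [0, 2, 4] 6
```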
See, in C the need for all of these tricks often comes from performance, whereas in a language like Python the thing you are focusing on is clarity. So anything that is ambiguous is usually not encouraged.

PSG has one question. Yes, PSG College, Coimbatore. I got the question. You want to know about extension modules written for Python, is that correct? C extension modules are not going to be covered in the duration of this course. It is not a trivial subject, and I do not want to spend too much time on it right now. Let us just focus on the language. There is a lot of information available online, and there are many, many ways to create C extensions. You can write hand-coded extensions in C using the Python C API, or there are other languages like Cython which will allow you to create extensions very easily using Python-like syntax. But unfortunately, to be able to do that, you will first have to learn Python, which is what we will do first. Later on, just go to cython.org. It will show you how you can create extension modules for Python, link them to C and things like that. But we will not be considering that right now.

Okay, I think that is all we have for questions, so let us move on to lists. Creating an empty list is the first thing you need to learn. It is a very useful thing, and many people forget how to do it. It is trivial to create a list that is empty: just put an open bracket and a close bracket, []. That is all there is to it. Similarly, we have seen many examples in the past of creating lists with a certain number of elements. So we have, say, p is 1, 2, 'spam', 'eggs'. Notice that the list can have various types of quantities. It is not restricted to one type. So, I can have integers, lists, floating point numbers, strings.
If you look at the second example, q, you can clearly see that q contains a list itself. A list can contain another list. So lists are basically heterogeneous collections of elements, and these elements can be anything. A list could also be empty, without any elements.

So how do we access elements of a list? Just as for any other sequence. You give indices starting with 0 and going all the way to the length minus 1. Anything beyond that is an IndexError. So if you try p[40], it is going to give you an IndexError. Negative indices, again, we have spent a lot of time discussing. They behave the same way as I showed you for strings: -1, -2, -3, -4. Similarly, if I do p[-40], it is also an IndexError, because the index has to be inside that particular list. This is common to all kinds of sequences. So, in summary, indices start from 0, they can be negative, and they must be in the valid range. To obtain the length of a sequence, again, you use the len keyword; sorry, the len built-in. So len(p) in this case is 4.

Now, unlike strings, lists are mutable, which means I can change a list in place. How do you find out all the methods of a list? Remember what p is; type p. and press Tab. Look at all the methods that do not have an underscore: p.append, p.count and so on. Let us look at append. p.append? says: append object to end. So if I call p.append with a value, p now has one more element at the end. I could also call p.append with a list, and now p has that list itself as its last element.

Now, to remove an element, I can use del. Let us say I want to remove the last element: del p[-1]. Or if I want to remove the second element, I can say del p[1]; this will remove the 2. Or I can again say del p[-1], and now that element is gone. Note that del is a keyword.
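The list basics covered so far can be tried out with a short sketch like the one below. It assumes Python 3 syntax for print and for catching the exception; the contents of p and q are illustrative stand-ins, not necessarily the exact values shown on the slides.

```python
empty = []                              # an empty list
p = [1, 2, "spam", "eggs"]              # mixed types are fine
q = [[4, 2, 3, 4], 1.5, "text"]         # a list can contain another list

print(p[0], p[-1], len(p))              # 1 eggs 4

# Indices outside the valid range raise IndexError.
try:
    p[40]
except IndexError as err:
    print("IndexError:", err)

# Lists are mutable: append adds exactly one element at the end.
p.append("extra")
p.append([1, 2])                        # the whole list becomes one element
print(len(p))                           # 6

# del removes by position.
del p[-1]                               # drops the nested [1, 2]
del p[1]                                # drops the 2
print(p)                                # [1, 'spam', 'eggs', 'extra']
```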
But if I want to remove something by value, I can do p.remove. It will remove the first occurrence; it will not remove all the occurrences. To find out more, p.remove? says: remove first occurrence of value, raises ValueError if the value is not present. So if I call p.remove with a value that is not in the list, it will say ValueError, x not in list. So adding an element is as simple as using append. For removing, you can either use the del keyword or you can use remove itself.

Lists in addition have many other methods. There is count. For example, if I say p.count('spam'), it says there is one occurrence of 'spam'. Then p.extend. To extend I can give a sequence, and it will add all the elements of that sequence to p. For example, if I do p.extend('hello'), it will add h, e, l, l, o as individual elements. So when you say extend, it takes every element of the sequence you provide and appends it. If you say append, it simply adds that one element.

Then we have p.index. Suppose I want to see where 'spam' is, or where 'h' is; it will tell me the position of the first occurrence. It takes optional arguments called start and stop, so that it searches only between start and stop. If you do not specify them, it searches the entire list. So p.index will tell you where the value is, and if the value is not found, it will raise an error.

There is pop, which will remove and return the item at an optional index. If you do not specify the index, it takes the last element. So if p is this and I say p.pop(), it removed 'o' and returned that value. Remove we have already seen. Reverse will reverse the elements of the list in place: p.reverse() will not return a new list, it reverses this same list. p.sort() will sort the list, again in place. It will not return anything.
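Here is a small sketch of the methods just listed, assuming Python 3. The starting list is invented for the example, and a separate all-string list is used for sorting because Python 3 refuses to compare numbers with strings.

```python
p = [1, "spam", "eggs", "spam"]

p.remove("spam")                  # removes only the first occurrence
print(p)                          # [1, 'eggs', 'spam']

try:
    p.remove("missing")           # value not present
except ValueError as err:
    print("ValueError:", err)

spam_count = p.count("spam")      # 1
p.extend("hello")                 # adds 'h', 'e', 'l', 'l', 'o' one by one
where_h = p.index("h")            # position of the first 'h'
last = p.pop()                    # removes and returns the last item
print(spam_count, where_h, last)  # 1 3 o

p.reverse()                       # in place, returns None
print(p)                          # ['l', 'l', 'e', 'h', 'spam', 'eggs', 1]

letters = ["b", "c", "a"]
result = letters.sort()           # in place, so result is None
print(result, letters)            # None ['a', 'b', 'c']
```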
The sort method uses a default ordering, and you can control the sorting by giving it an optional function. For more details on that, I would suggest you look at the documentation. Just do p.sort? and it will tell you. If, for example, you sort with reverse=True, it will sort in descending order rather than in ascending order. My intention is not to cover all the methods of lists; the intention is to show you that there are various methods. The easiest way for you to learn is to create a list, experiment with it in the interpreter, read the documentation in the interpreter and explore by yourself. Then you will understand exactly what each method does. I will only cover the important methods that are most commonly used.

Now, earlier we saw that there are certain simple operations you can do with lists. Let us say a is range(5) and b is another range; a is this, b is this. Now if I do a + b, it concatenates the two lists into a new list. For example, if I say c = a + b, then c is this and a is this. Now if I say a.append(5), c is still unchanged. Which means c is a fresh list; it has nothing to do with a and b, except that it has a copy of all of their elements.

Slicing and striding, again, are very similar to what we did for strings. I will go through this very quickly, asking the participants here to answer, and I will assume that at your end also you are able to follow. We will just go through the examples to refresh your memory once. So let me define primes; I will first clear the screen. What is primes[4:8]? What will it be? Starting from 11; let us see. Correct. What is primes[:4]? 2, 3, 5 or 2, 3, 5, 7? How many of you say 2, 3, 5? Remember, these are indices. 4 is the stop index, so it goes 0, 1, 2, 3. So you will have 4 numbers.
So primes[:4] gives the first 4 elements of that list. That is one way of thinking of it. Now num is range(12). What is num[1:10:2], that is, 1 to 10 in steps of 2? 1, 3, 5, 7, 9. What is num[:10]? 0 to 9. What is num[10:]? Not 9, 10, 11, 12; it is 10, 11. Remember, num is 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11. What is num[::2]? All the even numbers. And finally, what is num[::-1]? This is the list in reverse order. Very good. We have covered this in great detail.

Sorting, again, I hinted at when going through the methods. Let me create a as 5, 1, 6, 7, 7, 10. Remember, a.sort() does not return a new list. So a.sort() gives us nothing, and a itself is now changed to 1, 5, 6, 7, 7, 10. Now, if you want to get a new list, there is a very convenient built-in function called sorted. So if a is this, sorted(a) will return a new list that is sorted. And note that sorted will work for any sequence. For example, sorted('hello') will give me a list of the characters of hello sorted in ASCII order, which means upper case letters come before lower case ones. If you want it to be case insensitive, you have to make it so yourself, by changing everything to lower case or whatever. Similarly, if I give it a tuple, it is going to return a list. So remember that sorted always returns a new list in sorted order. And if you check the documentation by doing sorted?, it will tell you that you can also configure the sorting order by giving your own comparison function, or a key, or by asking for reverse order. So if I say sorted(a, reverse=True), it will give me descending order and not ascending order.

Next we look at reversing. a.reverse() gives the same ordering as a[::-1], except that reverse does the reversing in place, which means it is not going to create a new list.
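For anyone who wants to check the quiz answers at their own prompt, here is a hedged sketch assuming Python 3, where range has to be wrapped in list() to get a real list. The primes list is assumed to be the first ten primes; the slide may have used a different length.

```python
primes = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]   # assumed contents
num = list(range(12))                            # 0 .. 11

print(primes[4:8])     # [11, 13, 17, 19]
print(primes[:4])      # [2, 3, 5, 7]  (indices 0, 1, 2, 3)
print(num[1:10:2])     # [1, 3, 5, 7, 9]
print(num[10:])        # [10, 11]
print(num[::2])        # [0, 2, 4, 6, 8, 10]

# sort() changes the list in place; sorted() returns a new list.
a = [5, 1, 6, 7, 7, 10]
descending = sorted(a, reverse=True)             # a is untouched here
a.sort()
print(a, descending)   # [1, 5, 6, 7, 7, 10] [10, 7, 7, 6, 5, 1]

# sorted() accepts any sequence and always returns a list.
print(sorted("Hello"))                           # ['H', 'e', 'l', 'l', 'o']
from_tuple = sorted((3, 1, 2))                   # [1, 2, 3]

# Slicing with -1 builds a reversed copy; reverse() works in place.
reversed_copy = a[::-1]
a.reverse()
print(a == reversed_copy)                        # True
```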
The reverse method simply reorders all of the elements in the same list. However, when I do a[::-1], it creates a new list, a reversed copy of the list.

Before we go to IO, somebody is asking for an example of nested looping, so I might as well do that. It is straightforward. Let me define a nested for loop: for i in 'x', 'y', then inside it for j in 1, 2, and inside that print i, j. This is a simple, trivial nested loop. You can keep on going, so I can add another for loop over k inside. You can clearly see that the innermost loop changes fastest. If you look at this example, it first iterates over all the k values, then moves to the next j, then to the next i, and keeps repeating, so i stays at x while j and k run through all their values and only then moves to y. You can nest as deep as you want; there are no issues there. Similarly, you can put in conditionals and while loops wherever you like. They are just like in any other programming language, very straightforward.

We will move on to IO. We have looked at part of input and output so far through simple forms of printing; we have just done print of some variable. Now there is something you need to notice, which is mentioned in the slide. When I type a by itself, it gives me a different type of output from when I do print a. When I just type a, it simply shows the value, and this is particular to the interpreter: when I start up IPython like this and type a, it shows me the value. But if I do this in a program and execute the program, just typing a is not going to give you anything, unless you explicitly print it. When you print a, it is going to print the value of a. So typing a bare a like this only works in the interactive interpreter. If you run it as a standalone Python script, which you will see shortly, none of this will happen. You will not get any of the outputs.
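Going back to the nested-loop question, the example can be sketched like this, assuming Python 3. The third loop variable k and its values are invented to show that the innermost loop changes fastest; the order list is only there so the sequence can be inspected afterwards.

```python
order = []
for i in ["x", "y"]:
    for j in [1, 2]:
        for k in ["a", "b"]:
            print(i, j, k)              # k changes on every line
            order.append((i, j, k))

# The first four lines all have i == "x"; j changes every two lines.
print(order[:4])
```

The printed lines start x 1 a, x 1 b, x 2 a, x 2 b, and only then does i move on to y.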
If you want a value to appear in the output of a script, you have to explicitly print that value. And one thing while I am at it: right now I can say print a, but in Python 3 this has changed. Print is no longer a keyword; it is instead a function, and you have to write print(a) like that. Recent versions of Python 2 also accept this form, so that you are able to migrate your scripts. So it is a good idea, even right now, to write print(a) like that. This is better illustrated in the second example, where b is a string that contains a newline. Typing b shows the string with the newline character in it, but if I do print b, it will actually print the text along with the line break.

Then we have string formatting here; I will do this in a little while. There is a question now: how to insert a value at any index in the list? Please look at the methods of the list and decipher it for yourself. I suggest you just create a list; q, I think, is a list. q.insert is the method to use, and I would suggest you learn how to use it by yourself. So please, all of you, when you are trying things out, explore by using tab completion and look at the methods of the object.

String formatting: some of this I had answered when I was answering a question that was posed earlier. Basically, if you have a bunch of variables and you want to do some kind of formatting, you can say x is %2.1f, meaning I want to print a floating point number with a minimum width of two and one place after the decimal point. This is all very much like C. %d is an integer, like in C. %s, I think, tries to give us some kind of string representation of z; in this case z happens to be a string, so it works out perfectly fine. To see what this does, I would suggest that you try it out yourself, and if you are not clear about %d, just note that it is very similar to C style.
The only difference from C is that with printf the format string is given as the first argument and the values are given as the subsequent arguments. Here, on the other hand, you use the % operator, which feeds the values in at the appropriate locations. The other thing, which again I have explained to you, is the difference between print without any comma at the end and print with a comma at the end. When you put a comma at the end, it suppresses the printing of the newline.

But this slide also introduces us to writing Python programs. So far I have only typed Python code in the interpreter. So what I will do now is quit the interpreter. I am now in my home directory, and I will create a little Python file. Please do not use an editor like Word or WordPad. Use a text editor, something like Vim or Emacs or gedit. I am going to create this file, put some print statements in it, and just for kicks I will also add a few bare expressions. I would suggest you write something similar, but if you just write what is given on the slides, that is fine. So let us look at my version. I have created a as a string and b as another string, and I am just typing a, b on a single line, just like I would normally have done in the Python interpreter. Then I print this, print a, print b, print a with a trailing comma, print a word, and print this again. Then you save the file. If I cat the file, I get this. Now let us run this Python script: python print_example.py. Notice that it did not print the a and b that I had typed as a bare expression; nothing above this line has come out. Also notice that because I said print a with a comma, it just inserted a space instead of a newline. This you have seen before; I have already shown you this. So let us move on. How do you get input?
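Before moving on, the script behaviour and the % formatting described above can be sketched together. This is written as a Python 3 script (so print takes parentheses and end=" " replaces the trailing comma); the values of x, y and z are made up for the example.

```python
a = "hello"
b = "world"

# A bare expression: echoed at the interactive prompt, silent in a script.
a, b

print(a)                    # explicit print always produces output
print(a, end=" ")           # Python 3 spelling of the Python 2 `print a,`
print(b)                    # continues on the same line: hello world

# C-style formatting: the % operator feeds the values into the template.
x, y, z = 3.14159, 2, "zed"
formatted = "x is %2.1f, y is %d, z is %s" % (x, y, z)
print(formatted)            # x is 3.1, y is 2, z is zed
```

Saving this to a file and running it with the python command should show three lines of output, with nothing printed for the bare a, b line.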
Usually it is not a great idea to get input from the user in this fashion. There are several reasons for that, and I will not get into those, but if you do need to do it, there is a function called raw_input. It is a built-in. So let us try this raw_input. Notice that I use tab completion at various places. Now it is waiting for me to type something, so I will type some nonsense. Now let us see what ip is: ip is the string of whatever I typed.

If you go to the next slide, let us try something else. Notice that raw_input can take an optional argument, which is a prompt. Now you notice it is giving me a prompt. So let me enter 3.14. Now let us see what c is. Notice that c is a string; it is not a floating point number. So raw_input will always return a string, and it is your headache to convert that into a number.

How do we convert this to a number? I can say float(c), and it will convert it to a floating point number. Or I can say int(c), which will give me an error; it does not know how to convert '3.14' to an integer. However, can I get 3 from c? Remember, c is a string. int(c) does not work, but I want just the 3, because 3 is the integer part. Can I not just take the first character? c[0] gives me the first character, which is '3', and int(c[0]) converts that to an integer.

So I will repeat: raw_input only returns a string. It always gives a string, and it is your headache to convert. If it is a floating point number, fine. But if I do float on the nonsense I typed earlier, this is an error. It will say ValueError, invalid literal for float, just like '3.14' is an invalid literal for an int. So when you are doing this, you have to be a little careful. But if you know that the user is giving an integer, you can always convert it from the string to the integer.
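The conversions above can be tried without typing at a prompt by using string literals in place of what the user would enter. This is a sketch for Python 3, where the function is called input rather than raw_input but likewise returns a string; the variable names are illustrative.

```python
c = "3.14"                      # stands in for the string the user typed
ip = "some nonsense"            # stands in for non-numeric input

as_float = float(c)             # 3.14
print(type(c).__name__, type(as_float).__name__)   # str float

try:
    int(c)                      # '3.14' is not a valid integer literal
except ValueError as err:
    print("ValueError:", err)

first_digit = int(c[0])         # first character '3' converted to 3
whole_part = int(float(c))      # alternative: convert, then truncate
print(first_digit, whole_part)  # 3 3

try:
    float(ip)                   # text that is not a number at all
except ValueError as err:
    print("ValueError:", err)
```

Taking c[0] only works when the integer part has a single digit; converting through float first is the more general route.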
The next important thing is files, so let us look at files. In the IPython session, if you type pwd, it will tell you your current directory. IPython conveniently interacts with the underlying shell. For example, if I type ls, it will list the current directory. If I type !ls, it will actually execute ls in the shell underneath and print the output. For example, I can say !vim print_example.py, and it actually ran Vim for me. So IPython gives you some conveniences, and one of those conveniences is cd. Many of you may already have the file pendulum.txt. If you have pendulum.txt, cd to that directory, and now let us see how we can open a file.

Given a text file, you can create a file object by saying open with the file name. Now f is a file object; it says open file pendulum.txt, mode r. Let us now see how we can read the contents of this file. If I want to read the entire contents of the file into some variable, I say pend = f.read(), and pend is simply a string containing the entire file. So I can say print pend, and that is the entire file. Now that I have read the contents, it is a good idea for me to close the file, so I say f.close(). If I type f now, it says closed file, because I have closed the file explicitly.

Now we are left with this pend that we have, which is the string. I want to make this into a list of lines. There is a method called split that we have already seen. There is also a method called splitlines; it is just a convenience. So I can say pend_list = pend.splitlines(). This will return a list of strings, and you see that every line is a separate element of this list. If I take len(pend_list), it should match the number of lines. Let us check: wc -l on pendulum.txt also says 90 lines. So all of the 90 lines have been read into this list. I will summarize all the commands I ran. We open the file.
First, we create a file object by calling open. Then, after the read, pend contains all of the data. Then, after splitlines, every line of the file is available to me as an element of the list. So now I can do whatever processing I want.

However, a more common theme is to read the file line by line, and Python provides some very nice conveniences to deal with that. One very simple way to read the file line by line is to do for line in open('pendulum.txt'): print line. Notice that it has printed the lines of the file. The only thing you have to remember is that it has printed an additional newline after each line. The reason is that print adds a newline after every quantity you give it, and each line already ends with a newline. So it is printing two newlines.

So what are the key things we have to learn? Let us look at the slides. We did for line in open('pendulum.txt'), colon, print line. The important thing to realize is that a file object is something that is iterable, which means for knows how to iterate over it. So think of it as behaving like a list, although it is not a list; it is a file. As we iterate over it, each line is obtained as line, and you can print it and process it. You can do anything you want with the line here. For example, instead of printing all the contents, I could append each line to a list. So let us try that. Again I have read the entire contents of this file, this time into a list. It has 90 lines. Let us print the first 5 lines. How do I get the first 5 lines? lines[:5]. Those are the first 5 lines. Notice that each line has a newline at the end. So anyway, now we have the list of lines of the file.

Now, often you are given a set of files and you want to process these files and the data contained in them. Typically, each line of that file will contain some records.
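The two ways of reading a file described above can be sketched as follows, assuming Python 3. Since the course's pendulum.txt is not reproduced here, the sketch first writes a tiny stand-in file to a temporary directory; its name and contents are invented for the example.

```python
import os
import tempfile

# Create a small stand-in data file (hypothetical contents).
sample_text = "1.0 2.0\n1.5 2.5\n2.0 3.1\n"
path = os.path.join(tempfile.mkdtemp(), "pendulum_sample.txt")
f = open(path, "w")
f.write(sample_text)
f.close()

# Way 1: read everything as one string, then split into lines.
f = open(path)
pend = f.read()
f.close()
pend_list = pend.splitlines()        # no trailing newlines on the items
print(len(pend_list), pend_list[0])  # 3 1.0 2.0

# Way 2: iterate over the file object line by line.
lines = []
f = open(path)
for line in f:
    lines.append(line)               # each line keeps its trailing "\n"
f.close()
print(lines[:2])                     # ['1.0 2.0\n', '1.5 2.5\n']
```

The second form never holds more than one line at a time unless you choose to store them, which matters for large files.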
So here is a simple problem: we want to do some calculations on data that is available in a file. Our file contains lines, and every line is of this form. It contains a region code, in this case A; it could be A, B, C, D or something like that. It has a roll number. It has a student name. It has the marks of the student: I think first language mark, second language mark, mathematics, science, social studies and the total mark. Then it says whether the student passed or failed, and the last field says whether the grade was withheld or not. If the student was absent for a particular paper, the mark is given as AA.

Now suppose I want to calculate the mean of the mathematics marks obtained by students in region B. This is a reasonable script that I would be expected to write, so how do we do it? We look at this systematically. The first thing is that I already have the code to read files; given this text file, I can easily read it. Now I have to process it. Before we do that, we need to understand how to parse the column elements from a line. Let us imagine that one of our lines looks like this. If I do line.split(), what do I get? I get this line parsed into words separated by spaces.

What is the line we are given in our file? It is typically A, some roll number, Joseph Raj S, some marks like 83, some total like 195, and fail. So this is a typical line that we have in our file. Let me see if the file is here. We have the file here, so let us take one line. Oh sorry, this is the whole file. Let us assume that this is the contents of one line. I now have to process this into fields. I need to know the student name, the region code, the marks in this subject, the marks in that one. All of this I need to find out, because all I have is this string. I need to tokenize it into the various columns or fields that we have.
If I do line.split() now, it is useless, because by default it splits only on whitespace, and I know that the delimiter in my file is a semicolon. So I say the separator is a semicolon: line.split(';'). Now I have gotten all that I want. I have the region code as the first element of that list, then the roll number, the name, the first language mark, the second language mark, then I think math, science, social studies and the total.

My problem is to compute the mean of the mathematics marks in region B. So which one is the math mark? It is the third mark. Indices 0, 1, 2 are the region, roll number and name, and then the marks are 1, 2, 3. So 71 is the mathematics mark. Is this clear? Pay a little attention here. This is my line: region, roll number, name, first language, second language, mathematics. So 71 is the math mark, and in this list that is index 0, 1, 2, 3, 4, 5. So the element at index 5 of that list is the mathematics mark. From this line of text I have now been able to identify and get that mark. The only problem is that the mark is a string, and I have to add the marks. If I add strings, I will get a longer string. I need to add them as integers or floating point numbers. How to do that comes next.

One of the things you need to worry about is that when we split on the semicolon, spaces are not removed. So if I had a line where the region code had a space around it like this, my first element would have a space, which is not what I want. Strings have a method called strip. word.strip() will remove all the trailing and leading spaces. In this case, if I do word.strip(), it will not remove the space in between 'a' and 'word'. It will remove anything that is leading and anything that is trailing. So essentially strip returns a new string without the leading and trailing spaces.

So now we have enough machinery to parse the fields. What we need to do next is convert our strings to numbers. The math mark is a string.
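A minimal sketch of the parsing steps just described, assuming Python 3. The record below is a made-up line in the same semicolon-separated layout (region; roll number; name; five marks; total; result); it is not taken from the actual data file.

```python
# Hypothetical record with a stray space before the region code.
line = " A;0001;SAMPLE STUDENT;80;45;71;60;55;311;P;"

# Splitting on whitespace is not useful here.
whitespace_split = line.split()
print(whitespace_split)          # ['A;0001;SAMPLE', 'STUDENT;80;45;71;60;55;311;P;']

# Splitting on the real delimiter gives one item per field.
fields = line.split(";")
print(fields[0], fields[2], fields[5])

# Splitting does not trim spaces; strip() removes leading/trailing ones only.
region_code = fields[0].strip()              # 'A'
word = " a word ".strip()                    # 'a word' (inner space kept)

# Index 5 holds the mathematics mark, still as a string.
math_mark_str = fields[5]
print(repr(region_code), repr(math_mark_str))   # 'A' '71'
```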
So, if I have mark_str = '1.25', then float(mark_str) will work. int(mark_str) is not going to work on '1.25', so you will have to make it a floating point number. In this case all the marks are integers, as you can see if you look at the data, so int will work. Let us look at a quick example. Let us shift to the terminal. I say mark = int(mark_str), and then look at type(mark_str) and type(mark). So, I have converted the string value to an integer value. These integers I can add up, divide the sum by how many there are, and I get the average. So, now we have all the machinery to find the sum of the marks and calculate the mean.

How will I approach it? First, I need to read the full file. How do I read the file? Before we get to the solution, let me recap what we have done so far. The way you open a file is, you simply say open(filename). You can get the data as a whole, in one big shot, as one massive string which you can then process, or you can say for line in open(filename) and process it line by line. If you want to take all the lines and put them into a list, you start with a list, line_list, and say for line in open(filename): line_list.append(line).

Now, we have this parsing problem. We have each line, and given a single line, I can split it into its various fields. That is what I want. I want to be able to say this is the region, this is the name, this is the mark, and then pick out region A or region B and collect all the marks. So, I will first find the fields. How do I get them? I split. The separator is a semicolon, so I do line.split(';'). The problem may be that sometimes these fields have spaces, so I need to remove the spaces. Otherwise your program will think that region code 'A' and region code 'A ' are different regions. So, you need to make sure that you strip it, because 'A' and ' A' are different strings. I cannot say they are the same region.
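As a minimal sketch of the string-to-number conversion just described, assuming Python 3 (where dividing two integers with / gives a float). The mark values are invented for illustration.

```python
# A mark read from a file is a string until we convert it.
mark_str = "71"
mark = int(mark_str)
print(type(mark_str), type(mark))

# int() refuses a string with a decimal point; float() accepts it.
decimal_str = "1.25"
try:
    int(decimal_str)
    int_accepted_decimal = True
except ValueError:
    int_accepted_decimal = False
decimal_mark = float(decimal_str)

# Convert a few made-up mark strings and average them.
marks = []
for s in ["70", "60", "50"]:
    marks.append(int(s))
average = sum(marks) / len(marks)
print(average)
```

Adding the unconverted strings instead would concatenate them ("70" + "60" gives "7060"), which is why the conversion has to happen before summing.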
So, you use strip to remove that. Then the marks that you have are all strings, and you need to convert these strings to integers. So, you do int of that mark string and it gets you the number.

This is the final solution. The list of math marks in region B, math_B, starts as an empty list. For line in open SSLC.txt, fields is line.split(';'). What are we doing? We are reading every line, and each line we are splitting on semicolons. Then we take the region code as the first element of that fields list. Then we strip it, so that we remove any unnecessary spaces. Then we find the math mark, which is the sixth element, that is, index 5. So, fields[5]. Now, this is a string, so we convert it to a floating point number. If the region code is equal to B, then math_B.append(mark). That is it. After the for loop ends, I say the mean for math_B is the sum of all the list elements divided by the length of the list. sum is a built-in, like len, that is always available.

So, now let us write this code. You must have already downloaded the test files, so you should have this SSLC one. Let us write the code, starting with the empty list for the math marks. Notice that we have chosen the variable names specifically so that it is easy for you to read the code as beginners. Sorry, I was using the wrong file. So, how many entries does this have? It has 40,000 entries. It has finished processing 40,000 lines and has collected all of the math marks. Now, sum of math_B divided by its length: the average math mark was 63.63 in region B.

So, make sure you are using the right data file. Obviously, if you are using some other file, you will not get the same results. Make sure you are also in the same directory. If you have the file in some other directory, you can specify the full path to it. You can say /home/ and so on, or even a relative path, depending on where you currently are.
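The whole procedure can be sketched end to end as below. To keep the sketch self-contained it first writes a tiny made-up data file to a temporary directory; the file name sample_marks.txt, the three records and the variable names are all assumptions for illustration, not the lecture's data file, so the printed mean will not match the figure quoted above.

```python
import os
import tempfile

# Made-up records in the layout described in the lecture.
# The third record has a stray space after the region code.
sample = (
    "A;0001;STUDENT ONE;83;60;71;55;48;317;P;\n"
    "B;0002;STUDENT TWO;70;65;60;58;62;315;P;\n"
    "B ;0003;STUDENT THREE;55;50;80;45;40;270;P;\n"
)
path = os.path.join(tempfile.mkdtemp(), "sample_marks.txt")
with open(path, "w") as f:
    f.write(sample)

math_marks_b = []  # math marks of students in region B
with open(path) as f:
    for line in f:
        fields = line.split(";")          # tokenize on the delimiter
        region_code = fields[0].strip()   # "B " and "B" must compare equal
        if region_code == "B":
            # Index 5 is the mathematics mark; convert string to number.
            math_marks_b.append(float(fields[5]))

math_b_mean = sum(math_marks_b) / len(math_marks_b)
print(math_b_mean)
```

One caveat: a mark recorded as AA for an absent student cannot be converted by float() and would raise a ValueError, so a real run over such data would need to skip or otherwise handle those records. The sample above simply avoids them.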
But the point is, we have written this code and close to 50,000 records, half a lakh, have been processed just like that on the interpreter. So, it is extremely easy to read files, parse them and process them using elementary Python: reading files, processing lists, processing strings, converting them to integers and doing some elementary calculations.

Now, one question that many people will have is, how do you write files? I will just show you very briefly how to create a file for writing. You simply call open; I am just creating some junk file here. Instead of calling it with no mode argument, or with the optional read mode, you say write. If you open it in write mode, this is now an object I can write to. The only thing is, you have to remember to put in the newline yourself for every line, and then finally you close the file. Now, let us look at it: that is your text file, and this is the code that generated it. So, you can read from data files and write data files back, and it is very easy.

So, with this we now enter the domain of functions in Python. But we also have to break for tea. So, I will stop.
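For reference, the file-writing steps mentioned above look roughly like this. This is a minimal sketch; the file name junk.txt and the line contents are invented, and the file is placed in a temporary directory so that nothing in the current directory is overwritten.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "junk.txt")

# Open in write mode ("w"); this creates the file, or empties an existing one.
f = open(path, "w")
for i in range(3):
    # write() does not add a newline, so we put one in ourselves.
    f.write("line %d\n" % i)
f.close()  # closing makes sure the data is flushed to disk

# Read the file back to check what was written.
with open(path) as g:
    lines = g.read().splitlines()
print(lines)
```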