Do you have any questions? That essentially brings me to a close of what I wanted to talk about over yesterday and today. After this I want you to write some code. So, we will pick up a problem and write it, but before that, some introspection and question-asking time.
In the morning session on version control: you open a particular file, you delete one line and write the same line again in the same position. Will it be taken as modified, or added, or?
It will be taken as modified, because the programs do not have any intelligence. They will essentially look at the file's time. Suppose I type the identical thing; we will test it, and we will test it in a slightly different place. So, here is a file A; I copy A to B. I am not using the Mercurial diff here, I am using the Linux command-line utility called diff, but they both operate on the same principles.
In fact, diff is the basis of version control, to take a small historical aside. The whole idea of version control was: how do I store multiple versions without losing space? Today we have gigabyte disks, but in the early days of computers, space saving was as important as anything else. So, somebody came up with the idea: I keep the original file plus the deltas only. I had a 100-line file, so version 2 was: line 4 replaced by this, lines 7 and 8 deleted, at position 12 these two lines added. If I keep only that, both copies now occupy much less space. So, essentially that is how the original Source Code Control System, SCCS, was born. So, it is a series of shell scripts using diff and another thing called patch.
Now, let us edit B: delete the line and type it again. Notice, nothing is reported, because it is doing a textual comparison. So, probably the diff inside Mercurial will do the same thing, because my memory is that that diff and this diff share source code, but test it out at some time. I am leaning to the side that it will be the same. And for the nth time, note the absence of any output when the files are the same.
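
The delta idea and the "no output when nothing differs" behaviour described above can be tried out in Python itself. This is only a minimal sketch using the standard `difflib` module as a stand-in; it is not the code inside the Linux `diff` utility or Mercurial, and the helper name `delta` and the sample texts are made up for illustration. It is written for Python 3.

```python
import difflib


def delta(old_text, new_text):
    """Return the unified-diff lines that turn old_text into new_text.

    Hypothetical helper for illustration; an empty list means the two
    texts are identical.
    """
    old_lines = old_text.splitlines(keepends=True)
    new_lines = new_text.splitlines(keepends=True)
    return list(difflib.unified_diff(old_lines, new_lines,
                                     fromfile="A", tofile="B"))


file_a = "line one\nline two\nline three\n"

# Deleting a line and typing the very same line again leaves the text unchanged.
retyped = "line one\nline two\nline three\n"

# A real edit: the second line now reads differently.
edited = "line one\nline 2\nline three\n"

print(delta(file_a, retyped))          # nothing to report for identical text
print("".join(delta(file_a, edited)))  # only the changed line and its context
```

Storing only the second kind of output, instead of a full second copy of the file, is the space saving the historical aside refers to.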
You ask me, what is the difference? There is no difference. So, I have nothing to say. So, there is no output. Yes, diff would do that. The diff there will do the same thing, because, like I said, the underlying algorithms are the same diff. So, I am guessing, I am still guessing, not telling you for certain, that the same algorithm is in operation in hg diff also and it will do the same thing.
Let us test it out. I am editing this one too: I will delete this line and retype it. Now, I have to save. Let us see: no change. It does not show the file as modified. I deleted two letters and typed in the same ones. So, essentially I have replaced a line by itself, and it does not see it as a change, because it is using the same diff. There is nothing to commit. Commit means my working copy is different from the last committed one; according to it, there is nothing. I just removed the other files, nothing changed, there is nothing to commit.
Well, this is interesting, though it does not add much to our handling of versions. It confirms that the people who wrote these systems have applied a lot of intelligence, that is all. So, that is the interesting point. It is a good quiz question to ask.
Anything else? Rather: next question. There has to be something; "anything else" encourages you to say no.
Here is version one of the code, and you commit two, three times, and then there is a requirement of going back to the first point. I do not think it is possible. Is it possible?
You are thinking wrong; the answer is yes.
Selective undo, I mean. I know that it is only the last commit and the delta that would be stored, as you say.
You can get back a specific version.
The first version of the code?
Oh yes.
Then committing up to a certain level, two, three levels, so version two, version three, log one. Now, there is no one.
Yes, so the log is updated.
So many are there, right, so many versions. So, I want version one now.
So, like, is there any upper limit?
No limit. Whatever versions are there, anything can be got; that is the whole point.
Progressive scan?
Sorry?
Something like progressive scan, it keeps on increasing.
You probably do not see that because, sorry, we have used the same file and we have made minor changes. The best way to see it is possibly this: I have got a file called version. I think I deleted something else in between. I wanted to show one, two, three, four, so that you know which one has come, because in that file it is not there. But it will get you the one you want. All right. Anything else? In any of the Python stuff, version control, before we move on? Yeah, we have about 40 minutes. This is a reasonable time for the exercise I have in mind.
That is not a test. That is a testing tool. Okay. With unittest, we saw that for each module you write only the corresponding unit test module. For GCD, you write a test GCD. So, a large application can have modules spread in a nice directory structure. I can have a menu, then sales, this, that. So, the code will be distributed in a directory structure. So, you will have for each of them a corresponding unit test. How do you run all of them? You have to go to each directory and run, which is a pain. So, nose's job is: it will sniff out all these fellows and run all of them. It is simply another convenient tool. Again, nothing much to learn there. You simply have to know the syntax.
The idea is that unit tests are always for a module, at a one-unit level. But when you want to do what is called regression testing, you have made a change and you want to run all the unit tests. So far, we have not seen anything which gives us an easy way. The way we have done it so far, you have to go and run each unit test module, which is a pain. If you have hundreds of modules, nose will automate that job for you. That is all. So, it works at one level higher than unittest or doctest.
The advantage of nose is that it does not care whether you have unittest or doctest or whatever. It will run everything. All right. Next question.
Other than the basic image processing things that we have seen yesterday, anything like, if you want to do further processing, filtering and all such things: is that support available with Python?
Among the Python aficionados, there is a phrase for the language: Python is a batteries-included language.
Just as the imread function is available to us, do we have functions to read MP3 and related files? So, if we want to process some elements and look into that, is that functionality available?
I do not know specifically, but in general the answer is normally yes. You have a large number of libraries for handling different things. Do you have the Python docs online? I mean on your machine. Let us take a look at the Python documentation: library reference, built-in types, data types. You have a datetime module, calendar. These are built-in modules. collections, heapq, bisect, array. Obviously, numpy has even better arrays. It is the same. mutex, Queue, weak references, UserList, pretty printing, numeric and mathematical functions. These are built in, by the way, meaning they come along with Python without going to numpy or scipy. You have decimal, so you can do decimal math. You can handle fractions: 2 by 5, and get a fractional number rather than a floating point number.
In file and directory access, you have all these modules. So many different ways of doing data persistence. You can read gzip, tar files, all of that. So, XDR data you can encode. Following that, cryptographic services. Once again, these are basic. That is all available. If you want AES or anything, there are third-party libraries which are available. These ones, if you install Python, come to you by default. So many: OS-related, IPC, email, JSON, mailcap, mailbox.
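
As a small illustration of two of the standard-library modules named above, `fractions` and `decimal`, here is a hedged sketch. The variable names are made up for the example, and it is written for Python 3; both modules ship with Python and need no numpy or scipy.

```python
from decimal import Decimal
from fractions import Fraction

# "2 by 5" kept as an exact fraction instead of a floating point number.
two_fifths = Fraction(2, 5)
total = two_fifths + Fraction(1, 10)     # exact arithmetic, stays a Fraction

# Binary floating point cannot represent 0.1 or 0.2 exactly...
float_sum = 0.1 + 0.2

# ...whereas Decimal works in base ten, so the sum is exactly 0.3.
decimal_sum = Decimal("0.1") + Decimal("0.2")

print(total)        # 1/2
print(float_sum)    # 0.30000000000000004
print(decimal_sum)  # 0.3
```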
So, effectively, you can send an email from within a Python program using SMTP, or receive one. There is a whole lot of markup support: you can generate HTML, XML, parse it. You can read a website like a file. The syntax is almost the same: for line in urllib2.urlopen of so-and-so.com. It will give you the page line by line. Once again, you rarely write more than a few lines of code.
These are the multimedia services. You can see very many different files can be written. I do not know about MP3 files; I think they are outside this. These are? Yeah. So, others will be available. These are graphical user interfaces, a whole set of development tools. doctest and unittest, which we just now saw. There are others. There is a debugger. Then there are Windows-specific and Unix-specific services. A whole host of them.
Now, if you are looking at something specific, you may want to simply Google for it. Chances are very high you will find it. For example, you want to do extensive data-warehousing type of analysis, large-scale data analysis. There is a package called Orange written in Python, completely open source, for huge volumes of data. Similarly, there is a Biopython module, which gives you a lot of biochemistry and biophysics analysis built in. You want to do natural language processing? One of the largest toolkits, called NLTK, is available. You can do a significant amount of natural language processing. You want AI toolkits? Yes. A lot of specific things are available. DSLs are available. Almost whatever you want is available.
So, I just typed "Python MP3 parser". I walked into something called pymedia.org. So, the PyMedia library is a Python module for WAV, MP3, Ogg, AVI, DivX, ding dong, ding dong; it allows you to parse, demultiplex, decode and encode. I have no idea what all those things mean. If they mean something to you, you have got it. Some of you have expressed an interest in 3D visualization or 3D. Please look at Mayavi.
Mayavi is written by our own program, by the way, and it is probably the single largest open source application contributed out of India. It is a huge application. We can probably take a look at it. That is Mayavi. If you are interested in these sorts of things, I suggest you keep in touch for the next SciPy conference in India. Every year, we have a scientific Python conference. For example, the Berkeley Neurosciences Group does a huge amount of Python work, and they are a big contributor to SciPy. And of course, all of it is free. Mayavi is free. You can download and install VTK; this uses VTK, which is fairly standard. As far as visualization goes, VTK is a reasonably standard format. So, Mayavi can read and write VTK format. Anything else?
All right. So, maybe let us write some code. What shall we write? I will give you a description. Let us discuss amicable pairs. The problem is to generate a list of all 5-digit amicable pairs. Two numbers are said to be amicable if the aliquot of the first is equal to the second number and the aliquot of the second number is equal to the first number. What do you mean by the aliquot of a number? It is the sum of the factors of the number. For the purposes of the aliquot definition, a factor is any number smaller than the given number, including one, that divides the number without a remainder. So, given this way of defining a factor, the aliquot of a prime is obviously 1, and the aliquot of 2 to the power n is 2 to the power n, minus 1, because it is 1 plus 2 plus 4 and so on up to 2 to the power (n minus 1), which is nothing but a geometric series sum. So, those two examples are given just to give you an idea of what the aliquot is. So, go ahead, create a repository, write some code.
Anybody has anything to say, any comment on the aliquot problem? Sorry, the amicable pairs. So, that is the time it took: about 25 seconds, less than 25 seconds. And that is the code. If you do not have any questions, we will declare the program understood and stop the session.
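
For readers following without the screen, the aliquot definition can be checked with a direct, unoptimised sketch. This is not the program shown in the session; the function name `aliquot_naive` is made up for illustration, and the code is written for Python 3. The pair 220 and 284 is the classic smallest amicable pair.

```python
def aliquot_naive(n):
    """Sum of every factor of n that is smaller than n (1 included)."""
    return sum(f for f in range(1, n) if n % f == 0)


print(aliquot_naive(13))   # a prime: only 1 divides it
print(aliquot_naive(16))   # 2**4: 1 + 2 + 4 + 8
print(aliquot_naive(220), aliquot_naive(284))  # each is the other's aliquot
```

Trying every candidate below n is far too slow for a 5-digit search, which is why the way the factors are found matters.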
Anything interesting to note in the program? Any construct we have not seen so far, or anything done differently, or any particular "why is he doing this" somewhere? Quite interesting. Quite interesting, yes. You know, in these days of retail malls, there is the buy one, get one free style. If you find one factor, you should get one factor free. If f is a factor, n by f is obviously another factor, so there is no need to find it twice.
Another wrinkle, which I do not know whether anybody noticed: you cannot run the loop all the way up to n, it would be too slow. Your divisors will be from 2 to half of that number. Not half: the square root of that number. 2 to the square root of that number. Because for every factor below the square root, there is a corresponding factor above the square root. So, if I can find all the factors below the square root, I have automatically found all the factors above. That is where most of the speed-up comes from.
I cannot do "equal to" here, because if it is a perfect square, the square root will get added twice. It is a small bug, so you have to be a little careful. Because when you add the factors of a perfect square, you do not count the square root twice. You say 16 has factors 1, 2, 4, 8. You do not say 1, 2, 4, 4, 8. If you wrote "equal to", you would end up adding 4 twice. So, that is the only bit of care you have to take. Otherwise the code is fairly straightforward.
The other wrinkle is to ensure that if a, b is reported, b, a is not reported. That we ensure by checking. We take a number and we ensure the aliquot of that number is always greater than that number. So, we always report in ascending order. So, automatically the other pair will not be reported. And we are careful to find the aliquot of the other number only after checking that, so that we save that much computation also. Every bit helps in order to get that 22 seconds. All right. That brings us to the end of what I wanted to talk about and share with you.
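
The three points just discussed (each factor found brings its partner for free, the search stops below the square root with the perfect-square case handled separately, and pairs are reported only in ascending order) can be put together as follows. This is a sketch under stated assumptions, not the program shown in the session: the function names, the range arguments and the Python 3 syntax are all choices made for this illustration.

```python
def aliquot(n):
    """Sum of the factors of n smaller than n, found in factor pairs."""
    if n < 2:
        return 0
    total = 1                       # 1 divides every number
    f = 2
    while f * f < n:                # strictly below the square root
        if n % f == 0:
            total += f + n // f     # f and its partner n // f together
        f += 1
    if f * f == n:                  # perfect square: add the root only once
        total += f
    return total


def amicable_pairs(low, high):
    """Amicable pairs (a, b) with low <= a < b < high, in ascending order."""
    pairs = []
    for a in range(low, high):
        b = aliquot(a)
        # Check b > a first: it prevents reporting (b, a) later, skips
        # perfect numbers, and avoids computing aliquot(b) when not needed.
        if a < b < high and aliquot(b) == a:
            pairs.append((a, b))
    return pairs


print(amicable_pairs(200, 1300))    # → [(220, 284), (1184, 1210)]
```

For the exercise as stated, the call would be `amicable_pairs(10000, 100000)`; it runs the same logic over the 5-digit range and simply takes longer.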
Thanks a lot for putting up with so much Python in so short a time. Do keep in touch. Ask us any questions, any time: on Moodle till the end of the course, on the mailing list, or by mail; you know our mail IDs, I presume. Okay. Thanks a lot. Hope to see you soon at the next workshop.