So, as I said, the institute is arranging an orientation program on the 20th, and I have been asked to speak there, to all the thousand people, on some of these matters, particularly TA duties in general and so on. So that portion, which ordinarily I would have started with, I am postponing to the 20th, and today and on Monday we shall have purely technical discussions and orientation in the lab. Our environment is almost entirely Unix, except for the administrative support staff, who like to use Microsoft Word, Excel and so on. They use it, and therefore some of us have to maintain a dual identity. So I have, for example, a few machines which run purely Ubuntu or Linux, some machines which run purely Microsoft, and some which are dual-boot machines. But the lab environment that you will be working with, and particularly the lab environment which the BTech students for whom you will be TAs will be working with, is almost entirely Linux. So familiarity with Unix is the first objective. Those of you who have worked extensively with Unix might find this a little repetitive, but it is worthwhile, because what I have understood in almost 30 years of my brush with Unix is that however extensively you have used Unix, there are still any number of things which you do not know. If it is true of me, it is most likely true of everybody. Which is okay. I think about 8 or 10 of you, right? Could you please raise your hands again if you have worked extensively with Unix? Yes, about 7 or 8 people, which is fine. So those people might find something repetitive. In fact, I would urge those of you who know Unix well to help the others in the lab, because for those who have not used Unix, although the fundamental concepts are the same as in any operating system, the command structure, the shell scripting, practically everything is so different that it requires some time to get used to. So without further ado we will start our discussion.
Incidentally, the schedule has been posted to all the senior TAs. Unfortunately I did not get a printout of the schedule; maybe I will try to arrange for a physical printout to be distributed to you in the lab. But today, 9 to 10 is the lecture, maybe until 10:15 or so, and 10:30 to 12:30 is the first lab, for which a handout has been given to you. Ordinarily, with every lab there is a submission, and that submission is a single file which you have to upload on an interface called Moodle, on a submission link. This will be true for you when you do your labs, and it will certainly be true for the undergraduate students for whom you will be TAs. Unfortunately, the Moodle access and all of this uploading and so on can happen only when all students have received LDAP logins and courses have been created corresponding to those LDAP logins. Now, I presume all of you did your registration yesterday evening, which means you would have got your LDAP login ID. Unfortunately, there is another interface in the institute which has to create a course in Moodle. Today, if you go to Moodle... did you see Moodle yesterday? Okay, fine, I will just show the Moodle interface. But today there is no way I can put up an assignment which you can automatically download; there is no common interface available. So consequently the assignment has been given in printed form. If by afternoon the Moodle course called TA Orientation Program has still not been created, I will mail all of you the assignment, which will be a single file that you can copy on to your machines and untar, and you can then do the assignment in the afternoon. So today's schedule is: first a lecture of maybe one to one and a half hours, then a two-hour lab; then in the afternoon, two to three is another lecture, which is on C and C++ programming, and there will be a corresponding programming lab in the afternoon. I can also tell you what will happen on Monday.
On Monday we again meet at nine o'clock. There are, I think, two lectures again, and nine to ten is additional activities in C programming; there will be an assignment. Subsequently we will also ask you to set questions, and, by the way, you will take a C programming test on Monday afternoon. It is Monday or Tuesday, I do not remember; I will check the schedule. But as of now, after this you go to the lab, and the schedule will be available in the lab there. The senior TAs are around here. So, the logins: is there any chance of the regular LDAP logins being used by them? We do not know that. CC means... they said they will confirm at about ten o'clock. Fine. So you might want to go there and cross-check. Unix, like any other operating system, is a comprehensive package of utilities and tools. The basic Unix operating system permits you to execute your programs as processes. Each program that executes inside the Unix operating system, or under the control of the Unix operating system, is given a process ID. Your process is loaded by the command that you give. For example, if you say gcc, it is the compiler which has to run, so the compiler will be loaded and it will be given a process ID, and so on. If you have compiled a program and you want to run the executable, you load the executable and that executable will be given a process ID. There are a few peculiarities of Unix which you might wish to note. Whenever a process executes inside the Unix environment, a certain set of environment variables is made available to it. So, as far as your executing program is concerned, the environment consists of certain variable values being made available to the process by default, and certain files being made available to the process by default. The files which are automatically opened by the operating system for any program that runs are typically called stdin, for the standard input file, stdout, for the standard output file, and stderr, for the standard error reporting file.
These files you do not have to open. They are automatically opened when your process starts, and they are automatically closed by the operating system when your process ends. By default, these files are connected to the following devices. Since stdin is the standard input, it will be connected to your keyboard. Consequently, whatever you type on the keyboard after the process starts, that stream of bytes will go into your process, depending upon what your input statements are and so on. So if it is a C or C++ program that you are executing, and if you have said cin >> a >> b, which means you are reading two values for the variables a and b, then it will expect this input to come from stdin, that is, from the keyboard. Similarly, the output of any cout statement or printf statement in your program will go on to stdout, which is connected to your monitor. By default, stderr is also connected to your monitor. This is natural, so that when your program executes, it collects inputs from your keyboard and produces output on the terminal. Unix, incidentally, does not distinguish between devices, files and directories. For Unix, all of them are files. Unix has a file system which we will briefly describe later. But what I am trying to show you here is the kind of environment that is given to any process which runs under Unix. The environment also consists of a variety of what you may call preset environment variables. An important variable, for example, is called PATH. PATH usually consists of a series of directories, like /usr/bin, whatever, whatever. These directories represent the search path that is used when a running process refers by name to a program which is expected to be found somewhere in the file system. That is the reason why the path must include all the directories or sub-directories where a required component might be located.
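The default wiring of the three standard files described above can be tried out with a few lines of shell. This is only a minimal sketch, assuming a POSIX shell; the helper name report and the file names out.txt and err.txt are made up for illustration and are not part of the lecture material.

```shell
# Work in a scratch directory so nothing in the current directory is touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# The shell variable $$ holds the process ID of the running shell.
echo "This shell runs as process $$"

# report is a hypothetical helper that writes one line to each output stream.
report() {
    echo "normal output"          # goes to stdout (file descriptor 1)
    echo "error message" >&2      # goes to stderr (file descriptor 2)
}

# Without redirection both lines would appear on the terminal.
# Redirection reconnects each stream to a file instead.
report > out.txt 2> err.txt

# stdin can be reconnected as well: here it is a pipe, not the keyboard.
values=$(printf '3 4\n' | { read -r a b; echo "a=$a b=$b"; })
echo "$values"
```

Running report with no redirection shows both lines on the terminal, which is the default behaviour the lecture describes; the redirected form shows that the two output streams are in fact separate.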
So typically, if you are doing a compilation, the compiler and the tools that it runs need to be accessible, and the path will determine that; the libraries required for linking are found through search paths of their own. PATH is just one of the many environment variables. A new environment variable of your choice can be created in Unix by just saying the name, equal to, so and so, exactly as you would say PATH equal to so and so. But when you want to refer to the value of that variable, you prefix a dollar symbol. A typical way of showing the current value of any environment variable is to say echo. So if you say echo $PATH, the Unix operating system, in response to this command, will actually display the entire string which is the value of the path. Otherwise, programs run exactly the same way as you run them in any other operating system environment. Unix has an extensive file system which is managed on the disk. The Unix file system is extendable, which means you can put in any number of files and you can create any number of directories, sub-directories and so on. There are some standard directories with some implications, which I will show you here. Essentially, the directory and sub-directory structure follows exactly the pattern that you are familiar with. There is a root, which is referred to by a slash. Those of you who have worked in the Microsoft environment will recall that there you use a backslash, the other slash, whereas it is the standard forward slash that is used in Unix. So any absolute path specification will begin with a slash: /usr/something, whatever. Under the root, of course, you will have sub-directories, some of which are standard sub-directories maintained by the operating system. There could be some user directories which you will be maintaining yourself. So, for example, this could be mydir, and under that you could have any sub-directories, any files, whatever. As I said, as far as Unix is concerned, every file is merely a sequence of bytes. You may be able to append to a file; you may be able to delete elements in a file.
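The handling of environment variables discussed above can be sketched as follows. This assumes a POSIX shell; the variable name COURSE_DIR and its value are hypothetical, chosen only for the example.

```shell
# Create a variable; there must be no spaces around the = sign.
# COURSE_DIR is a made-up name used only in this sketch.
COURSE_DIR=/tmp/ta_orientation

# Prefix the name with $ to read the value back.
echo "$COURSE_DIR"

# export places the variable in the environment handed to child processes.
export COURSE_DIR
child_view=$(sh -c 'echo "$COURSE_DIR"')
echo "child process sees: $child_view"

# PATH is a colon-separated list of directories; print one per line.
echo "$PATH" | tr ':' '\n'

# Ask the shell which file on the search path the name ls resolves to.
ls_location=$(command -v ls)
echo "ls is found at: $ls_location"
```

The export step matters because a plain assignment is visible only inside the current shell, whereas the lecture is describing values that every process started from it receives by default.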
Files may grow by extension; files may grow by insertion of something in between. The Unix file system is fairly flexible. It maintains a free block list, and it keeps allocating blocks when you want to extend a file. If you delete something, it will absorb the blocks that you so return, and it generally manages things. It attempts to keep file blocks contiguous, but obviously it cannot do so when files undergo changes. So usually there are back-end activities, which we call system administration activities, which will occasionally compact all free blocks and so on. It is the standard stuff that you would do on any operating system. Any doubts so far? I think this is simple stuff. So what I have just discussed is Unix processes and environment. I will, by the way, upload this entire set of slides on Moodle. There is a chicken-and-egg problem: if there is no Moodle, how will you refer to this? What happened there? Is Nagesh not there? No, no, no, not tomorrow. Tomorrow is too late. They might need it today. So anyway, when you go to the lab, whatever I have described in terms of the commands on Unix you can actually get very easily by a command called man, which refers to the manual. I will show you how that is to be used, and when you are in the lab you can refer to those commands. But if you want to refer to these slides, I will get them Xeroxed and circulated to you before your lab ends, so that you can carry them home. Just remember to tell my staff there. Hello. The first thing that you do in any operating system is log in. As I said, your LDAP accounts have been created, which will permit you to log into the institute Moodle and so on. But you cannot log into the lab systems with them. Essentially, we are not using the LDAP login for our laboratories, where we have separate servers on which accounts are created for our students. Your accounts will be created in about one or two working days' time. So before your classes start your accounts will have been created on the institute system.
Today, unfortunately, I am not sure whether those accounts will have been created. So they are arranging for some special user IDs, I believe. We have a category of people who support us technically, called system administrators. Some of them are regular staff members of the institute. Many of them are students like you who do that kind of teaching assistantship duty and become system administrators. They are called sysads. They are like gods: they hold control over all systems. Mere mortals like you and me will have to approach them with folded hands, saying please create my user account, or please give me my login ID, please reset my password, whatever, whatever. Despite having the status of gods they are quite amenable. They are very helpful, and they are the ones who organize things for us. So, as I said, your accounts will be created, but today, at least in the first lab, you will possibly be using some dummy accounts. These will be announced during the lab session itself by your senior TAs, who are helping me conduct this program. So, as I said, we are assuming no background in Linux. For you we are, of course, assuming a reasonably good background in operating systems and computing environments, which will not be so for the undergraduate students for whom you will be doing TAship, particularly the first-year undergraduates. There are several who have never seen a computer in their lives. Last year I had 80 of them. Of course, the class size was 860, so 80 is still about 10%, which was surprising, because we believe that all students who come to IIT would generally be familiar with computers. There were indeed about 100 or 120 students who had actually done extensive programming in C and C++ during their 11th and 12th standard. So you can see the demands that will be made on you as teaching assistants if you ever get associated with the first-year BTech course. On one hand there will be people who have absolutely no clue about what computing is.
On the other hand, there will be people who probably know programming better than we do. So they will have to be challenged with more difficult problems. They will have to be engaged on a different plane, and yet you cannot forget those who do not know much. You have to train them properly. That is the problem. In a sense, this problem exists with any batch of students at any level. Take your own batch. As we saw, there are about 8 or 10 people who have worked extensively on Unix. It is just that their colleges had a Unix environment. Now, since Unix is the mainstay here, they will appear to have an advantage over the rest of you. But that advantage should be short-lived, and in my opinion it should not extend beyond one week. That means within one week you should all become familiar with exactly the same things. Something similar will happen with the undergraduate students also. Instead of one week they may take two or three weeks, to which you people as TAs should be sensitive. So we start with the terminal window, where you can type commands. This wording is meant for first-year BTech students, not for you; you know exactly what windows are and so on. As far as this keyboard... Oh! I am using this keyboard, which is not connected anywhere. Sorry? But then it doesn't work. The spacebar works. The arrow key doesn't work. So basically, in Unix, if you want to create a directory at any level... let's get back to this slide here. If you want to create a directory under mydir here, then the command that you give... Incidentally, the command prompt that you will see will be a dollar symbol. So when you log in you will get a dollar symbol, which is the command prompt for Unix on any terminal. Although there are graphical windows, a whole lot of activity in Unix generally happens through commands that you give, and therefore scripting and command interaction is a more natural way of handling Unix. So that is something that you should become familiar with.
The Microsoft environment typically permits very extensive usage of graphical interfaces. So you can say, click the mouse here, do that, and so on. The Ubuntu environment that you have in the lab will also permit you to do so. But as natural Unix programmers, your propensity should be towards going to a command terminal, giving commands and doing things. Most of the complex things in Unix are generally achieved like that. As a matter of fact, take any operating system. If you want to do simple things, like copying a file from one directory to another, a graphical user interface is okay. But if you want to do something fairly complex, where you have to resort to writing scripts in the command language of that operating system, there is no choice but to go to a terminal and do it there, whether it is Microsoft, Unix, IBM AIX or whatever. So essentially, making a directory means you give this command, mkdir, followed by a name, to create a directory. Ordinarily you will be in your home directory when you log in. If you want to change to a sub-directory, so that commands are executed within the context of that sub-directory, you have to use the command cd. cd stands for change directory. These are some of the essential Unix commands that you should be familiar with. ls is a command which lists all the files present in the working directory. The working directory is the current directory, the one in which you are. pwd displays the present working directory. cd changes directory. cd .. (cd, space, dot dot) goes to the next higher level in the hierarchy tree, so it backs up one directory level. rm is to remove a file. It is a very powerful command. And there is an option called hyphen r, or minus r. Minus r stands for recursively remove files. So if by chance you are in the home directory and you say rm -r *, you can say end of the day and go home, because there will be absolutely nothing left in the directory. So the rm command should be used with great caution.
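The directory commands listed above can be practised safely inside a throwaway directory. The following is a minimal sketch assuming a POSIX shell with mktemp available; the names mydir and notes.txt are only examples.

```shell
# Everything happens inside a throwaway directory created by mktemp.
sandbox=$(mktemp -d)
cd "$sandbox" || exit 1

mkdir mydir            # create a sub-directory
cd mydir || exit 1     # make it the working directory
pwd                    # print the working directory
touch notes.txt        # create an empty file so there is something to list
ls                     # list the files here
cd ..                  # back up one level (note the space before the dots)

# Before removing anything recursively, check where you are,
# and name the target explicitly instead of relying on a wildcard.
pwd
rm -r "$sandbox/mydir"
```

Naming the full path in the rm -r line is a deliberate choice: the command then removes the same directory no matter where the shell happens to be when it runs.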
In general, a good practice is that even before you type rm you first type pwd to see where you are, so that you know you are in the appropriate sub-directory, the one whose contents you wanted to delete. This mistake is very, very common in the early days, so you should be careful about it. cp copies a file. mv moves or renames a file. So the equivalent of the rename command in the Microsoft environment would be mv, that is, move. The file is not actually physically moved. It would in fact be moved if the file had to go from one physical disk volume to another physical disk volume. But within the same file system, if you say move to any other sub-directory, it is like renaming: it relocates the index entries which point to that file. As you all know, files may physically exist anywhere, but access to a file is through some pointers which are maintained by the operating system. The file system says, these are the pointers, so presently this pointer is here. If you say move, the pointer will be shifted to that point. Okay. So a new name will be inserted in the directory structure, the name that you have given to the file, but that new name will still point to the same physical file. mkdir, as you just saw, is to make a directory. rmdir is to remove a directory. In general, you can group the basic Linux commands into general purpose utilities, commands which relate to the file system, file handling commands, and commands for compressing and archiving files. You are all familiar with zip. You have a collection of files, directories, sub-directories and so on, and you want to put them into a single container. So how do you get a container like that? You select a list of files or directories and you say zip them. The equivalent of zip is tar. tar stands for tape archive. We shall see how that command is used for tarring and untarring, the equivalent of zipping and unzipping. And then, of course, there are some simple filters. This is what we are going to discuss. So these are some general purpose utilities.
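The point made above about mv, that a move within one file system changes only the name and not the file, and the use of tar as a container, can be checked with a short experiment. This is a sketch assuming a POSIX shell and a tar that accepts the -C option (GNU and BSD tar both do); the directory and file names are invented for the example.

```shell
sandbox=$(mktemp -d)
cd "$sandbox" || exit 1

mkdir project
echo "hello" > project/sample.txt

# cp makes a second, independent copy of the data.
cp project/sample.txt project/copy.txt

# ls -i prints the inode number, the file system's internal handle for a file.
before=$(ls -i project/sample.txt | awk '{print $1}')
mv project/sample.txt project/renamed.txt
after=$(ls -i project/renamed.txt | awk '{print $1}')
echo "inode before: $before, after: $after"   # same number: only the name changed

# tar options: c = create, t = list, x = extract, f = archive file name follows.
tar -cf project.tar project
tar -tf project.tar

# Unpack into a different directory to show the archive is self-contained.
mkdir restored
tar -xf project.tar -C restored
restored_text=$(cat restored/project/renamed.txt)
echo "$restored_text"
```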
cal gives you a calendar. So if you ask cal for January 2009, you will get the January 2009 calendar; for August 2010, you will get the August 2010 calendar. A useful utility. date is a utility which will print the date. You can actually print it in a format where you get the date, hours, minutes and so on, so the current time is printed as well. It is fairly simple stuff. I told you about echo. echo is a command which echoes to stdout, which means on to your monitor, anything that you say after echo. You know the English word echo, right? You go to a mountainous region and you shout, and the mountains return the shout to you. You call that an echo. It is exactly that. You say echo this, and the monitor echoes it back to you. So if you want to echo a message on to the terminal, you put in that message. But why would you type echo message and then see the message? You are seeing it while you are typing it anyway. Any idea why such a thing would be necessary? echo message. See, common sense, yes. That we shall see later, when we talk about printing out values of environment variables, and that is a correct answer. As I said, I want to know what the current path is, so I say echo $PATH. When I type echo $PATH, I do not know what the value of the path is; echo will find out that value and print it, which is okay. The question I am asking is, why would I ever say echo message? Say echo hello world. It will actually say hello world. But hello world is what I am typing anyway. So why would I do that? echo hi. echo hello world. Why would I do that? Because only what I am typing is what I am going to get out. "We will give a condition. If it is true, we will echo a message, and if it is false, we will echo another message." Okay. So let me expand on that. This explanation is absolutely correct. Ordinarily we looked at echo as a single command which somebody types to get a result.
That makes sense only if I am echoing values of environment variables or something like that. echo followed by a string would not ordinarily make sense if I want to execute it as a single command. But what happens is that the single commands I have listed can be collected together and issued as a sequence of commands, like the instructions in a C program. Your C or C++ programs are compiled and then executed. A set of commands given to Unix like this is called a shell script, which is essentially a program written in the shell programming language. Now, if you write a program like this, within it you might test whether this process is still running or not, whether that process is there, or whether that file exists or not. Then, using the if-then-else control structures which the Unix shell provides, you might want to say: if the file is not there, show me the error message file does not exist; otherwise, show me a message that I am proceeding further. So there will be an if-then-else statement within which you will use echo this string or echo that string. And that is where echo with a specific string does make sense. So you should in fact become very, very thoroughly familiar with shell programming, because that is what you will be using for the next two years here. You can also use printf; you are all familiar with C and C++, so printf is okay. There is a calculator. It is a text-based calculator: you enter an input like this and it produces output. Control-D is essentially an end-of-file symbol. You know that all files have an EOF associated with them; in your programs, where you open a file, you read the file till end of file. So ordinarily, when a file runs out, the operating system gives you a special indication called EOF. When stdin is connected to your keyboard, the Unix operating system will perpetually be looking at the keyboard for input. How do you signal end of file? You signal it by pressing Control-D.
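The use of echo inside an if-then-else, as described above, might look like the following. This is a minimal sketch assuming a POSIX shell; the function name check_file and the file names are hypothetical, and the messages are only illustrative.

```shell
sandbox=$(mktemp -d)
cd "$sandbox" || exit 1

# check_file is a made-up helper: it reports whether a regular file exists.
check_file() {
    if [ -f "$1" ]; then
        echo "proceeding: $1 exists"
    else
        echo "error: $1 does not exist"
    fi
}

touch data.txt
msg_present=$(check_file data.txt)
msg_missing=$(check_file missing.txt)
echo "$msg_present"
echo "$msg_missing"

# When stdin is a pipe instead of the keyboard, the end of the piped data
# plays the role that Control-D plays at the terminal.
line_count=$(printf 'one\ntwo\nthree\n' | wc -l | tr -d ' ')
echo "lines read before end of file: $line_count"
```

Here echo is never typed interactively; which of the two strings appears is decided by the script at run time, which is exactly the situation in which echo with a fixed string is useful.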
When you press Control-D, it signifies end of file on stdin. xcalc is a graphical calculator. I use the term shell scripting to contrast it with conventional programming languages, where programs are pre-compiled. Scripts are not pre-compiled. However, script is also a specific command in Unix. So here it is: script demo. Well, demo is, let us say, the file in which the session gets recorded. So it says script started, file is demo, and so on, and at the end, script done on Saturday 26 September, so and so. So basically, whatever commands you give in between, script is capable of recording them, putting a tag at the beginning and a tag at the end. passwd: changing the password. Notice the spelling. It is not password, it is passwd. When you say just passwd, it means, of course, your own password. If it is passwd followed by a username, it means that user's password. Of course, you cannot arbitrarily go around changing anybody's password unless you have the authority to do so. Generally, such authority to do anything with any user rests with what we call the superuser in Unix, or the root user. And none of you would have superuser access on the machines that you will be using in this lab. You remember the category of gods that I mentioned? Sysadmins. So sysads hold all the root passwords. If you make friends with them, you might occasionally get that privilege. who is the command which tells you who all are currently logged in. Incidentally, this is how you would see it: this dollar symbol is the prompt that the operating system gives you, as I told you, where you write all your commands. However, before the dollar symbol you will generally see a host of characters. These will typically identify the user, your home or current directory and so on, telling you where you currently are. So that is an additional input that you can get. If you give a who command, it will tell you which users are currently logged in. So, for example, there may be a guest user who is logged in on the terminal pts/2.
And Mansi, who is herself trying to investigate who is logged in, finds out that she is actually logged in three times, from three terminals, which is perfectly fine with Unix. It shows the user, the device file, the date and time, and where the user logged in from. So it is a very, very useful command. One of the assignments in the task that I have given you is to write a shell script which will collect all the user data, sort it, print it and so on. This is a small thing. w gives you more details about the users; otherwise it is exactly the same thing as who. man is probably the most important command, certainly in the initial days of your interaction with Unix, but also throughout your life of interaction with Unix. Even today I use it extensively, because Unix is so rich that it is impossible to remember the syntax and semantics of every command and every option. In fact, that is true of any complex field. Take any operating system; you cannot remember all of it. So man was introduced in Unix in the very early days. man stands for manual pages display. Every Unix operating system, right from day one, from the 70s, has always come bundled with manual pages. If you give a man command followed by any command name, then the manual pages for that command are displayed to you. So if you say man ls, for example: the name of the command is ls, list directory contents; the synopsis is ls, options, files. This is just a sample, but you should actually spend most of your time today looking at various man pages, because that is the best way of assimilating what features are available. And there are extensive features and extensive options for every command. This is just the format in which the man command displays the manual page. The manual page is displayed one full screen at a time; you can go to the next page using Page Down or Space. And when you are done with any manual page, you just type q, which is for quit, and you will get out.
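Returning to the who output discussed above, one possible way of sorting and counting logged-in users is sketched below. It assumes a POSIX shell with awk, sort and uniq; the function name count_logins is hypothetical, and the sample lines are invented stand-ins for real who output, used so that the result is reproducible.

```shell
# count_logins is a made-up helper: it reads who-style lines on stdin
# (user name in the first column) and prints "count user", sorted by user name.
count_logins() {
    awk '{print $1}' | sort | uniq -c | awk '{print $1, $2}'
}

# Invented sample text standing in for real who output.
sample='guest   pts/2   2009-09-26 10:01
mansi   pts/0   2009-09-26 09:15
mansi   pts/1   2009-09-26 09:20
mansi   pts/3   2009-09-26 09:42'

summary=$(printf '%s\n' "$sample" | count_logins)
echo "$summary"
```

On a real system the same helper would be fed from the live command, as in who | count_logins. The exact columns that who prints vary between systems, so only the first column is relied upon here.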
The Linux file system internals is a fairly involved topic and out of scope for discussion here. But you would generally be familiar with the file systems of various operating systems, right? So you know how file systems are organized. The Unix file system is based on the concept of inodes, which exist in a separate area. There is an inode, usually expanded as index node, for every file, which may be a directory or a device file or whatever. As I said, a file is extendable, so it is allocated blocks as they are required, and blocks are deallocated when they are not required. What I am going to describe here is the more mundane part of the file system, which is information about how files are typically organized in Linux. So slash, or the root, stands for the topmost, root-level node. Under that, these are some of the standard names. /home is the home directory for all users. /root is the home directory of the privileged user. /dev is all the devices which are accessible as files: terminal, keyboard, mouse, whatever you have, all external devices, printers. As I said, Unix treats all of them as files. They are called special files. /mnt is used to mount other directories or partitions. So typically you take a pen drive like this and you insert it in a USB port. The pen drive contains its own file system. Now you are pushing this in, and it has to be made part of the root file system. So under the root there will be some directory, some directory, something, at which point you might want to connect the root of this file system. So unlike in the Microsoft OS, where a new device always goes at the top level along with all other devices, in Unix it is not necessarily so. You can mount a device at any mount point. I am so sorry, I forgot to switch this off. Sorry, where was I? So as I said, you can connect it to any particular point in that tree.
So suppose the particular point where you want to connect this is, let us say, /usr/farpack/something/mydevice. Now mydevice does not contain any other sub-directories or anything, but I want to treat it as a mounting point where I will mount this. So /mnt actually lists all the mount points of other directories and partitions. /var stands for variable data. Typical examples are mail files, log files and so on. /usr: usr stands for user. Unfortunately it is a misnomer. /usr typically does not contain the sub-directories of the actual users of the operating system. It typically contains all the packages which have been installed. The packages could be a C compiler, a graphics library package, any other utility package, whatever you have; all those packages will be installed under the /usr directory. Under /usr, you saw, there are several other directories: local, doc and so on. /etc is really an et cetera thing, which contains all the configuration files. It is worthwhile for you to explore. Just go to /etc and say ls, and just see what files are there. Because familiarity does not necessarily come from reading the exact syntax and executing the commands all over. You have to do that for some of the commands which you have to use extensively. But in general it is a good idea to be extremely curious about what is where and what is what. Because even if you see only a glimpse of something, the human mind will automatically record it somewhere, and later on, when you have to use it, you will be able to connect to it more rapidly. I have found it very advantageous. You might also want to do that. So these are some of the file system commands. cat is a command. cat actually stands for concatenate. So it is used to concatenate, or join together, two files. But if you give only one file name, it will only display that file. The activity is like this. We concatenate file 1 and file 2.
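The behaviour of cat just introduced, with one file name and with two, can be sketched as follows. This assumes a POSIX shell; the file names file1.txt, file2.txt and joined.txt are only examples.

```shell
sandbox=$(mktemp -d)
cd "$sandbox" || exit 1

printf 'first file\n'  > file1.txt
printf 'second file\n' > file2.txt

# One file name: cat simply copies that file to stdout.
cat file1.txt

# Two file names: the contents are written to stdout one after the other.
cat file1.txt file2.txt

# Redirecting stdout stores the concatenation in a third file.
cat file1.txt file2.txt > joined.txt
joined_lines=$(wc -l < joined.txt | tr -d ' ')
first_joined=$(head -n 1 joined.txt)
echo "joined.txt has $joined_lines lines, starting with: $first_joined"
```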
It is supposed to take file 1, take file 2, concatenate them and throw the output on stdout. So stdout is always the target of the concatenation. If you do not give two file names and there is only one file name, it will take that file and just put it through to stdout. Effectively it becomes show me that file. So you say cat file name. But the objective of cat is concatenation. So cat samplefile.txt will actually display the contents of this file. cp I have already told you about. mv sample.txt mode.txt: so this file will be renamed. mkdir is to make a directory. rm mode.txt will delete this file. rmdir one will remove that directory. rm -r followed by a directory name will recursively delete all the files, as I told you. I will make these things available, but as I said, any time you have a problem you just go to man, see the page, and just try to execute the command. As I said, rm -r is the only dangerous command. Otherwise, if you give any command and you make a mistake in the syntax, nothing happens. One great advantage of working with computers is that computers are incapable of shouting back at us. Even if we make some mistakes, at most the machine will say, I do not understand it, and that is all; I mistyped something. So there is absolutely no issue. And by the way, this is something I have noticed generally amongst our student population in the country. We are mortally afraid of making mistakes. And the only thing we are more afraid of is somebody catching our mistake. But let me tell you, one great thing that I learnt in this environment is that making mistakes is perfectly okay. In fact, only by making mistakes will I learn more. The only thing that I should try to achieve is not to repeat the same mistake again. So learning is greatly advanced if you actually make mistakes. So do not hesitate. You do not have to make mistakes on purpose; that does not add to any learning. But if you accidentally make a mistake, you should be absolutely not worried about it.
Making mistakes is okay as long as you learn from them. So do not be hesitant: should I type this command, let me first look at man. Do not do that. If you have some hunch, type it. As I said, the machine will not shout back at you. It will give an error. Then maybe think about it. Then look at the man page. Believe me, the learning that happens this way will be stronger.
Each file has attributes. Typically a file is traditionally considered to belong to three sets of people. One is the original user himself or herself. I have created a file and the file belongs to me, so I am called the owner. I might be working with a group of people who might want to access that file, edit that file, modify that file. So there is a group of users associated with that file, and consequently the group will have some access rights to that file. And finally there is the general public. Anybody may want to see that file, for example. So the public has some rights. So user, group and public are the three categories of users associated with the file.
The file access permissions are of three types: read, write and execute. So these are called read permission, write permission and execute permission. If I am the creator of the file, I automatically have read, write and execute permissions. In fact, execute permission means the file must be an executable, so that I can execute it as a command. Usually the files that we deal with will not be executable files. So these will be read-write files, or occasionally read, write and execute files. To my group, I definitely want to give read privileges occasionally, but may not want to give write privileges. When you are working on a group project, it is not uncommon for files to have read and write permissions for both the user who created them, the owner, and the group. The public typically has read access. However, you can control this access yourself.
So you can actually define for every file individually what the access permissions would be for anyone, including yourself. So for example there are groups of three characters: one for the owner, one for the group, one for others. So r, w and x are the read, write and execute permissions. And these are actually represented by binary numbers. So for a three-bit binary number, what is the largest value? Seven. And the smallest value is of course zero. Depending upon where that bit is, one bit represents read, one bit represents write, one bit represents execute. If that bit is one, that permission is available. If that bit is zero, the permission is not available. You can now see all possible combinations. So you can actually describe by a number the permissions that you want to give to the individual user, the group and the public. But you can also use the letters r, w and x.
If you say ls -l, that means list giving more information. For a file which is shown here, prog.c, it will show you some other details of that file. But more importantly it will give you all the file permissions. So rw-, r--, r--. Invariably you will find these are the default permissions associated with any file that you create.
No, no, that hyphen is from the fellow who prepared this slide. Before the rw, this hyphen is what my fellow wrote as a hyphen. There is no hyphen there. In the previous slide I saw there is a d written there. No, no, that is a directory kind of thing. Just go and play with it, and I will find out. But there is no hyphen there. This hyphen was inserted by my colleague who prepared this transparency. It is not Unix's doing. It is that fellow's doing. So, wait a sec. Is there a hyphen at the beginning, where there seems to be a hyphen everywhere? I doubt it very much. Oh, it says whether it is a directory or a file. Sorry, sorry. See, I am really getting old. Thanks. So, he was right, and you were right to suspect something. That d that was written there.
So, if there is a dash, that means it is an ordinary file. If there is a d, that means it is a directory, under which there could be other files. So, when you say cd, you can cd to any one of these entries which has a d here, which means that it is a directory. I am terribly sorry. Moving on. But what is important here from the permission point of view is: read, write and no execute; read, no write, no execute; read, no write, no execute. So, these are the three sets of permissions allocated to people.
You can change these permissions by using a command called change mode, which is written chmod. Unfortunately, the original authors of Unix, I mean the whole group of Kernighan, Ritchie, Thompson and all the other fellows who wrote the Unix operating system in the early 1970s at Bell Labs, did not want to type long things, I suppose. So, they designed compact commands. Abbreviation in general is a forte of all computer people. So, chmod is the command given. If you want to change the mode for prog.c, you can say 775. So, 7, as you can see, is 111. That means all of read, write and execute permission. The next 7 also says the same thing for the group. But 5 is 101. So, you can see it appropriately: read, write, execute; read, write, execute; read, no write, but execute. So, these permissions are given. You can set these permissions. You can also set the default permissions for yourselves, so that for any file that you create, the default would be something like this.
As I told you, cat can concatenate files, and the output goes to stdout. tac can concatenate them in reverse, so the last line of the file will come first.
more and less are for paging output. If you say cat, as I told you, cat will display the file on stdout, which means you can see it on the screen. But if the file is a thousand-line file, it will whiz past so quickly that you will see only the last screenful. You would like to scroll through it and so on, without having to go to an editor. So, for that, the commands are more and less.
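Before moving on to the pagers, the chmod digits above can be checked on a throwaway file. This is a minimal sketch, assuming a POSIX shell with mktemp; prog.c here is an empty stand-in file and subdir is a made-up directory name. Each octal digit is the sum of r = 4, w = 2 and x = 1, and the first character of the ls -l string is the file type (a dash for an ordinary file, d for a directory).

```shell
#!/bin/sh
# Octal permission digits and the matching rwx strings.
# prog.c and subdir are hypothetical stand-ins created just for this test.
set -eu

workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir"
touch prog.c

chmod 775 prog.c                          # 7 = rwx, 7 = rwx, 5 = r-x
mode_775=$(ls -l prog.c | cut -c1-10)
echo "$mode_775"                          # -rwxrwxr-x

chmod 644 prog.c                          # 6 = rw-, 4 = r--, 4 = r--
mode_644=$(ls -l prog.c | cut -c1-10)
echo "$mode_644"                          # -rw-r--r--

chmod g+w,o-r prog.c                      # letter form: add group write, drop others' read
mode_sym=$(ls -l prog.c | cut -c1-10)
echo "$mode_sym"                          # -rw-rw----

mkdir subdir
chmod 755 subdir
dir_mode=$(ls -ld subdir | cut -c1-10)    # -d lists the directory itself
echo "$dir_mode"                          # drwxr-xr-x
```

The cut -c1-10 keeps only the type character and the nine permission characters, so the result does not depend on the owner, size or date columns that ls -l also prints.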
more actually means show more of the file, and less means show less of the file. Both serve the same purpose. Actually, there are options there which can show the initial part or the latter part of the file. But in general, less filename has become a very standard usage. Control A, right? Yes. Now I have got my pointer. It was disappearing. So, less filename has become a very standard way of displaying any text file on the screen. You can also say cat filename | less, but I will discuss that when I show you redirection. So, it behaves exactly like the manual page. Whenever you want to go to the next page, you press space and you go to the next page.
wc gives you a word count. So, it counts the number of lines, words and characters. wc sample.txt will give you so many lines, so many words, so many characters. wc -l, wc -w and wc -c will give you the individual counts if you want. wc alone will give you all the counts.
This is something that I would like you to be very conscious about. I have seen many people not being able to do that. All of you have compiled C programs, C++ programs, in whatever environment, Microsoft or whatever. If I asked you what was the size of the largest executable file you ever created, and what was the size of the smallest executable file you ever created, can you tell me offhand? In general, how many of you ever look at the size of the executable at all? Generally, never. In general, how many times do you look at the total number of lines of code that you have written in the source program itself? Sometimes. You have all done your final year project. Most of you would have written some programs for that project. It would be a programming system, with several functions, several things. But if somebody asked you how many lines of code your group has written, your answer typically will be about 500, about 1000, about 5000. None of you would ever be able to say 4253 lines of code. Do you agree?
I think that is a very bad professional practice. It is important for you to be aware of the sizes. Let me take two minutes off and tell you why it is important. Two reasons. One is that, in general, as a rule of thumb, a larger size of source code represents larger effort and more complexity. It is a rule of thumb. It need not always be true. I mean, I can always write 50 lines of actual code and 500 lines of comments and call it 550 lines. But including comments and all, generally the larger the file, the larger the effort.
So, what is the largest programming system that you would typically have dealt with in your undergraduate days? Any numbers? No exact numbers. I am just asking for a rule-of-thumb figure. 1500? These are called trivial programming systems. Even a simple programming system would have 5000 to 10,000 lines of code. Okay. For the programming system that the Aeronautical Development Agency lab has developed, which has been building up a simulator for the last 20 years, the current code is roughly 150,000 lines. You take a banking application: the lines of code, and I am not talking of translated code but source code, would be about a million or 2 million lines. So you have to worry about the lines of code. Of course, you should be familiar with this concept and you should generally count lines of code. So, in short, I would like you to do wc on everything that you do.
Why is the executable size important? Let us go back to the days when computers had memory measured as 64 kilobytes. Have you ever heard of those days? 64 kilobytes was the main memory of a large computer. Okay. If your program got compiled and was of size 120 kilobytes, the program could not run. Today, even the most ordinary PC or desktop that you sit with would typically have 512 megabytes of memory, if not 1 gigabyte or 2 gigabytes. Consequently, the executable file sizes are very low compared to the physical memory, and consequently we have completely stopped bothering about them.
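Both sizes discussed here are one command away. The sketch below is only an illustration, assuming a POSIX shell with mktemp; hello.c and second.c are hypothetical stand-ins for your own sources, and the compile step is attempted only if a compiler named cc happens to be installed.

```shell
#!/bin/sh
# Exact line counts with wc, and the byte size of an executable.
# hello.c is a made-up stand-in for a real project file.
set -eu

workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir"

cat > hello.c <<'EOF'
#include <stdio.h>

int main(void)
{
    printf("hello\n");
    return 0;
}
EOF

lines=$(wc -l < hello.c)     # lines only
words=$(wc -w < hello.c)     # words only
bytes=$(wc -c < hello.c)     # bytes only
echo "hello.c: $lines lines, $words words, $bytes bytes"

# Several files at once: one row per file plus a total row.
cp hello.c second.c
wc -l hello.c second.c

# Executable size in bytes, if a C compiler is available on this machine.
if command -v cc >/dev/null 2>&1 && cc -o hello hello.c 2>/dev/null; then
    exe_bytes=$(wc -c < hello)
    echo "hello executable: $exe_bytes bytes"
fi
```

Run over a whole project, for example with wc -l on all the source files, the total row gives the exact figure rather than "about 5000".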
But consider a C program which is written to go into an embedded system, let us say inside a dishwasher or inside a television. You are familiar with embedded systems? Embedded systems are systems built around microcontrollers or whatever, and they do specific tasks like sensing something, switching on something, whatever. It is not uncommon that these have a read-only memory of size 32 kilobytes or 64 kilobytes where you store the final code. If you as professional programmers have never been conscious of the size of the executable, then you will be extremely bad embedded system programmers. It is not that you cannot write code, but because you are not aware, you are not conscious of doing something with your program such that the resultant code size is compact, and therefore it will be very hard for you to write a correct program which will fit into that space.
You are all familiar with the time complexity of algorithms and you do bother about it. So if the time complexity is order n squared, you try to make it n log n, order log n, whatever. But in general, space complexity has been given the go-by by all professionals. Just as execution time measured by the clock is a small representative of the time complexity of the underlying algorithm, similarly the executable file size is in some way representative of the space complexity, and I would like you to be aware of that. Sorry for the digression, but I believe these are important things.
We have seen that cmp compares two files. This file comparison is done byte by byte. So you have one sample.txt and a samplecopy.txt, and you do not know if somebody has edited the copy. You want to see whether these files are the same and, if they are not, what the difference is. So cmp will compare them and it will say something like differ: byte 2, line 1; it prints the location of the first mismatch. There are additional commands, and they are extremely useful. comm gives you what is common.
So the first column will give you lines only from the first file, the second column will give you lines only from the second file, and the third column will give you lines common to both files. Where would this comm be used? Have you heard of version control systems? When you are developing large software, typically in a group but even individually, you have come up with, let us say, one program which you want to make some changes to. That program is working, but you want to modify something. The moment you edit it and modify it, somehow it stops working. Now you want to go back to the original place, and you have forgotten what changes you have made. How do we handle that situation in the ordinary fashion? I call it program1.c, and then when I edit it I save it as program2.c. So in about 20 days' time I have program17.c, and now I do not even know which program was working well. So how these programs differ from each other is extremely difficult for me to keep track of.
A version control system permits you to maintain versions of the program automatically. It will give you the latest version any time that you want, but it is also capable of giving you any version. Of course, if it were to maintain different versions as individual files, a lot of disk space would be wasted. So what does it do? Unlike you and me, who will say program1.c, program2.c, program3.c, it will start with the root program and save it as whatever the base version is. Whenever you edit and you tell the version control system, save this as the next version, version 2, it will not save that entire file. It will find the difference between this file and the original file and save only the difference. In fact it saves the difference in the form of editing commands, such that when those editing commands are applied to the base file, the new version is created.
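The comparison utilities can be tried on two small files. This is a minimal sketch, assuming a POSIX shell with mktemp; the file names and their contents are invented for the example. The last command uses diff -e, which prints the change as ed editing commands. That is similar in spirit to the "save only the difference" idea just described, but the sketch does not claim that CVS or any other particular system stores its versions in exactly this form.

```shell
#!/bin/sh
# cmp, comm and diff on two nearly identical files.
# File names and contents are made up for illustration.
set -eu

workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir"

printf 'apple\nbanana\ncherry\n'    > sample.txt
printf 'apple\nblueberry\ncherry\n' > sample_copy.txt

# cmp reports the first differing byte and its line. It exits non-zero
# when the files differ, hence "|| true" while set -e is in force.
cmp sample.txt sample_copy.txt || true

# comm expects sorted input. The digits name the columns to suppress.
only_first=$(comm -23 sample.txt sample_copy.txt)    # lines only in file 1
only_second=$(comm -13 sample.txt sample_copy.txt)   # lines only in file 2
common=$(comm -12 sample.txt sample_copy.txt)        # lines in both
printf 'only in first:  %s\n' "$only_first"
printf 'only in second: %s\n' "$only_second"
printf 'common:\n%s\n' "$common"

# diff -e: the change expressed as ed commands ("2c" = change line 2).
edit_script=$(diff -e sample.txt sample_copy.txt || true)
printf '%s\n' "$edit_script"
```

The edit script is three short lines even though the files could be arbitrarily long, which is why storing differences is so much cheaper than storing program1.c through program17.c in full.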
These are some of the utilities which permit you to find the differences. I explain this to you because eventually, or rather earlier rather than later, you should learn to use a version control system for any work that you do for any course or any lab. It is extremely useful. CVS is what we use mostly here, right? CVS is loaded onto our machines? Concurrent Versions System, no? So which version control do people use? CVS only.
Finally, archiving and extracting using tar. It is generally used for backup. I told you it is like zip. tar actually stands for tape archive. It is a very old nomenclature; hardly anybody uses tape these days. Usually you can write a tar file onto a tape, and many operational organizations actually use the tar command for tape archiving. But an output file can also be created, which typically has the extension .tar.gz. You know what gz is? GNU zip. So gz is an extension, and you can give either a file or a directory. These are the various options: c will mean create an archive, x means extract files from an archive, f is the name of the archive. Unlike zip and unzip, you do not have tar and untar. The command is tar; you can use it to create a tar and you can use it to untar also. So -x will mean extract, -c will mean create, and -z is to make the file .tar.gz. So this is the z option: tar is the tar that you use, with GNU zip. .tar.gz, by the way, is a gzip format which can be unzipped or untarred even in a Microsoft environment or any other environment. So here are some commands. Basically there is a lab exercise, and you can actually experiment with that. So this is for creating an archive, this is for extracting from the archive, and this is just to view the contents of an archive without extracting, when you just want to see what is there in the archive.
Then these are simple filters, head and tail. head displays the top of the file; the default is 10 lines. tail displays the last 10 lines. You can give the option -n, so it will display that many lines. You would typically use this if you have a large number of data files and you
do not remember what each name represents, or if you have names with dates and all and you want to just see what is there. You can use head or tail.
cut is an interesting command. It assumes that you have a text file where data is put in multiple columns. You will get these kinds of things typically as a saved version of a spreadsheet file, for example, which has columns, or you could also create them yourself. But this can actually cut out columns. So for example, if you say cut -c with positions such as 1, 2, 3, 4 on data.txt, the result also depends upon what you have in the file. I think when this slide was prepared there was some data file there, so you can actually understand this only in the context of that data file. But you can define a delimiter. For example, the columns in the file may be delimited by space, may be delimited by tab, may be delimited by comma. Comma may be the most often used, because all of you are familiar with spreadsheets, either Microsoft Excel or any other spreadsheet. You can always export data from a spreadsheet in a popular format called CSV, or comma separated values. So if the spreadsheet has five columns, each row will contain five values, each value separated by a comma. The comma separated value, CSV, type of file is very often used as an input file for a variety of processing. I will show you an example in one of the subsequent lectures here, where I will introduce a utility called awk. So how do you handle the comma separated file? Utilities like cut will give you this. Of course the output here does not make sense unless you see the actual data file. That is why it will be interesting, whenever I give you the printout of these slides, to look at the output of cut and create an input file which will produce this output. That will be an interesting exercise.
sort is a regular sort command. It is part of the Unix operating system. When you say sort, it will sort in ASCII order: special characters first, then numerals, then uppercase and then lowercase. So the usage is just sort filename, but you can also define the column number; you can also
define a delimiter, so you can actually sort comma-delimited data also with this.
paste puts the contents of files side by side. So it is very, very interesting. Basically it is not concatenation of files one after another, but concatenation line by line across the files.
grep is a very powerful command. It searches for a string or a pattern in a file. The pattern is specified as a regular expression. All of you are familiar with regular expressions? Not a very enthusiastic yes. That is a term which you have come across somewhere or other in your computer science or IT course; even for GATE you would have studied regular expressions. So you can specify a regular expression and grep can search for it in the file. Okay. So if this is the file, file3.txt, it has three lines. With grep session file3.txt you get the line reading today is linux command and so on; basically it prints out every line which contains that regular expression.
Now about scripts; not the script command, but scripts. As somebody correctly pointed out, we would like to execute a whole set of commands, which is called a shell script. It is not uncommon to give the extension .sh to such a file. So myprog.sh will not be a C program or a C++ or Java program, but it would be a set of shell commands given to do some meaningful activity. I have tried to give a small exercise where you could write two or three shell commands like this in a file and then execute that file.
There are some more commands here which will tell you about disk space usage. ps will give you a snapshot of the current processes. Remember I told you that the process is the basic unit which executes in a Unix environment, and when your program is executed, it also executes as a process. Each process has a number, and each process will have additional information like how much CPU time it has taken, how much of this or that, whatever. So once you identify a process number, there are other utilities which will tell you more about that process. Okay. So ps is essentially a command which will give you a snapshot of the
current processes. If you say ps -u with a user name, all processes run by that user will be given. If you say ps -ef, it will list all the processes that are running there.
These are the add-user and delete-user commands. Again, ordinary mortals like you and me will not be able to use them unless we own the computer. The sysadmins, like our friend Srisas, would be doing this routinely.
df will tell you the disk usage of the file system. So for the various devices which are mounted at various mount points in the file system, it will generally tell you the size, the amount used, the amount available, the percentage used and where it is mounted. -h is actually a human-readable format; otherwise you get blocks and so on. It is a useful idea to occasionally say df -h, and if you see any percentage here shooting beyond 70 or 80, you should be worried about your disk space. It is exactly like the properties that you see in a Microsoft environment for any drive or any directory, for that matter.
fdisk is for the creation and manipulation of partition tables. We will not go into the details of this; you can look at the man page. But fdisk -l is useful: it will list the partition tables. Incidentally, you cannot actually execute the fdisk command to create and manipulate partition tables, because that requires privileged access.
The network configuration of your machine can be quickly looked at by using the ifconfig command. So it will, for example, tell you which are the links, which are the addresses, broadcast addresses, subnet addresses, masks etc. All of you are familiar with networks, right? So you can see these are the details of the network that you can find out.
Okay, so to conclude, what I have attempted here is to give you an overview of the Unix operating system, how the commands work, and some of the important and useful commands. The assignment that you have is essentially an indicator. It is not like a lab assignment of the kind you are familiar with, where you have to submit something. You are now all postgraduate students, so
one expects that you will work responsibly to maximize your learning. That is what is expected from the first lab. For the second lab you actually have to make a submission. That I will announce during the lecture from 2 to 3, where you will be doing some C programming. Since the Moodle course might not have been created by that time, I will give you an email ID to which you have to send a mail containing exactly one tar file. So that will be the assignment submission, and that tar file may contain a variety of other things. The assignment itself, if there is no Moodle in the afternoon, will be shipped to your email ID as exactly one tar file. So you take that tar file, untar it using the tar command in a directory, work on the C programs and modify whatever is needed. That is the way we will operate. Is that all right?
Any quick questions? Because you will require at least 10 minutes to port yourself from point A to point B. You are all familiar with the old software lab; that is where you went yesterday. There are two software labs, so OSL is where you will go now. Any questions, observations?
I have an observation. I came 3 minutes late and nobody shouted at me. 4? 5? It depends on which watch we use. By this watch, okay: by the time I was at that door I think it was 3 or 3.5 minutes, by the time I reached here it was 4, and by the time I started, 5. In fact, sorry for that. It will not happen again. Okay, bye.
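For the submission workflow described above (exactly one tar file going out, and one tar file to be untarred in a working directory), a round trip with tar looks roughly like this. It is a minimal sketch, assuming a POSIX shell, mktemp, and a tar that supports the z option; the names lab2, prog.c, README.txt and unpacked are hypothetical and do not come from the actual assignment.

```shell
#!/bin/sh
# Create, list and extract a .tar.gz archive.
# lab2 and the files inside it are made-up stand-ins for real lab work.
set -eu

workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir"

mkdir lab2
printf 'int main(void) { return 0; }\n' > lab2/prog.c
printf 'notes for the TA\n'             > lab2/README.txt

tar -czf lab2.tar.gz lab2          # c = create, z = gzip, f = archive name
listing=$(tar -tzf lab2.tar.gz)    # t = list contents without extracting
printf '%s\n' "$listing"

mkdir unpacked
tar -xzf lab2.tar.gz -C unpacked   # x = extract; -C chooses where to unpack
restored=$(cat unpacked/lab2/README.txt)
printf 'restored README: %s\n' "$restored"
```

Listing the archive with t before mailing it is a cheap check that the single file really contains everything the submission needs.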