First, write programs that do one thing and do it well. Second, write programs to work together. And how do they work together? That's the third principle: write programs to handle text streams, so your programs pass text along to whatever program uses it. This is basically the foundation of all the Unix tools that we're going to be using today.

So, first, I'm going to introduce you to shell and scripting. So, what is a shell? Well, a shell, as you can see, is an efficient, textual interface to your computer. It actually provides an interactive programming language. You also call this scripting. So you'll see how to make a script.

There are actually many shells you can choose from. I'm just using the default one, which is bash, the Bourne Again Shell. There are also many different kinds of shells. For example, someone made a shell based on the C language, and they called it C shell, csh. People also made so-called better shells like fish or zsh or ksh. For example, if you're using macOS, you're probably not actually using fish. But I'm going to be teaching you the most common, ubiquitous kind of shell, which is bash. Every single Unix machine that you have most likely has bash on it, almost definitely. So this is why I focus on it in this workshop.

So, first of all, we have this thing called the shell prompt. As you can see on the right-hand side, I opened my terminal, which runs my shell, and I'm greeted with this prompt. For my own personal use, I've actually customised it, so it looks a bit different. You can see it's actually two lines. Okay, let me zoom in a bit more. So it's actually two lines, right? And if I go to my Linux instance, you would see that it's actually very bare bones.
It's just one line of prompt. And you can actually change this. You can read up online on how to change it, but the most basic thing to know is that the prompt is controlled by a variable called PS1. So if I want to, I can always just set it to whatever I want. Say I want it to be this sign. If you've used Node, you'd know how this looks. It just changes the prompt to that. And if I want to reset it, I would just reload my... it's just going to take a long time. But yeah, that's the prompt. Basically, the prompt is where you type things in.

And there are actually many different commands. The most important one, I feel, is man. man is short for manual. For all these things, you'll see that they're abbreviations, because programmers don't like to type long things. They would rather type something short and still be able to see what they're doing. man is used to check the manual pages of different commands, so using man you can learn about all the commands that there are, and there are many important ones besides man.

For example, there's cd. In this data directory, you know that there's a folder called tmp, right? If I want to change the directory I'm working in to that directory, I use cd, for change directory. So I just do cd tmp, and now you can see that I'm inside tmp. In a Unix environment there are always two special directories, dot and dot-dot. Dot-dot is used when you want to move up, and dot is used whenever you want to refer to the current directory. So if I type cd dot, that's the current directory, right? Dot means the current directory.
So if I press Enter, nothing happens, because I'm changing directory to the current directory. But say I want to move up. Right now I'm in data/tmp, but I want to move to just data. Then I can use cd with double dots, and I move up to just data. If I want to move up again, I can just do another cd dot-dot. If I want to move back into data, just cd data.

And in case you're not familiar with cd, you can do man cd. It's actually a builtin. Builtins are commands that are built into the shell itself. For builtins, whenever you do man something and you land on the builtin page, it's more helpful to run help and then the command, so help cd. Then it actually tells you what it does: change the shell working directory, change the current directory to DIR, and so on. So you can read up on what options you can use; it gives you all the options.

There's also ls. ls is short for list, so it lists the files in the current directory. You can see in this directory I have log and tmp, and they're coloured a bit differently: log is a file and tmp is a directory. Colouring the output is one of the options that I use.

There's mkdir to make a directory. So if I do mkdir hello and then I list, now you see there's hello as a directory as well. Can you all see? So there's a new directory called hello.

And then there's rm, to remove files and directories. I'm just going to make a file. So I have a file called a, and I want to remove a. I can just do rm a and it removes it. If I list the files now, a isn't there anymore.

All of this looks very simple right now. You could also do it with a file explorer, using Finder or whatever. But later on you'll find out why this is so powerful, because you can actually do more: you can delete different kinds of files based on whatever you type. There's also cp, to copy files.
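As a recap of the commands so far, here is a minimal sketch that runs them in a throwaway directory. The scratch directory from mktemp -d and the use of touch to make the empty file are assumptions for illustration; the talk does not say how the file a was created.

```shell
# Work in a throwaway directory so nothing real is touched.
# (mktemp -d and the variable names are illustrative, not from the talk.)
scratch=$(mktemp -d)
cd "$scratch"

mkdir tmp                  # make a directory called tmp
cd tmp                     # move into it
cd .                       # "." is the current directory, so nothing changes
here=$(basename "$PWD")    # still tmp
cd ..                      # ".." is the parent directory, so this moves back up

touch a                    # create an empty file called a (assumed way of making it)
mkdir hello                # make a directory called hello
before=$(ls)               # a, hello and tmp
rm a                       # remove the file a
after=$(ls)                # only hello and tmp are left

echo "$here"
echo "$after"
```

Running it should leave only the two directories in the listing, which matches what the file explorer comparison in the talk describes, just typed instead of clicked.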
So, for example, I have this log file. I want to copy the log file into something else called, say, log.backup. So cp log log.backup, and I press Enter. If I list the files again, now I have two files: one is called log and one is called log.backup.

If I want to move it, say I want to move log.backup into the hello directory, I can just use mv, for move: mv log.backup hello. Now you see that log.backup is gone from here. And if I change into the directory and list, you see log.backup is inside.

But then if you move back up and you want to delete this hello directory, and you just try to rm it, you get this error message: hello is a directory. hello is a directory, so you can't remove it. This is a kind of safety feature. If you delete a directory, everything inside is deleted too. That may not be what you want; you might be mistyping something. So in order to delete a directory, you have to add something else, -r, for recursive. Just do rm -r hello, and when you press Enter it actually deletes the directory. So when you just do rm, you can be sure that it only deletes files. It won't delete directories unless you specify -r.

Let me make the directory hello again. So now we have hello. We have this mv for move, right? But do you realise that when you rename a file, it's kind of like moving the file from one name to another? So you can rename files using the mv command. If I do mv hello world, for example, and then list the files, you can see that hello has changed its name to world, because renaming is a kind of moving as well.

Any questions so far? Oh, I'll go into it later on, but basically f means force. With -r, it might ask you questions like are you sure you want to delete this, but if you add -f, it just forcefully deletes everything.
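The copy, move, delete and rename steps above can be sketched as one short session. This is a sketch under assumptions: the scratch directory, the empty log file made with touch, and the message printed when rm refuses are all illustrative and not from the talk.

```shell
# Set up a scratch area with a log file and a hello directory (illustrative setup).
scratch=$(mktemp -d)
cd "$scratch"
touch log
mkdir hello

cp log log.backup          # copy log to a new file called log.backup
mv log.backup hello        # move the copy into the hello directory
inside=$(ls hello)         # hello now contains log.backup

# Plain rm refuses to delete a directory and exits with a failure status.
rm hello 2>/dev/null || echo "rm alone will not remove a directory"
kept=$(ls)                 # hello is still there, next to log

rm -r hello                # -r removes the directory and everything inside it
mkdir hello                # make the directory again
mv hello world             # renaming is just moving to a new name
final=$(ls)                # log and world

echo "$inside"
echo "$final"

# Clean up the scratch area.
cd / && rm -r "$scratch"
```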
You shouldn't need to... even if you don't have write access to the desktop, it shouldn't give you that error. Can you copy to /tmp instead? Copy log to dash tmp, sorry, slash tmp. That's for temporary files. Okay. Yeah, I think somehow you don't have write access to your desktop. That's really weird, but yeah, just use another directory to copy it to.

Any other questions? Everyone clear? Don't be shy, just ask away, because I know this topic is quite advanced. It's not easy to grasp at first.

Okay, so moving on. Bash has shortcuts based on Emacs key bindings. Emacs is an editor, but if you don't know Emacs that's fine, because I'm just listing the different shortcuts here. They're quite useful. Say you type some really long command, right? You've typed all the way to here and then, oh no, I need to go back and change this "kaj" into "aaj". Usually people would hold the arrow key for a very long time. What you can do instead is press Ctrl-A, and it jumps straight to the beginning of the line. From there, if you want to move forward one word: if you're on Linux, you can press Ctrl-Left and Ctrl-Right, so you don't have to use Alt-B and Alt-F. If you're on Mac, I think you can just do Alt-Left and Alt-Right, and that should move you back or forward one word.

And then this whole word, "kajshdkjh", I want to delete that whole thing. What I can do is move one word over, to here, and use Ctrl-W. Ctrl-W means delete the word before the cursor. Or say I want to delete this one instead: I just move there and press Ctrl-W, and it deletes that whole thing. It's just gone.
There are other useful ones too. For example, you can delete from the cursor to the start of the line with Ctrl-U. From here, if I press Ctrl-U, it removes everything from the cursor to the beginning of the line. And say I have something at the beginning of the line and I want to delete from there to the end of the line: I can use Ctrl-K instead, and it deletes it. So these are all the different shortcuts that you can use to make your life better in the shell, because you can't just select things with the mouse and delete them, so this is how you actually do it. Basically, the way the shell is built, your hands should never leave the keyboard, because moving to the mouse takes time and you want to minimise that as much as possible.

Then there are also the common control shortcuts. These might not be something you understand immediately, but as we go on you'll learn more. One very useful one is Ctrl-L. If you noticed, I've been clearing my screen a lot; suddenly the screen just goes blank like that. The way to do it is Ctrl-L. It clears the screen and just shows you the prompt. The rest I think I'll show you later on: there's Ctrl-C, Ctrl-Z. Those are for controlling a running command. Whenever you type a command and run it, and you want to do something to the running command, those are what you use. Which one?
Ctrl-S is for when you have really, really long output. For example, just for the sake of it, I run some command that outputs something very long. It's very long, right? And you don't want to keep on seeing this trippy thing. So you can press Ctrl-S and it stops the output; press Ctrl-Q and it continues the output; Ctrl-C to stop the command. So if you run something and suddenly it outputs a ton of things, you can actually stop it.

What was the command to print this? I just print random stuff: cat /dev/urandom. Oh no, no, it's on every Unix computer. You can use it to generate random bytes. Yeah, there's also /dev/zero. It just keeps on giving you null characters, so it doesn't print anything. And if you want to keep on typing yes, you can just use yes, and it gives you y continuously. Ctrl-C, as it says here, terminates the command. Yes, it jumps. Yeah, it jumps because otherwise your shell has to buffer an unlimited amount of text, and it can't get rid of it. Yeah. Okay, any other questions?
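In the talk these endless commands are stopped by hand with Ctrl-C. In a script there is nobody to press keys, so the sketch below takes a fixed-size slice of each stream instead. The use of head, wc, tr and the pipe character are assumptions added for illustration; none of them has been introduced in the talk at this point.

```shell
# yes prints y forever; head -n 3 keeps the first three lines,
# and yes exits once head stops reading from it.
ys=$(yes | head -n 3)

# /dev/zero and /dev/urandom never run out, so take just 16 bytes of each.
# wc -c counts the bytes; tr strips the padding some systems put around the count.
zero_bytes=$(head -c 16 /dev/zero | wc -c | tr -d ' ')
random_bytes=$(head -c 16 /dev/urandom | wc -c | tr -d ' ')

echo "$ys"
echo "$zero_bytes $random_bytes"
```

Only the byte counts are printed for the two device files, because null bytes print nothing and random bytes tend to garble the terminal, which is the trippy output seen in the demo.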
Okay, great. So now I'm going to go into scripting. Right now you can already type commands directly into the shell, right? ls, cd and so on, those are commands. But sometimes you want to put them in a file that you can just run, so that it does several steps at once.

So you can open an editor. If you're new to all these shell things, I recommend nano for now, because it's quite self-explanatory. If you run nano, you can see that to exit you press Ctrl-X; it actually tells you what to press to get to each menu. So once you've opened nano, you can type this script. Just follow exactly: hash, bang, /bin/sh on the first line, then echo something. Once you're done, in nano you press Ctrl-X, and it asks whether to save the modified buffer. Answer yes, so press Y. Then for the file name to write, let's say the name is example-script, so example dash script. Press Enter, and it quits, having saved the file. Now if you ls, you can see example-script is a new file there. Okay, everyone following?

But right now you can't run the script yet. In Unix, like the permission denied you encountered just now, you have different permissions: you can read, you can write, or you can execute. This example-script you just created, you can read it and you can write it, but you can't execute it unless you specify that you want it to be executable. The way to do that is chmod: you change the mode of the file to be executable. Plus x means you want to make this file executable; if you want to make it not executable, you do minus x. Then you give the file name. And a quick tip: you can type ex and press Tab, and the shell will complete it for you. So I do chmod +x, then type e, press Tab, and it completes the name for you. Then press Enter. It shouldn't really show you
anything, but now that file is executable. If you ls and your shell shows colours, it should change colour to show that it's executable. And the way to run your script is dot slash and then the name. What does dot mean again? The current directory. So you're saying that in this current directory I have this script called example-script and I want to run it. If you press Enter, you should see the word something come out.

Say again? Oh, you can ask me later. I have my dotfiles, as in, there's a configuration file that you can change to make things more colourful. If your ls doesn't show colours, you can do ls dash big G, and then it should show colour. Okay, yeah, because I think on Linux ls shows colour by default, but not on Mac. So if your Mac doesn't show colour, you can use ls -G. It must be a big G; Unix is case sensitive, so whether you use big G or small g actually matters. So can everyone get something to come out?

So what's happening here is that when you do echo something, whatever comes after echo is printed to the screen. If, for example, you do echo hello world, that gets printed too. But you need to be careful, because some characters mean something different to the shell, so be careful not to use those characters. For example, something like echo hello with an exclamation mark. Actually, that works fine here, but in other commands it might not. So be careful with punctuation; different punctuation might mean different things to the shell.

You'll also notice the first line, where you have the hash, bang, then /bin/sh. That is called the shebang. No one knows why it's called shebang; some people say that if you slur hash bang it becomes shebang. But basically what it does is specify the interpreter, so when your shell runs this file it knows how to
run it. In this case you use sh, for shell. If you're running a Python script, you can do this too. For example, let's use test.py. I can do something similar, but instead I use hash, bang, /usr/bin/env python. Say I just print hello, something like this. If you don't know Python, just know that this prints hello. Then, same thing, chmod +x test.py, and now I can run it directly. So that line just specifies what's supposed to run the file. If I change it to /bin/sh instead, it shouldn't work. If I run test.py, it just gives an error. Why? Because it's actually being run by the shell directly. If I type the same thing into my shell directly, you see I get the exact same error: syntax error near unexpected token, blah blah. Any questions so far?

So using a script, you can just put arbitrary commands inside, and your shell will execute them.

Next: you've been seeing all this dash business, right? You have a command and then dash G, dash S, dash r, dash f. These things are called flags. Flags are like parameters. You want to run a program with some options; you don't want just the default behaviour, you want to change it. For example, rm doesn't delete directories, and you want to make it delete directories. What you use is flags. A flag is always prefixed with a dash, and there are two kinds: the short form and the long form. The short form has one dash, and the long form has a double dash. Usually a long form has an equivalent short form. A short form can also stand for a combination of long forms. Basically, some flags are used so much that people made them shorter. For example, to get help, instead of typing dash dash help you can just do dash h. So it saves time. For example,
with rm, say I want to get help: I can just do rm -h, and it gives a very quick overview: unlink file. If you want more detailed information, you can always do man rm, and it tells you more specifically what it does. You'll also see later on... okay, I'm using a Mac, and you see the header says BSD General Commands Manual. macOS is equivalent to a BSD here; all its commands are actually the BSD ones, and they might run a bit differently from Linux. On Linux, if I do the same thing, man rm, you see that the top just says User Commands. This is a Linux man page, and you'll see that they're slightly different. One says rm, remove files or directories; the other says rm, unlink, remove directory entries.

Yes? Oh yeah, rm is a pretty old program, so I think it doesn't take -h. Yeah, it doesn't. It's a general convention: most programs take -h to mean help. If I do ruby -h, it tells me all this help, right? If I do python -h, it does the same thing. But some of the older programs use -h to mean something else. In general, whenever you're in doubt with a new program and don't know how to use it, just run it with -h at the end. It will usually tell you how to use the program.

And if you use multiple short flags, you can actually combine them. These flags are already short, but you can make them even shorter by just smashing them together. So instead of doing rm -r -f, you can do rm -rf. The two are equivalent. If you're wondering what -r and -f mean, you can go to the rm man page and scroll down here. You see -f: attempt to remove the files without prompting for confirmation. So -f means force; it won't ask you for confirmation before deleting the files. And -r: remember, big R and small r are usually different, but in this case they say
that big R and small r are the same: the small -r is equivalent to the big -R, and they both attempt to remove the file hierarchy rooted in each file argument. All that's saying is, if you have a directory, there's a hierarchy; you have things underneath it. If you use -r, it attempts to remove all those files too. That's why you need -r to remove a directory. When you combine short flags, you can use any order, so rm -rf and rm -fr are the same. It doesn't matter.

A double dash on its own is used to signify the end of command options. Say you want to create a file. To create a file you use this command, touch. It has many uses, but one use of touch is to create a file. The man page says change file access, blah blah blah, but then you see: if a file does not exist, it is created with default permissions. So that's why we can use touch to create a file. But say you want to create a file called -v or -f. You see, -f is already a flag that touch takes. So how do you do it? You use this double dash, which tells the program that whatever comes after it is not a flag anymore. That's all it does. If I do touch -f directly, you see it just fails, because -f is supposed to mean something else. But if I do touch -- -f and then list the files, now you have a file called -f. It's the same thing if you want to remove that file: you can't just rm -f, it wouldn't do anything. You have to do rm -- -f. So whatever comes after the dash dash is not a flag anymore. Are you all okay?

There's also this thing called grep, g-r-e-p. I won't really go through it right now, but basically it's the same thing.
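The two flag behaviours just described, combining short flags and ending the options with a double dash, can be tried safely in a scratch directory. This is a minimal sketch; the scratch directory and the variable names are assumptions for illustration, while the commands themselves are the ones from the talk.

```shell
# Work in a throwaway directory (illustrative setup, not from the talk).
scratch=$(mktemp -d)
cd "$scratch"

mkdir hello
touch hello/a
rm -rf hello               # same as rm -r -f hello, and the same as rm -fr hello
after_rm=$(ls)             # empty: hello and its contents are gone

touch -- -f                # everything after -- is a name, so this creates a file called -f
created=$(ls)              # the listing shows -f
rm -f                      # here -f is read as the force flag, so the file survives
survived=$(ls)             # still -f
rm -- -f                   # now -f is treated as a file name and removed
remaining=$(ls)            # empty again

echo "created: $created"
echo "left after rm -- -f: $remaining"

# Clean up the scratch area.
cd / && rm -r "$scratch"
```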
If you want to pass something that looks like a flag directly to the program, not as a flag, then use dash dash.

And as I mentioned previously, some of the flags are actually quite standard. -a refers to all files. For example, if I just ls... okay, this is not a good one. Say I ls here: it shows these files, but there's actually another file here whose name begins with a period. By convention in Unix, a leading dot means it's a hidden file; it shouldn't show up in a normal listing. So if I touch .findme, for example, I create a file called .findme, right? But if I ls, can you find that file? No, right? So -a usually means show everything, including the hidden files. If I do ls -a, you can see all the things beginning with a dot, including dot for the current directory and dot-dot for the directory above. You can see .git, and of course the one we made just now, .findme.

Then -f usually means forcing something. With rm -f, it won't even ask you for confirmation; it just silently deletes the files. -h, for most commands, displays the help. -v enables verbose output. A very easy example is curl. curl is a command to download files. If I do curl -v google.com, it shows me this, right? Oops. If I don't use -v, it just shows what it gets back from google.com, but if I add -v, it gives more verbose output, in this case additional information about what it's doing. So most commands take small -v to mean be more verbose, show more things. There's also another one, but this time it's a big V, and that means version. If I do curl -V, whatever I put after it doesn't matter, because -V just means look for
the version number. So now I know that I'm using curl version 7.54. You can do that to bash too... oops, okay, apparently you can't do it to bash. bash is quite old. But you can do it to other things. python -V... Python shows your version by default, so it doesn't matter. But ruby -v shows you that my version is 2.6.0, and node -v shows you the version too. So it's a bit inconsistent between programs whether it's big V or small v, but if a program uses small -v to mean verbose, it will use big -V to mean version. The safest thing is always to look at the man page. All these, -v or -h, are just a quick way to look for help messages. Any questions?

Okay, so now that we know how to make a script, I'm going to talk about shell syntax. Like I said just now, the shell is actually a programming language, so I'm going to teach you this programming language. If you already know programming, if you've used another language before, I'm going to try to translate some of those concepts into how you would do things in bash instead.

First of all, to run a command, you just type the command and give the arguments afterwards. For example, echo hello just means I want to run this command called echo. And if I ask what echo does, it writes its arguments to the standard output. It just outputs, just prints; that's what echo is. So if I do echo hello, it takes hello as the argument and prints hello.

And there are variables. For now, PS1: PS1 stores whatever your prompt is. So if I set my PS1 to this, it changes the prompt. The prompt is the part before where you type your command. It can be multiple lines; for example, the prompt on my local computer is actually two lines. But by default it's just one line, and it shows your
username, your computer name, and after that the directory you're located in. Then this last character can be either a pound, a hash, or a dollar. Hash means you're a superuser and dollar means you're just a normal user. So here if I do su, oops, then I get a pound as well.

And the way to store a variable is to just write the name of the variable. Say you want to name a variable name, and you want to fill it with Julius. I just do this: name=Julius. You must make sure the equals sign is right next to the name; I don't think you can put spaces in between. So now I've set name to be Julius, and to access it you need to put a dollar in front, so $name. I can't just type $name directly, because that would try to run a command called Julius, since $name is now filled with Julius. So what I do is echo $name, and then I get Julius.

The neat thing about variables is that if you just put the variable there directly, the shell substitutes the variable with whatever its content is. For example, if I do name=ls, then name is now ls. So what happens if I run $name directly? It runs ls. So be careful with variables: in other languages you cannot run a variable, but in the shell you can. It also means you can do some neat things: you can put commands into variables and just evaluate the variable to run the command. Any questions?

Okay, there are also some special variables. There's dollar question mark, which gives you the exit code of the previous command. Some of you know that in C, at the end of your main function you have to return a number, return 1 or return 0. That's called a status code. Programs use this status code to indicate to the caller whether they succeeded or failed. By
convention, 0 means you were successful and anything that's not 0 means you failed. So let's say I do ls; it should be all right, right? If I echo, sorry, dollar question mark, I should get 0. But say I do something that's not successful; for example, let's rm a non-existent file. There isn't a file called this, right? If I press Enter, it says cannot remove the file. And if I look at the exit code: 1. So basically anything that's not 0 is a failure.

You also have $1 to $9, which are the arguments to a script. Let's open your example script from just now. You have this, right? Say instead of this you do echo $0, echo $1, echo $2, echo $3. So what's $0? According to the slide, it's the name of the script itself, and $1 to $9 are the arguments to the script. So I save this. Now, what do you think will happen if I just run it? What would it print? The script name followed by three blank lines, because you don't have any arguments. But say I run it with a b c: then it prints the script name, then a, b, c. So $1 to $9 are filled with the arguments, and the arguments are separated by spaces.

There's also dollar pound, $#. If I change my example script to just print $#, it prints how many arguments I passed. In this case, how many arguments did I pass? Zero, so it should print 0. Say I pass a, then it should print 1. If I pass a b c d e f, it should print 6. So using this you know how many arguments were actually passed to your script. Using arguments you could, for example, create a script to delete files: take an argument and run rm $1. So you can pass information to your script using arguments, and that's basically what other commands do as well. This is what we did, right? Any questions so far? Don't be shy; if you have any questions, just ask away.

So now there are loops. How many of you know what's a
loop? What's a loop? Basically, when you want to repeat the same thing several times, that's a loop. Okay, how many of you just found out the meaning of loop for the very first time here? Okay, then I guess everyone is quite familiar with the concept.

In the shell you can do loops. For example, here I want to repeat echo hello five times. Instead of typing echo hello, echo hello, echo hello, five times, I can do this instead: for i in $(seq 1 5); do echo hello; done. If I press Enter, it should print hello five times. It seems quite magical right now, right? Like, what's happening? So let's unpack this.

First of all, we have this semicolon. A semicolon is the same as a new line. Where your script would use a new line, you can do it all on one line and use a semicolon. If you're typing directly into the shell, you can't exactly write a new line to give another command, so you use a semicolon and carry on.

So what does a for loop do? This is the format: for x in, then you give a list, then do, a body, and then done. It splits the list, assigns each item to x in turn, and runs the body with x containing each item in the list. So here, if instead of echo hello I do echo $i, because each element of the list is assigned to i, I should get 1 to 5. What it does is split by whitespace. Whitespace means a space, a new line, or a tab. We'll get into that later, because it can be a problem. And compared to C or JavaScript or Java, where you've learned that curly braces indicate the beginning and end of a block, here we use do and done instead: you begin with do and you end with done. Yes? seq is sequence, yes. Yeah, I'll get into it later. Yep. So say
instead of just printing the number, you also want to echo hello after the number. You can just add more things inside your body, and now it prints 1, hello, 2, hello, 3, hello, and so on. The body is demarcated by do and done.

And there's this seq, the part that you asked about. seq is actually an external program, so you can run man seq: it prints a sequence of numbers. The arguments are where you want to start, the increment, and then the last number. If you don't provide the start or the increment, by default they are 1. If I just run seq 1 5 by itself, it prints 1 to 5 directly. If I want to print 1 to 20, I can just change it to seq 1 20. But if I provide three inputs instead, then the part in the middle is the increment. So say I only want the odd numbers between 1 and 20; then I can do seq 1 2 20, and I will only get the odd numbers because I increase by 2 every time.

And you notice that this seq 1 5 is surrounded by $( ). This substitutes the output of the program inside into the command line. So if I run seq 1 5 I get 1 to 5, right? What the shell is doing is replacing whatever is inside the $( ) with the output of that command. So when I write $(seq 1 5), and you know the output is 1 2 3 4 5, it's equivalent to just writing 1 2 3 4 5 there; both are equivalent, because the output of seq 1 5 is substituted into the command itself. So if I do this, you should get the same thing.

And echo hello just means it would echo hello. Everything in a shell script is a command, basically: as long as it isn't one of the special words like for (and later on you will learn if), the rest are all commands; they are
always regarded as a command. The way the shell searches for commands is by using the path; it's not too important, but you can read about the $PATH variable if you want to. Any questions so far? Anything?

This list doesn't only take numbers. You can also do something like qwe asd zxc, and the loop will run the body for whatever is inside the list. It can be letters, it can be numbers; it doesn't really matter.

The next thing I'm going to go through is conditionals. Conditionals are basically for branching, right? You check whether something is true or not; if it's true you do something, and if it's not you do something else. So let's do this: if test -d /bin; then echo true; else echo false; fi. And you should get true. You may be kind of confused about what's going on here; it just prints true. If I change /bin into something else, like /asd, then it prints false. So you should get the big picture.

Basically you have this if statement. An if takes a condition and a body at the very least. In this case the condition is test -d /asd. What the shell does is run that command and look at the exit code. If it's zero, which means success, it's regarded as true, and it runs the body; otherwise it doesn't do anything, or, if you have an else block, like here where we have else echo false, it runs that. But say I don't have the else block and I just do this; then it would just not do anything.

So you can check this yourself. If I do test -d /bin and then echo the exit code, I should get zero. But say I test for some non-existent directory; there isn't any /asd, we didn't create it, right? Then when I echo the exit code it's actually 1, so the test failed. So you can use test to check for things. In this case
we're using test -d. What -d does is check whether the directory exists. If you want to see more details you can always do man test, which tells you the different options. If you look for -d, it says: file exists and is a directory. That's what we were doing just now: we want to check whether this thing exists and is a directory. If instead of -d I use -f, -f means that the file exists and is a regular file. Obviously /bin exists, but it's not a regular file, so if I do test -f /bin I should get 1, because it exists but isn't a regular file. So using this you can test for many different things, and you can perform actions according to the result.

And because this is done so often, and writing test is a bit longer than just writing square brackets, you can replace it with a square bracket. So the test from just now you can write with square brackets instead, and it's still the same thing; see, I get false here. If you run man [ with the open square bracket, it will tell you the two are the same, test and the open square bracket. So you can either run test with the expression, or you can put the expression in between the square brackets. Any questions so far?

And in this expression you can have many different things: checks about files, checking the length of a string, and other things. You can even do and and or; you can check whether file one is newer than file two. You can do a lot of different things here, and using scripts you can automate some stuff. For example, if the file is newer then you don't remove it, or you remove the older files, that kind of thing.

So now let's combine everything that we learnt together. We want to create a command like ls that only prints directories. Right now if you do ls it prints everything, every single thing that there is,
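To make the test and square-bracket forms concrete, here is a minimal sketch. The function name describe_path and the temporary directory are illustrative assumptions, not something from the workshop; the only commands used are test, [, touch, and mktemp.

```shell
#!/bin/bash
# describe_path is a hypothetical helper for this sketch: it reports what kind
# of thing a path is, using the exit code of test / [ as the if condition.
describe_path() {
    if test -d "$1"; then        # exit code 0: exists and is a directory
        echo "directory"
    elif [ -f "$1" ]; then       # [ ... ] is the same check written with brackets
        echo "regular file"
    else
        echo "neither"
    fi
}

scratch=$(mktemp -d)             # temporary directory, so nothing real is touched
touch "$scratch/notes.txt"

describe_path "$scratch"             # a directory
describe_path "$scratch/notes.txt"   # a regular file
describe_path "$scratch/asd"         # does not exist

# The same check outside an if: look at the exit code directly.
test -d "$scratch/asd"
echo "exit code: $?"                 # non-zero, because the directory is missing
```

The if statement never looks at any printed text; it only looks at whether the command in the condition exited with 0.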
like I have my files, example script and pass.py, and I have my directories, tmp and world. But let's say I want to create a script that only prints directories. So let's do the same thing: open your editor and type this. First the shebang, then for f in $(ls), do, if test -d $f, then echo dir $f, fi, done. Let's save this as lsdir, for example, and don't forget to chmod +x lsdir.

Okay, for writing these files, you might be familiar with the so-called Egyptian brackets. When you write C, for example, you can put the opening brace on the same line, or you can put the brace on a new line; it's the same thing. Here it's similar: you can put do and then on their own lines, but some people don't really like that and would rather put the do on the same line as the for, and the then on the same line as the if. The style is up to you; it works the same.

And then if you run your lsdir, you see that now it only prints the directories: you have dir tmp and you have dir world. But now there's a problem. Let's say you have a directory called my documents. Let's do that just to show you: I create a new folder called my documents, and now if I list it, you see there's a directory called my documents. So what happens if you run lsdir? Is my documents printed? What do you think is happening? Let's try running it directly: for f in $(ls); do echo $f; done. Everyone understand what's going on here? If I print it, oh no, my documents is split: it's my and documents as separate things. This is what I said just now about bash splitting by whitespace. Whitespace is any space, so whenever a file name has a space in it, it gets split into separate pieces. And this is not what we want: we want my documents to appear as one item. So bash has a solution for
this. Okay, basically this is what I'm talking about: right now the script will test my and then test documents. They don't exist, so they are not directories, and it wouldn't print them. It would just print dir tmp and dir world. There's no dir my, because my doesn't exist on its own, and documents also doesn't exist on its own.

In bash, if you want to pass a string that contains a space, you have to quote it with double quotes. So for example, if I loop over my documents without quotes, you see that they are separate. But say I want it to be one thing; what I can do is put quotation marks around it, and it will be treated as just one thing: the whole my space documents is one string that is assigned to f and then printed. So what do you think happens if, instead of doing this, I put quotation marks around the $(ls)? Yeah: it just takes everything, the whole output with all the spaces, as one very long string, so nothing is separated at all. So this wouldn't work.

The solution is this thing called globbing. In bash, whenever you want to run commands and pass file names as arguments, you don't have to type each of them; you can automate looking for files using patterns, and this pattern matching is called globbing. You have the asterisk, you have the question mark, and so on. The asterisk means any string of characters, so anything goes; it can be nothing also. For example, here we have a file called log, right? If I do ls lo*g, bash replaces the asterisk with anything: it can be nothing, it can be some other characters. So if I do this, it would just list log. But say I have another file called loag; if I do this then it would list both
loag and log. The alternative is the question mark. A question mark fits exactly one character: it has to be one, it cannot be more, it cannot be less. So if I do ls lo?g, it will only give me loag and not log. Okay, this can be confusing. Anyone here kind of confused with what's going on with the patterns? Say again? Yes.

So say I want to look for anything that begins with my. Okay, wait, am I in the right directory? Yeah, my* should work. Oh, okay: by default when you give ls a directory, it lists what's inside it rather than the name. If I do echo with the pattern instead, it should print the match. So if I do echo my*, yeah, it prints my documents. And the thing is, if I do just a star, it will match everything, correct? Because a star matches anything, from nothing to any other string; basically anything goes. Yes. So if I do echo *, I get everything. So what happens if I do for i in *; do echo $i; done? Yeah: instead of splitting the output of ls by spaces, it's bash itself that looks for the files and passes each one to i before running the body.

One more alternative: you can also match any one of a set of characters. Let's have another file called lag. So now if I do ls l?g, there's lag and log. Say I want to specifically look for lag and log. Let's create another file that also matches l?g, and then I want to find just lag and log. So I can do l, and then a and o inside square brackets, and then g: l[ao]g. This means look for either lag or log, and if I run it, that's what I get.

So in our lsdir, instead of doing for f in $(ls), we can replace the $(ls) with a star. And then if we run it again, oops, we also need to quote this, because now the name has a space in it. If you don't do that, then it would appear as if you're trying to pass two arguments. Okay, if I run test -d my documents, this is what actually gets run, right? If I do that then
it would be confused: -d should be checking only one argument, so why are you sending me two arguments? But if I do this instead, with quotes, then you're sending my documents as one argument to the test command, and it does the correct thing. So two things I need to correct: I need to change the $(ls) into a star, and I need to quote the $f here. After you do that, if you run lsdir you should get the correct thing: now you get dir my documents, dir tmp, dir world. Everyone okay? Yeah.

You can also make more advanced patterns. What do you think this means: for f in a*? Everything starting with a. How about this one: for f in foo/*.txt? Anyone? Text files, yes: any file that ends in .txt and is located inside the foo directory. How about this one, foo/*/p??.txt? It's inside the directory foo, and the name is three letters long but starts with p, so p plus two additional characters, and it ends with .txt. And the star between the slashes means it can be in any subdirectory of foo. Any characters? Yes, anything, anything at all. So if I go up here and do for f in data/*, for example, and then echo, you see I'm getting all the files that are located inside data. Yeah, here you see it's basically anything inside data. So: all three-letter text files starting with p in subdirectories of foo. Any questions?

Okay, there are still possible issues here. If I do this, if I check whether $foo is equal to bar, [ $foo = bar ], what's the issue? Do you see the issue? No, the equals sign is fine, because if you look at man test there's actually s1 = s2, so that's fine. But what could go wrong here? Remember your whitespace. What would happen if foo contains a space? Yeah. So say foo is hello world and then I check $foo = bar: it says too many arguments. Why? Because $foo is expanded to hello world, so instead of
having just three arguments, the content of foo, the equals sign, and bar, it actually has hello, world, the equals sign, and bar. One possible solution is to quote the $foo. If you quote the $foo like that, then that issue goes away. Okay, this is fine, right?

What happens if foo is empty instead? How do you think this would look when it's expanded by bash? It would look like this: [ = bar ], correct? What do you think would happen if you run this? It would just error out again, right? Because it expects to get three arguments but it's getting two: the arguments to test would just be the equals sign and bar. So there are a lot of issues with this.

There are workarounds. What you can do is put an x in front on both sides. Then whenever the variable is empty, the argument wouldn't be empty, because you always prefix it with x on both sides. But that's not really a good solution, because it's very hacky, right? People reading your script will just look at it and go, what is this x, why does this x exist?

So what people usually do is use the built-in bash test, which is the double square bracket instead of one square bracket. If you do help [[ it will tell you that it executes a conditional command. It's pretty similar to test, except that you can now use the proper operators, like && for and, || for or, and ! for not, whereas if you use the normal test you have to use -a and -o, which doesn't look as nice.

And in order to prevent this kind of thing, there's a very good tool to check for these possible bugs: it's called shellcheck.net. If you open it, you can just copy-paste your script there and it would give you the output.
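The difference between the unquoted single bracket, the quoted single bracket, and the double bracket can be seen in a short sketch. The function names is_bar_single and is_bar_double are made up for this illustration, and the sketch assumes bash, since [[ is a bash feature rather than part of plain sh.

```shell
#!/bin/bash
# is_bar_single and is_bar_double are hypothetical names used only in this sketch.

# Quoted single bracket: "$1" stays one argument even when it is empty
# or contains spaces.
is_bar_single() {
    if [ "$1" = bar ]; then echo yes; else echo no; fi
}

# Double bracket: bash evaluates this itself and does not word-split $1,
# so the variable is safe even without quotes.
is_bar_double() {
    if [[ $1 == bar ]]; then echo yes; else echo no; fi
}

for value in "bar" "hello world" ""; do
    echo "single: '$value' -> $(is_bar_single "$value")"
    echo "double: '$value' -> $(is_bar_double "$value")"
done

# The unquoted form breaks: $foo expands to two words, so [ receives
# hello, world, =, bar and complains about too many arguments.
foo="hello world"
[ $foo = bar ] 2>/dev/null       # error message discarded for this demo
echo "unquoted exit code: $?"    # non-zero, because [ reported an error
```

Both helper functions answer no for the empty string and for hello world without any error, which is the behaviour the unquoted form cannot give.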
So for example, we paste whatever we did in the beginning: for f in $(ls); do if test -d $f; then echo dir $f; fi; done. Okay, see, it gives all these warnings: you need to double quote this, and so on. You can apply the fixes and it will automatically fix them for you; you see, it auto-quotes everything. But it will still tell you this: you shouldn't iterate over ls output, you should use a glob instead, which is what we said just now, right? This one it can't automatically fix, because it doesn't know what you want. So say I change it to use the glob; then, okay, there's no shebang, so let's add a shebang. Yep: no issues detected. So this is a good, quick way to check whether the script you wrote is good or not. Someone actually wrote this tool, and if you're writing a script you should use it to prevent unwanted issues like whitespace splitting and whatnot.

Okay, so that's the first part. I'm going to give you a 10-minute break, and after this we're going to go to data wrangling. We actually have several helpers here, so if you have any questions you can just ask.

Oh no, I'm using iTerm. It doesn't matter, it's about the same, actually. Okay, it's slightly faster, faster at printing stuff, but if you're using it for very basic things then it doesn't really matter. But you can try iTerm; it's quite similar, and the learning curve is not too bad. Oh, my Linux is not on my computer; I have a server somewhere else. Yes. Not sunfire, it's my own DigitalOcean. If you have a student account, you can use the GitHub education pack and get a voucher, $50 I think, for DigitalOcean. I don't have to pay yet; you get $50 for free. Just search online for GitHub education pack; you should grab it, there's a lot of nice free stuff in it. But I'm using the machine that's $5 per month, US dollars. It's
the same as the single bracket, except that with this one bash takes care of a lot of these annoying things for you, like when your variable is empty, that kind of thing. Otherwise it's the same as test. -d checks whether it exists and is a directory; you can do man test and then search for -d. Of course, if I don't use -d, it will always return zero, even for something that doesn't exist, because test with just a non-empty string is true.

Bash is nice for quick and dirty scripts, when you just want to do something very quickly. Typically, if your script is longer than, I don't know, 50 lines, it's usually better to use other tools to do it. It's not really faster in that sense; it's faster for you to type. It's going to be slower than other things: for example, if you do string processing with bash, it's going to be slower than if you use C or something, but usually it's good enough for our purposes. It's interpreted, like JavaScript: you don't compile, you just run. Like Python: you don't compile, just run.

I use brew, Homebrew. Not vim, that's not vim. Which one?
I use vim as an editor. Then how do you...? Because I saw you edit a Python file with a shebang. I use vim; I only used nano in the beginning, and vim afterwards. You also use vim, and you put a Python shebang and run it? Yes, you just change the shebang. I can bring it up.

It's an alternative syntax: you can either write test something, or you can wrap the thing inside square brackets. It looks nicer, that's it. It's exactly the same as test; it's an integrated shell feature, an alternative syntax. You can type it that way because it looks nicer, and when you put square brackets, everyone understands that it's the same as test. That's why if you go to man [ it says that you can either use test or you can do this; the two are equivalent. The one that is not equivalent is the double square bracket, where instead of doing that you do this. That one is not an external command; it's bash itself that evaluates it, and it's nicer because if you use variables inside, it's safe. Yeah, so in a sense the single square bracket is like a test function, exactly the same as test, and the double square bracket is like a bracket, something to demarcate. Yeah, it just looks nicer, because if you write test something it looks asymmetrical, but if you do this, you know whatever is inside must be something you'd run in test.

So in bash scripting, do whitespace and indentation matter? No, it's not Python. That's why you have the do, done and so on; Python doesn't have do and done because you demarcate using indentation. If it's meant to be Python, then it will just pass this whole file to Python. It doesn't have to be vim; it doesn't matter whether you use nano or vim or whatever, because it's just a text editor. Yeah, it's just a text editor. What matters is your shebang. But I've never used a shebang
before. Yeah, the shebang just means that if you make the file executable, the system needs to know how to interpret it. So with this one I'm saying, please pass this to python; if I put /bin/sh, I'm saying, please pass this to the shell to run. That means I don't have to compile this, right? You don't compile Python, yeah. So it's usr, as in user? Yeah, user. So then I just run it? Then I can chmod? What's chmod again? Change mode: you want to make it executable, so +x. Yeah, +x, and give your file name, and then you can ./ that thing. I don't have that. What did you type? I think you have a slash where you shouldn't have one; there isn't a slash after env. Then just run it. Okay. How do you compile it? Come find me later. Yeah, you can find me later.

Okay, so let's continue again. The next part is composability. This is where Unix philosophy number 2 comes in. If you remember the Unix philosophy from just now, number 2 says you write programs to work together, and this is where the shell really shines compared to other languages. What composability means is that you run something, get the output of one program, and use it to feed the next one, and you keep on doing that, so you can chain multiple programs together. Each program can be a small program that does just one thing but does that one thing well; after it does that thing, you can pass the result on to the next program, which does another thing to the data.

So for example, let's do this: dmesg, which basically means you want the kernel log, to get the system messages, and then you use tail. So dmesg, then this thing, which is called a pipe, and then tail: dmesg | tail. Then you press enter, and it should show you the last 10 lines of the kernel log. If you're using a Mac instead, you might need to sudo it, so
you need to do sudo dmesg | tail. It's the same thing, and you get the last 10 lines. So what it means is, whenever you have A | B, it means: run A, take the output of A, feed it as the input of B, and print the output of B. And you can chain this even further: A | B | C | D, you can keep on adding pipes. That is what's called composability: the ability to compose smaller commands to do something useful.

So you can make it even longer. As an example, let's do this with the system log. You cat /var/log/sys*log, and then pipe to grep; grep means you want to match based on patterns. In this case we're just looking for March 23, so you do grep "Mar 23" (I think you need quotes here, by the way), and then you pipe to tail, and you will only see the last 10 messages whose date is March 23. If I run just the first command by itself, I get this very long output, right? If I add grep "Mar 23", then it just shows me all the lines from March 23. But I only want the last 10 lines, so I add tail. You see how I'm composing something useful out of little commands? All cat does is read a file and print the content. All grep does is read whatever is fed into it and print the lines that match whatever you asked it to find. And tail just prints the last 10 lines of whatever is fed into it. All of these working together give you what you want: the lines in this log that contain Mar 23, and only the last 10 of them. Any questions so far?

Okay, the concept here is based on something called streams in Unix. All programs in Unix, whenever you launch them, open these three
things called streams: standard in (stdin), standard out, and standard error. So what are those? Standard in is usually your keyboard; it's the input of each program, whatever it takes in. There's also standard out, which is where the program usually prints; that's basically your shell, your monitor. And there's standard error, a second output that the program can use for error messages, because sometimes you don't want to see error messages, and then you can filter out standard error so you won't see it. By default both standard out and standard error are your terminal: whenever you run something in your terminal, it outputs there. And standard in is your keyboard, so whenever you run something, it takes its input from there.

What you do when you pipe is redirect these streams. When you run A by itself, standard in is the keyboard or whatever, and it outputs to the screen because that's standard out. But with a pipe, you take the standard out of A and plug it into the standard in of B. So B, instead of taking input directly from the keyboard, takes its input from the standard out of A. You take A's output and put it into B's input, but you keep B's standard out on your screen, so you can still see everything.

That's what's happening here when I run all these commands. cat doesn't take anything from standard input, it just runs, and it outputs to standard output; but now that standard output is taken and fed into the standard input of grep. I can show you something. What happens if I run just grep "Mar 23"? It doesn't do anything. If I type Mar 23 blah blah blah, you see it outputs that Mar 23 line again. Can you see it?
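A small sketch of how the three streams behave in a pipeline. The log lines below are invented for illustration and written to a temporary file, so the sketch does not depend on any real system log.

```shell
#!/bin/bash
# A tiny stand-in for a log file; the contents are made up for this sketch.
log=$(mktemp)
printf '%s\n' \
    "Mar 22 23:59:01 host kernel: old message" \
    "Mar 23 06:55:10 host kernel: first" \
    "Mar 23 06:56:23 host sshd: second" \
    "Mar 23 06:56:24 host kernel: third" > "$log"

# cat writes the file to its standard output; the pipe plugs that into grep's
# standard input; grep's standard output is plugged into tail's standard input;
# tail's standard output is still the terminal.
cat "$log" | grep "Mar 23" | tail -n 2

# Standard error is a separate stream: the error message from ls goes to
# standard error (discarded here), so nothing travels through the pipe
# and wc counts zero lines on its standard input.
ls "$log.does-not-exist" 2>/dev/null | wc -l
```

Only standard out travels through a pipe; anything a program writes to standard error goes wherever standard error points, which by default is the terminal.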
So if I type in something that doesn't contain Mar 23, it doesn't do anything, but once I type in something that contains Mar 23, it prints that to standard output. If I type something Mar 23 something, it prints that back, but if I type something that doesn't contain Mar 23, it doesn't print it back. That's all it's doing: grep just takes in standard input and checks it against the pattern, and if a line matches, it outputs it.

And this is quite useful for other things. For example, you can redirect standard output to a file. When you run this pipeline, instead of the output going to the screen, you can redirect the output stream to a file with the greater-than sign. Say you want to name the file syslog: you add > syslog. When you press enter, nothing is printed to your screen, because now you've redirected the standard output of tail into the file syslog. If you look into syslog, it contains whatever the output was just now. It's the same thing for standard error: you can do 2>, which means redirect standard error to a file.

You can also redirect standard input to come from a file. For example, instead of using cat (because cat just takes a file and prints its content, correct?), you can do grep "Mar 23" and redirect its input from the file with the less-than sign, and then pipe to tail, and you still get the same output. With the less-than sign, it takes the file and uses it as standard input instead of your keyboard.

And you can also use three less-than signs, <<<, followed by some text. Let's say you want to search for Mar 23 in some text, but you want to type the text directly; then you can just put the text here. If the text contains Mar 23, grep prints it, but say it doesn't, then it wouldn't
print anything. All the triple input redirect does is make standard in whatever comes after the <<<. Any questions? Okay. If any of you are lost, you can just raise your hand and ask me to explain again, because this is a different concept from what you usually have.

You might ask yourself, why is this useful? The whole reason is that it lets you manipulate the output of your programs. For example, instead of just using the whole content of a file, you can look for certain patterns, or you can say, I only want the last 10 lines, or I only want the first 10. For the last 10 you can use tail; if you want the first 10, you can use head. There are many of these small commands that each do one thing well, which you can combine into a larger functionality.

For example, say I want to list files and only get those whose name contains foo: I can do ls and pipe it to grep foo. In this case I don't have anything called foo, right? But say I want to find PDF files; then I can do ls | grep pdf. Oh look, there's a file called hacker 2.0 pdf. If I'm inside my data directory, I have all these log files, and I want to search for anything that contains a g; I can just do ls | grep g and it gives me all the files whose name contains g.

Then there's ps. All ps does is list the processes that are running on your system. Say I want to search for bash: I can do ps, pipe it to grep bash, and it shows me all the bash processes that are running. So you can take the output of one program and feed it as input to another; that's what the pipe does.

If you are using Linux, you have this thing called journalctl. All it does is print the system log; you can achieve roughly the same thing just by doing cat /var/log/syslog. So from all of this, I only want to see anything by intel, anything by
intel, and it must be case insensitive, so it can be lowercase intel or uppercase Intel. In this case I don't see any; I don't know, maybe my server doesn't run on Intel. But let's say I want to look for something else, for example kernel. I can do this, and it shows me everything from the kernel. Say I only want the first five; then I can use head. The head command takes a -n flag, which is the count, how many lines you want. head displays the first lines of a file, so this is what I want: to see the first five lines I can do head -n 5. So cat gets the system log, you pass it through a pipe to grep to find only the lines that have kernel inside, and then you pipe it again to head because you only want to see the first five matches. If I do this, it only shows the first five: see, I have one, two, three, four, and five. These streams form the basis for your data wrangling, which is going to be covered after this. Any questions so far about stream redirection and pipes? You can try them on your own terminal as well and make sure that you understand what they do.

Okay, this part is about grouping commands. With grouping, you can group commands together, group their output together, before you pipe it to another thing. For example, I can do echo a; echo b; echo a. First of all, if I just do this, what do you think will come out? What will I see? a, b, a. So if I group this in parentheses and then pipe it to grep a, what do you think I will see? All the lines containing a, right? So you can group commands and collect all their output together before you pipe it, which can be quite useful.

For example, there's this command called tac. If you look at the man page, it basically prints a file in reverse. So if I echo qwe, echo asd, echo zxc, and pass that to tac, what do you think you
will see? You just see everything in reverse, compared to running it directly, where it would be QW, ASD, ZXC. This might be useful sometimes, say if you print log files and they are sorted from earlier to later times, and you want to see them from later to earlier: then you can just pass them to tac. So let's look at the log, just the last five lines. I have this, right? And you see the time is increasing: 6:55, 6:56, 6:56, 6:56:23 and so on. If I pass this to tac, I see it in decreasing order, see, 56:24, 56:23; instead of chronological it's reverse chronological. They should run cat /var/log/syslog | tail -n 5 | tac. Which one? Did you install the Xcode command line tools? You don't have tac? Oh yeah, if you're using OS X, I think tac is only on Linux. You can brew install coreutils if you're using brew. I don't know whether brew installs it by default, but if you're using a Mac then you might not have this unless you specifically install it. The concept still stands, though: you can pass the grouped output to grep, you can pass it to anything. There's also something called process substitution. This one is a bit different. You might look at this sign here and think, oh, it's an input redirection, correct? You change the standard input of a command. But it doesn't do exactly that. What it does is run whatever command is inside the parentheses, make its output available through a temporary file, and pass on the file name. To demonstrate, if I do this with echo A, what comes out is a file name, a temporary file name somewhere. You don't get the output of echo A. But say I use cat. What does cat do again? It prints files, right? So if I cat this, I
actually get A, so the file does indeed contain the output of whatever I ran inside. Okay, this next one you can't really do on a Mac, but if you use Linux you can. Say I do echo A; echo B in one and echo A; echo A in the other: one file will contain A and B, and the other will contain A and A. If you have these two files, you can look for the difference between them using this command called diff. So let's say I do this. I can see that, look, 2c2 means line 2 of the first file is changed to line 2 of the second file: one is B, one is A. So how is this useful? I can do this: diff of journalctl -b -1 piped to head -n 20, against the same thing for two boots ago. journalctl -b -1 means I want to see the boot log, and -1 means the previous boot, one boot ago. I want to diff it against the log from two boots ago, and I only want to compare the first 20 lines. Then it gives me the differences between the two outputs. It can be useful this way whenever you want to run two commands whose results are similar and you want to see what's different. The most useful case is, for example, you're doing a programming task and you have some test cases. You can run your program using process substitution, and for the expected output you just pass in the file directly, and you can see whether the output is the same or different. There's also job and process control, but I think I'll skip that one. You can read it if you want, but it's not really relevant for what we're doing right now. So I'm going to get into data wrangling, which builds straight on the composability that we used just now. Basically, it's whenever you have a bunch of text and you
want to do something with it, to get something useful out of your data. Sometimes you have a lot of data, for example in your logs. You have this log file, and it's a lot of data: there are about 9,000 lines inside. Imagine you have some error happening somewhere and you have to look through these 9,000 lines. It's not really nice, it will take a long time. So what you can do is use the shell to help you, use all these little tools to find exactly what you want. Data wrangling is just converting from one format to another format, and then to another format again, until you get exactly what you want. What we did just now, journalctl | grep -i intel, is a very basic example of data wrangling: you take one thing that outputs text and then you do something with the text. In this case we're just looking for lines that contain intel, but there are more things that you can do. So let's start from the very beginning. We need a data source and we need something to do with it. The data source is usually the first thing in a pipeline, and what you want to do with it is what comes after, all the following pipes. A good use case for this is logs, which is why I sent you the logs. I'm sure all of you have got them extracted inside your folder, so make sure that in your terminal you change directory to where you stored the logs, because we're going to do some things with them. Here on the screen you have the logs, and as I showed you, the last line number is about 9,000. You're not going to be able to read all this in five minutes or so, right? It's a lot. So say I want to figure
out who's trying to log in to my server. If you look, you will notice lines like Accepted publickey for turing and so on. So basically Turing is logging in to my server, and this one is Julius, I think, logging in. So you can see these different people logging in. Now, if I just do cat log, there are a lot of things, right? You're not going to be able to read all this, it's too much. So let's try to filter it some more. You notice that whenever there's Accepted publickey, there's sshd, correct? sshd is what's responsible for people logging in to the server, so you might want to see just the messages from sshd. How do you do that? What do I pipe this to, anyone? What's the command to filter lines? grep, and I want sshd. Do I want it to be case insensitive? No, because sshd is all lowercase and I want exactly that. So if I do that, oh look, it's way less, right? If you compare this with what we had before, you can tell that it's definitely less. But it's still a lot of data, you're not going to be able to read through all this. One way to see how many lines you have is this command wc. It's word count, but you can use options to find out other things: -l means count the number of lines. If I press enter, you see it's definitely way less than what we had just now. We had 9,000 lines, now we only have about 1,600 lines. But it's still a lot, and we want way less than this. Maybe we can filter it some more. You notice that whenever someone logs in, there's always this phrase Accepted publickey for, right? Accepted publickey for Hoare, for Julius, for whoever it is. So let's filter using grep 'Accepted publickey for' and see the number of lines: 461. Much better, right?
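As a rough sketch of the filtering steps so far, here is the same grep and wc -l chain run on a tiny made-up log. The sample lines and the sample_log helper are hypothetical stand-ins for the workshop's log file, not its real contents.

```shell
# Hypothetical sample standing in for the workshop's log file.
sample_log() {
cat <<'EOF'
Jan 10 06:55:01 server CRON[1201]: (root) CMD (run-parts /etc/cron.hourly)
Jan 10 06:56:12 server sshd[1310]: Accepted publickey for turing from 10.0.0.5 port 50122 ssh2
Jan 10 06:56:13 server sshd[1310]: pam_unix(sshd:session): session opened for user turing
Jan 10 06:56:23 server kernel: [ 12.345] eth0: link up
Jan 10 06:57:40 server sshd[1422]: Accepted publickey for julius from 10.0.0.9 port 40022 ssh2
EOF
}

# Count every line, then only sshd lines, then only successful key logins.
total=$(sample_log | wc -l)
from_sshd=$(sample_log | grep sshd | wc -l)
logins=$(sample_log | grep sshd | grep 'Accepted publickey for' | wc -l)

echo "all lines: $total, sshd lines: $from_sshd, logins: $logins"
```

On the real file you would start the pipeline from cat log instead of sample_log; each extra grep narrows the stream a little more.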
You get about one quarter of what you had just now. But there is still a lot of data there, and also a lot of noise that you don't want to see. You don't really care about the beginning part, March 21, blah blah blah, right? You care about who's logging in, which can be Julius, Turing, Curry, Newton or whoever it is. So there's a way to get rid of that noise by using this tool called sed. sed stands for stream editor; if you open the man page for sed it will tell you: stream editor. It's a stream editor that builds on top of the old ed editor. How many of you use vim, or were forced to use vim? Okay. So vim comes from vi, and vi is the visual mode of ex, which itself descends from ed. So if you know vim commands you would kind of know sed commands, because sed is based on ed commands, and ed is basically the common ancestor of both vim and sed. The most common command, the one we're going to use right now, is substitution, because from the logs we want to take all this garbage, remove it, and be left with what's useful. To do that you can run this. If I do sed 's/.*Accepted publickey for //', okay, I forgot this, Accepted publickey for, and then this. Do you notice it's a lot cleaner? This is the command that we ran just now, and you notice that all the garbage in front is gone now, right? Before, without this sed command, we had all this garbage in front, the time, the sshd, but you don't need those, you just need the name. When we do this, you see that everything in front is gone. And the way we do this is with something called regular expressions. How many of you have heard of regular expressions before? How many of you actually use regular expressions? How many of you can read regular expressions? It's kind of
a complicated tool, but it's really useful to have in your toolbox. So this is the sed command that we used just now. You see I use s, slash, something, slash, then nothing there, slash. This is the command format: s/REGEX/SUBSTITUTION/, so s, then the regular expression, then what you want to substitute it with. In this case I substitute it with nothing, but say I substitute it with hello: then every line will begin with hello, because I replaced this whole pattern up to Accepted publickey for with hello. So that's the sed command, but we still haven't gone through the regular expression part, right? If you don't know what a regular expression is, you can call it a kind of language, but it's not really a language. It's a construct for matching text against patterns. You specify some patterns, you see which text matches that pattern, and that way you can manipulate text. Usually they are surrounded by slashes. So you see here, this is a regex, and it's surrounded by the slashes. Most ASCII characters just carry their normal meaning, but some other characters, usually punctuation, have a special meaning. So here, if I write Accepted publickey for, those are just normal alphanumeric letters and they have their normal meaning. But you will see there's a dot and there's a star, an asterisk, and they mean something different. For the meaning of the special characters I have a list here, and in case you're new to regex, you can open this website and learn more about regex. Right now we're only going to go through some quite basic regex. So you see there's a dot. Dot just means any single character except newline. So it wouldn't match a newline, but it
would match anything else. You have a star, which just means zero or more of the preceding pattern. You can use a plus instead, and that means one or more. So if you look here, dot asterisk, what could that mean? Dot means any character other than newline, right? And asterisk means zero or more of the preceding pattern. So it can be nothing, it can be one character, it can be two characters, it can be many characters; it doesn't matter, as long as it's zero or more. But if you want one or more, you can use dot plus. That means any character except newline, and there must be at least one, so it cannot be nothing. There's also this square bracket thing. Square brackets mean any one of the characters inside. So if I write [abc], it means either a, b or c. And I can also write a range like [a-c], which just means any character within the range a to c inclusive. Or I can use the thing that looks like a pipe, which just means or: either RX1 or RX2, where RX1 and RX2 can themselves be regexes. Or I can put a caret, which means match only at the start of the line, or a dollar for the end of the line. We're not going to use many of those, so you can learn them yourself after this workshop, because there's really a lot to regex, and there is much more than this, like lookahead, lookbehind and other stuff. The thing is, there are actually two different kinds of regex. There's the so-called obsolete regex and there's what's called modern regex. If you use vim's regex, it's a bit weird, you have to put a backslash before many things, and this is because it uses something like the obsolete regex. But for sed, thankfully, you can pass this flag called -E. It tells sed to use the modern regex, and it's the one that is common everywhere, basically. You can use it in Java, Perl, Ruby,
Python; they all use the modern regex. If you want to explore more about this, you can go to man re_format, and it will tell you all about the obsolete regex and the modern regex. But for our purpose I'm just going to show you how to use the modern regex, so sed with capital -E, because that's much easier to read. So looking at our regex just now, we have .*Accepted publickey for. As I explained to you, it matches anything before Accepted publickey for, together with the phrase itself. The problem now is: what if your username is also Accepted publickey for? The problem is that .* is by default greedy, so called. What does greedy mean? It means that it tries to match as many characters as it can. So imagine if the username is Accepted publickey for: then you would have the phrase twice. One way to see this really easily is with a website like regexr.com, for example. So let's type that in. With this example you see that it matches against this part, right? And below you can see the meaning of everything we put in. But imagine the username is also Accepted publickey for. Okay, and after this you have some other stuff, right? So imagine that instead of root... you see that the .* matches as much as possible, all the way until the last point where it can still match Accepted publickey for. And this is the problem, because then we would lose the username. Of course, you wouldn't have someone whose name is that, but it's a possible flaw in our program. One way to solve this is to be more specific with our patterns, and also to use something called capture groups. So here we can create this very long regex. It looks very daunting, so let's look at it with a regex debugger. This is a different website from just now, but it's kind of similar.
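The greedy behaviour can be reproduced in the terminal too. This is a minimal sketch, assuming log lines shaped like the two made-up ones below; strip_prefix is a hypothetical helper name, not something from the workshop.

```shell
# Remove everything up to and including the phrase, using greedy .*
strip_prefix() {
  sed -E 's/.*Accepted publickey for //'
}

# Hypothetical lines: a normal login, and one whose "username" is the phrase itself.
normal='Jan 10 06:56:12 server sshd[1310]: Accepted publickey for root from 10.0.0.5 port 50122 ssh2'
odd='Jan 10 06:56:12 server sshd[1310]: Accepted publickey for Accepted publickey for from 10.0.0.5 port 50122 ssh2'

normal_out=$(printf '%s\n' "$normal" | strip_prefix)
odd_out=$(printf '%s\n' "$odd" | strip_prefix)

echo "normal: $normal_out"   # the username root survives
echo "odd:    $odd_out"      # greedy .* also swallowed the username
```

In the second line the .* runs to the last occurrence of the phrase, so the username disappears together with the prefix.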
So you can see here what's going on, right? You see this regex, and it shows you which part of the pattern matches which part of the text. You see, this part matches this part, and so on. See, this green thing is this part; it says first capturing group, and this is what it matched. So that's the first capturing group, this is the second capturing group with this part, and there's also the third capturing group with this part. I'll explain very briefly what the regex does. First, it's still .*, so that's the same as before: just match as much as possible. Then you look for Accepted publickey for, and after that you have .* again, which means match as much as possible until you encounter space, from, space. So now you're being very specific with your regex: you match Accepted publickey for and you also match from, so whatever comes in between Accepted publickey for and from must be the correct thing. If you look at the last line here in the example, even when my username is Accepted publickey for, it is still able to recognize that it is indeed the username, because it comes in between Accepted publickey for and from. So when you are being very specific, you can match more reliably. Any questions?
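A more specific pattern along those lines might look like this. The exact regex from the slides is not reproduced in this transcript, so the character classes used for the IP address and the port are assumptions, and extract_login is a hypothetical helper name. The replacement prints the three capture groups as \1, \2 and \3.

```shell
# Capture username, IP address and port; print them comma-separated.
# Assumed pattern: the workshop's exact regex is not in the transcript.
extract_login() {
  sed -E 's/.*Accepted publickey for (.*) from ([0-9.]+) port ([0-9]+).*/\1,\2,\3/'
}

# Hypothetical lines: a normal login, and one whose "username" is the phrase itself.
normal='Jan 10 06:56:12 server sshd[1310]: Accepted publickey for root from 10.0.0.5 port 50122 ssh2'
odd='Jan 10 06:56:12 server sshd[1310]: Accepted publickey for Accepted publickey for from 10.0.0.5 port 50122 ssh2'

normal_fields=$(printf '%s\n' "$normal" | extract_login)
odd_fields=$(printf '%s\n' "$odd" | extract_login)

echo "$normal_fields"
echo "$odd_fields"
```

Because the username has to sit between Accepted publickey for and from, the odd line keeps its username instead of losing it.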
It is a very useful thing. For example, you can go to the regex debugger and, okay, let's increase this, you can see step by step how it tries to match. See, it first tries to find the A, because the .* will just try to match as much as possible. Let's jump here. You see there's SHA, right? It tries to match the A there, but the rest doesn't fit, so it keeps on going until it finally matches this one. Once it does, it tries to go forward and match more. Oops. Okay, so it just keeps on going, and you can see what it is doing at each step. Then it looks for space, from, space. It can't find it, so it keeps on going like that, all the way until here, where it finally finds from. If you're interested, you can always check this out afterwards. Okay, wait, I need this. Okay. So instead of doing this, let's try running what we typed there. So Accepted publickey for... let me open the... yep. So basically I have this, and then I replace it with nothing. What do you think will happen? Okay, hold on, something's not right. Okay, I didn't put -E. What do you think happened when I ran this? What do I get? All these blank lines, yes, because now I'm matching against the whole line and I'm replacing the whole line with nothing, so I end up with nothing. And that's not what we want either, right? What we want now is capture groups. You notice that I put these things in between parentheses, and the debugger calls them first capturing group, second capturing group, third capturing group and so on. Basically, whenever you put parentheses, you're telling the regex: I'm interested in whatever is inside this, please capture it. And you can access those in the sed replacement by using a backslash and a number: the first capture group is \1, then \2, \3 and so on. So looking at this
thing, what's the capture group number for the username? The first one, right? So capture group 1, correct. So what we need here is to output the first capture group. Now if I press enter, you see everything is just the capture group, because if you look here, it captures root, captures root, captures this Accepted publickey for. So we match this whole thing, but we're only interested in the first capture group, so we just print that capture group. That's one example of doing it. Any questions so far? Yep. You may also notice that I'm capturing something else here: the second capture group is the IP address, and group 3 is the port number. So if instead of capture group 1 you put capture group 2 in the replacement, you should get the IP address, right? If I use capture group 3 instead, I should get the port numbers. And I can do other stuff. For example, I can have \1, comma, \2, comma, \3, and I get those as well: name, IP address, port number. So you just put whatever you want into your substitution. But in this case I'm only interested in the username, so it's just \1. So that's all good and nice, and you can read up more about regex. It's quite an interesting topic. It can be quite daunting, but it's very handy to have. Back to here. We already have this, and it's quite useful: you have all the usernames. But it's still a lot of usernames, right? We may want to get statistics out of this. Okay, another thing that I need to bring up is that sed is quite powerful. Instead of using grep and then sed, it's possible to do everything in sed. So look at what's happening here. What you do is, first you look for this
pattern, Accepted publickey for. The bang says look for lines that don't match this, and d means delete. So delete all lines that don't match this, which is the same as doing grep with this pattern, right? So if I do this, I should end up with the same thing. We can do everything with sed, but usually you wouldn't want this, because it's much longer to type than just grep 'Accepted publickey for'. You don't even have to remember the bang, d, whatever. That's why we usually just use grep instead. There are many more commands, so you can do man sed and it will show you all these different commands, like w, t and so on. Now we're going to go to more advanced data wrangling. We have this, right? It prints all the usernames. But let's say instead we want to see the most common usernames. What we can do is pass this to sort. sort will, well, obviously sort the usernames. So you see Turing, Newton, Hoare, sorted in alphabetical order, lexicographical order. And then you can use this command called uniq. What uniq does is collapse consecutive lines that are the same into one. You see here many Turing, many Hoare, many Julius, so if I run this I end up with just the unique names. But the condition for piping to uniq is that the input has already been sorted in the first place, which is why you need sort first. And there's one more thing that I can do: I can pass -c for count. What happens if I do this? I get a count. So there are 88 Curry, 70-something Einstein and so on. So now if you look at this, you know how many times each username appears in the file. Look at this: you're just using all these little tools that each do only one thing, something very
little, but we can achieve so much using these tools, right? That's what the Unix philosophy is all about. So we've already seen the common usernames, but imagine there are way more usernames, say 1,000 different usernames, and you just want to see who's the top 3. How would you do it? Can someone explain to me how you would do this manually? You want to see only the top 3 usernames, the ones whose count is the highest. So you need to sort them, right, and then you need head -n 3. So you know that the last part should be head -n 3, but you need a way to sort this. sort takes flags as well. If I do sort -n, it means sort in numerical order instead of lexicographical order. For example, have you ever had this problem: you have files whose names begin with a number, like 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, and your program, instead of sorting them 1 to 10, gives you 1, 10, 2, 3, 4, 5? The reason is that it's using lexicographical order, which compares character by character, so 10 comes before 2 even though by number it should come last. If you do sort -n, it sorts based on the numerical order. And then -k1,1. -k means you only sort by a certain column. Then 1,1: the first 1 means I want to sort starting from the first column, and the ,1 means up to the first column only, so just that field, from 1 until 1. If you go to man sort and look for -k, it says sort based on a key, and the key is a location and a type. In this case I'm just giving it the location, which is the index: the first field and only the first field. If I do this, look, now it sorts based on the first field. A field is basically something separated by whitespace. So in this case it's the count, space, the
username, so there are 2 fields: the first field is the count and the second field is the username. By saying -nk1,1 I sort based on the first field, and by the first field only. Now that I have this: because these are sorted in increasing order, instead of head I actually want tail. So I just do tail -n 3 and boom, I've got it. And if you have tac, you can use tac to put it in descending order. So what if I want the 3 least common ones instead? What should I change in my command? Anyone? Yeah, change the tail to head. If I do that, I get the 3 least common ones. Yes, -r. So if you do this, it will sort ascending; if you add -r to it, it will sort in reverse order. You can look all of this up in the man page, they have this -r somewhere: r, reverse. Any questions so far? Do you notice that based on this you can now build something very, very advanced? Okay, I think we don't really have time to do awk, but if you want you can also look at awk. awk can do stuff that sed cannot, because awk is a full programming language. So if you're interested you should look it up, but otherwise you can use this. One thing that I want to show you that's also quite cool is that there's a calculator language available from your shell; it's called bc. If you run it, you can do all sorts of things. What's 5 plus 5? 10. What's 2 to the power of 24? It's this number. How about 2 to the power of 60? It will tell you. So if you do this, right, and then... let's add another sed here. Okay, let's substitute now. Okay, do you know what I just wrote here? Can someone explain what I just wrote? Okay, what do you
think this would do? Let me run it. So what do you think it did? Basically I'm looking for two things separated by a space, right, and I just want the first thing. So I'm just taking the first part, the count. And then there's this cool command called paste. I'm skipping awk; I was going to explain how to do this using awk because it's much nicer, but let's just use sed, because you can also do it with sed. The cool part is this: paste -sd+ -. So what does paste do? Let's look at the man page, man paste: merge lines of files. So what do you think it does here? Basically, it takes these lines, because the output of the first sed is just lines containing numbers, correct? And it changes each newline into a character that I specify, which is a plus here. The dash just means that I don't want to read from any file, just read from standard input, because I'm piping to it. So now this thing is just a mathematical expression: 88 plus 86 plus 76 plus blah blah blah, right? So I can pipe this to bc, and if you remember, bc is the calculator language, so I can sum them up, and bc tells me the total is 461. Any questions so far? Does everyone understand what's going on? So if I repeat from the beginning: cat log means I want to get the content of the log. grep sshd, so keep only the lines containing sshd, and then filter by Accepted publickey for. Using sed, replace all that using this regex so that I get only the usernames. Sort the usernames, remove the duplicate usernames and give me the counts. Sort them, although I don't really need that right now, because I don't need sorting just to add things up, right? Then after that, using paste, I turn the newlines into pluses, and at the end I just run bc, which
would just take in the expression and give you the result. So you can do math with paste, how awesome is that? So I have some exercises for you. Okay, let's do this one: find the number of words in... okay, I'm not sure, but every Unix should have this. If you do cat /usr/share/dict/words, you should see this long list of dictionary words. What I want is: please tell me the number of words that contain at least 3 a's and don't end in apostrophe s. And remember, you can tell the number of lines in the output using wc -l. In total there are about 200,000 words on my Mac, and if I do this on my Linux machine it's 234,000, so around there. So how do you think you would approach this problem? Okay, let's do the easiest part first: you don't want words ending in apostrophe s. How do you think you should do it? Anyone? Okay, the way is this. If you look at the manual for grep, there's this option called -v, which means invert match, meaning only give me lines that don't match the pattern. So with the knowledge of -v, how would you do it? So here I do grep, and then -E and -v. Oh yeah, by the way, if you want to use the modern regex in grep you should also pass -E, but in this case you don't need it. Okay, so what do I need here? Apostrophe s, dollar. So now if I look at the number of lines, it should be less, right? Or apparently none of them has the apostrophe s. If I do it on a Mac it should make a difference... it doesn't. Something is not right. Okay, all of them already don't have the apostrophe s. But say now I want the words that contain at least 3 a's. How would you do this? Do you think that would work? Let's try. I'm getting nothing. Okay, let's look at the very first... okay, this is an example of something that contains 3 a's, and it's inside the list, so I should be able to match this. Yeah, this one is, look, yeah.
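To see the two grep behaviours just described, inverting a match with -v and matching aaa only as three consecutive letters, here is a small sketch on a made-up word list. The words helper is hypothetical and stands in for /usr/share/dict/words.

```shell
# Hypothetical word list standing in for /usr/share/dict/words.
words() {
  printf '%s\n' banana "banana's" alabama cat "cat's" panama
}

# -v inverts the match: keep only lines that do NOT end in apostrophe s.
not_possessive=$(words | grep -v "'s\$" | wc -l)

# 'aaa' only matches three a's in a row, which none of these words have.
consecutive=$(words | grep 'aaa' | wc -l)

echo "not ending in 's: $not_possessive, containing aaa: $consecutive"
```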
Correct, you're right, it's looking for 3 consecutive a's, but what I want is for the word to just contain 3 a's anywhere. grep a would just mean one a, but I want at least 3 a's. How about *a? What does this mean? Okay, by the way, for grep you should use this one, -E. So what does star mean again? Let's look at the list. Zero or more of the preceding match, the preceding pattern. So the first star is preceded by nothing. This would not return anything either... sorry, -E. Yeah, see: repetition-operator operand invalid, because you put a star but there's nothing before it. The star is a modifier of something that comes before it. So what should come before the first star? Dot. Dot star means match anything, no matter how long. So we should do this for all of them. It means that we just want anything that contains 3 a's inside, no matter where they are. That's why it's a, a, a, and you can put anything in between, or after, or before. And you get a long list of words that contain at least 3 a's. Originally you have 200,000-something words; for this one there are only about 7,000 words with at least 3 a's. So now how about this one: what are the 3 most common last 2 letters of those words? How would you do this? Okay, first of all, can you explain to me in layman's terms how you would do this in terms of piping? You have all these words that have 3 a's. What do you do so that you can find out the 3 most common last 2 letters? Basically you want to get the last 2 letters, correct? Thankfully tail can do that. If you do man tail there should be this thing called -b, no, -c. -c means the count. So if I do echo asdasd and then tail -c 2... is it -b? No wait, is it -n? No, sorry, it's -c, that's correct. Okay, hold on. Yeah, okay, you actually can't do that here, because tail -c works on the whole input and not on each line, but you can
filter just for the last 2 things. The way to do it is actually to use grep again, with -E, and then there's another flag called -o. -o means you only print whatever matches, because by default grep would print the whole line, correct? But now we just want to print the matching part. So what do you think is the pattern to match just the last 2 words, sorry, the last 2 letters? Or you can do dot-dot-dollar (..$), same thing. This should only match the last 2, so it would just give me all the last-2 endings. Okay, so now we already have all the last-2 endings, right? How do we find the common ones? What should we use next? We did this just now, right? So we should pipe it to... not yet: uniq requires the data to be sorted, so you need to sort first before you can uniq. If you sort and then uniq, you only get this. Okay, what do you need now? -c. So you get all these counts, right? What do you need to do next? Sort again. Sort by what? -r, -n, and -k 1,1, because you want the first field; then it sorts by that, right? And then what do you do next to get the 3 most common? head -n 3. And there we go, we got the 3 most common words, sorry, the most common endings. Hmm... ah yes, use single quotes: inside double quotes the shell still expands things like the dollar sign, and you don't want that when the dollar is part of the regex, because then it's actually interpreted by the shell. And then, how many 2-letter combinations are there? How do you do that? Just now we used this to find all the 2-letter combinations, right? It's kind of similar: we need to find all the unique 2-letter combinations, and then the number of lines is the number of 2-letter combinations, which is 150. And the last one is actually quite challenging, so I'll leave it as an exercise. So I hope you all have learnt a lot of things
today, right? Especially how to use the shell. If you have any questions, you can just email me; you should have my email from whatever I sent. So, thank you, thank you for coming. Next up is Hacker School Part 2, and please leave your feedback at this URL.
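For reference, the pipeline built up during the exercise can be sketched end to end as below. This is only a sketch under assumptions: the word list is a small invented sample rather than /usr/share/dict/words, and the variable names (words, matches, top3, distinct) are hypothetical. The commands and flags are the ones used in the session.

```shell
# Hypothetical sample standing in for /usr/share/dict/words.
words=$(printf '%s\n' \
  banana bandana cabana savanna \
  alabama panama anathema \
  catamaran caravan anaconda \
  apple cat "banana's")

# Drop words ending in 's, then keep words with at least 3 a's anywhere.
matches=$(printf '%s\n' "$words" | grep -v "'s\$" | grep -E 'a.*a.*a')

# -o prints only the matched part; '..$' is the last 2 characters.
# Single quotes stop the shell from interpreting the $.
# uniq needs sorted input; -c prefixes each ending with its count.
top3=$(printf '%s\n' "$matches" | grep -o '..$' \
  | sort | uniq -c | sort -rn -k1,1 | head -n 3)

# Number of distinct 2-letter endings.
distinct=$(printf '%s\n' "$matches" | grep -o '..$' | sort | uniq | wc -l)

printf '%s\n' "$top3"
echo "distinct endings: $distinct"
```

To run it against the real dictionary, the first grep would read /usr/share/dict/words instead of the sample variable; the counts will differ between machines, as they did between the Mac and the Linux instance.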