So I came across a command-line utility called csvtk. I like it quite a bit. It's similar to another set of command-line utilities called tsv-utils, which I believe is written in the D programming language, and which I've had no complaints with. csvtk is something very similar, written in Go. What I'm going to show here is me using csvtk for the first time, so you'll see that I make a few mistakes as I go. But in the end I'm able to show a few things: selecting columns, filtering, sorting, renaming, and some others.

So here's the csvtk GitHub page. The first thing to do is download it. On the download page you'll see a Linux 64-bit link; that's the one you want. Here I'm just following along with the installation instructions in the README. The first step is tar -xvzf, which decompresses the archive, and now we should have a binary in the directory where we extracted it. Next you want to move that binary into /usr/local/bin, so that any time you type csvtk, your shell isn't just looking in the current directory but can find it from any directory on your Linux system. /usr/local/bin is where I usually copy my binaries to, so that's where csvtk goes.

Now I'm just going to look through some of the examples on their web page. One interesting thing is that they have some kind of command-line helper. I never actually used it, but it looks like there's auto-completion functionality, so I might check that out later. You'll also need to get the diamonds dataset to follow along; that's just a wget command. I've used wget before in other videos, and if you haven't seen it, this is just a quick clip on how to fetch the file. So now we have csvtk installed and the diamonds dataset ready to manipulate. The first thing to show is just a pretty-print of the dataset.
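The install and setup steps above can be sketched as a short shell session. The archive name and the diamonds URL here are assumptions: the real tarball name varies by csvtk release, and any copy of the classic ggplot2 diamonds CSV will do.

```shell
# After downloading the Linux 64-bit tarball from the csvtk releases page
# (archive name below is illustrative; check the actual release you grabbed):
tar -xvzf csvtk_linux_amd64.tar.gz   # x=extract, v=verbose, z=gunzip, f=file
sudo mv csvtk /usr/local/bin/        # now csvtk is on the PATH everywhere

# Fetch the diamonds dataset; this URL is an assumption, any mirror of the
# ggplot2 diamonds CSV works the same way.
wget https://raw.githubusercontent.com/tidyverse/ggplot2/main/data-raw/diamonds.csv

# Pretty-print the first few rows.
csvtk head -n 5 diamonds.csv | csvtk pretty
```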
Here I'm doing a pretty-print and just showing the head of the data, so just the first few lines. You get the header, plus some dashed lines underneath the column names. Looks nice. Next I was trying to get a little fancy and do a summary. I ended up abandoning this multi-line version and sticking to one line. This is doing a summary on the depth column, a sum grouped by cut, piped to pretty-print, and it didn't work out so well at first. The reason is that in csvtk you need to mark which column you're using with the -f (fields) flag. Then I just do a sum of another column, price.

Next is the select functionality. I call it select, but csvtk calls it cut, like the Linux cut command. You say you want to cut field one, meaning select the first column of the data, and then pipe it to pretty-print. You can select by number, which is the column index, or by name, and you can give multiple names; here I'm selecting cut and price. They also have this interesting fuzzy-fields functionality, which lets you put a wildcard at the end of a column name. You don't need to type the full column name; anything that starts with a c or a p would get selected.

Now for filtering rows. You need -f here, and you also want -p, which is the pattern you're matching against that field. For example, here the field is cut, and the patterns we want to filter on are Ideal and Premium, so in the cut column you'll see that only Ideal and Premium are selected. That's how you filter on character columns. Near the end I'll show a different filter for numbers; this one is called grep because, as with grep, you're usually matching text.
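Putting the commands from this stretch together, something like the following should work. The exact flags (-g for summary groups, -F for fuzzy fields) are my reading of the csvtk help text, so double-check them against csvtk <subcommand> -h:

```shell
# Grouped summary: sum of depth within each level of cut.
# The column is marked with -f, and each field gets an aggregation after a colon.
csvtk summary -g cut -f depth:sum diamonds.csv | csvtk pretty

# Select columns ("cut" in csvtk terms), by index or by name.
csvtk cut -f 1 diamonds.csv | csvtk pretty          # first column, by index
csvtk cut -f cut,price diamonds.csv | csvtk pretty  # by name, multiple columns

# Fuzzy fields: -F enables wildcards, so columns starting with c or p match.
csvtk cut -F -f 'c*,p*' diamonds.csv | csvtk pretty

# Filter rows by value in a character column: keep Ideal and Premium cuts.
csvtk grep -f cut -p Ideal -p Premium diamonds.csv | csvtk pretty
```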
Another cool thing about grep is that you can use a text file to supply the patterns you're filtering on. You pass a capital -P and the name of the list file. Here I just put Ideal in the file, and then I add Premium to see if that works, and it does: both Ideal and Premium are selected in cut. I wanted to know a little more about this file's format. I tried a CSV-style format, separating the values with a comma, to see if that would work, and it didn't. I also tried putting both values on one line to see whether it was only picking up on line breaks, and that didn't work either; you'll see only Premium is selected. So it just has to be a plain list, where each row of the file is one pattern to select on. That's a really neat feature; I could see it being very flexible. There's also an interesting bullet in the docs saying you can remove NAs with a command they list there, which is a really common task, but I'm not showing that here.

Next I went on to renaming columns. The rename functionality is pretty simple: it's just rename, then the fields you're selecting with -f, and the new names with the -n flag. Here I'm taking carat and cut and calling them a and b. Note that this isn't changing the actual file on disk; nothing is being written. It's all just being piped from one command to the next. At the end you could cat this to a file, and they also have an output option, but I don't show that here.

Mutate may be a word that's familiar to users of dplyr. It simply makes a new column. I'm not actually taking a column and applying some kind of math operation to it; all I'm doing is making a copy of a column. I'm not sure whether csvtk has the functionality to apply a math operation to a column.
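A minimal sketch of the pattern-file workflow and the rename step. The csvtk calls are guarded so the snippet still runs on a machine without csvtk or the dataset; cuts.txt is a file name I made up for the example:

```shell
# The pattern file is a plain list: one pattern per line.
# Comma-separated values or several values on one line will not match.
printf 'Ideal\nPremium\n' > cuts.txt

# Guarded so the snippet degrades gracefully where csvtk isn't installed.
if command -v csvtk >/dev/null && [ -f diamonds.csv ]; then
    # -P reads patterns from the file; -f names the column to match against.
    csvtk grep -f cut -P cuts.txt diamonds.csv | csvtk head -n 5 | csvtk pretty

    # rename: select fields with -f, supply new names with -n (same order).
    csvtk rename -f carat,cut -n a,b diamonds.csv | csvtk head -n 5 | csvtk pretty
fi
```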
But here I just played with mutate and made a copy of a column under another name; I made a copy of cut and called it thing.

Next is sort. It took me a little while to understand how sort worked, and you'll see me playing with a few different parameters here. They have sort -k 1, which sorts by the first column. Then I try -k with both 1 and price, but an error shows up saying I can't have both an index and a name selected; you have to go all indexes or all names. So here I'm sorting by carat and price, and then I swap the order to price and carat to see whether that changes things, whether it's respecting price first and then carat. Then I'm working with what goes after the colon, like :n and :N. The :n is to sort numerically, and while I was just letting the screencast run I didn't realize there was a difference between the capital N and the lowercase n, but it's an important distinction. There's also :nr, which is numeric in reverse: n sorts ascending, and adding r makes it descending. And the capital N is natural order for characters, which I guess is roughly alphabetical order. I didn't have this completely figured out, and you'll see me just playing with the different parameters, but the documentation is pretty clear: sort by number is lowercase n, reverse number is nr, and natural order is capital N.

Now here's filtering rows numerically. grep was the way to filter character-like rows; filter is the way to do numeric rows. When you do this you need quotation marks, because it's taking an expression: carat greater than 0.26. I piped out too much here, so I'm going to make sure to grab just the head. You'll see everything in the carat column is greater than 0.26. Then I change this to 0.32.
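Here is roughly how mutate, sort, and the numeric filter fit together. On the earlier question about math operations: the csvtk docs also describe a mutate2 subcommand that evaluates an expression to build the new column. The flags below are my reading of the help text, so verify them with -h before relying on them:

```shell
# mutate: copy an existing column under a new name.
csvtk mutate -f cut -n thing diamonds.csv | csvtk pretty

# mutate2: build a new column from an arithmetic expression
# (column name price_per_carat is my own choice for the example).
csvtk mutate2 -n price_per_carat -e '$price / $carat' diamonds.csv | csvtk pretty

# sort: keys are all-index or all-name, never mixed; :n numeric ascending,
# :nr numeric descending, :N natural order for text.
csvtk sort -k carat:n -k price:n diamonds.csv | csvtk pretty
csvtk sort -k price:nr diamonds.csv | csvtk pretty
csvtk sort -k cut:N diamonds.csv | csvtk pretty

# filter: numeric row filter; quote the expression so the shell leaves > alone.
csvtk filter -f 'carat>0.26' diamonds.csv | csvtk head -n 5 | csvtk pretty
```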
And you'll see everything's greater than 0.32. Now I'm showing lots of plotting fails. I tried quite a few times; I didn't practice using csvtk at all beforehand. I did get a histogram a couple of times, and the plotting functionality is really neat. It's just going to take me some time to figure it out a little more. This is the tutorial page, where they show how to output to CSV files and a number of other things I didn't cover in this screencast, so if you want to learn more, check out that tutorial page. Overall, I'd say csvtk is probably at the top of my list of command-line data manipulators. I think it's intuitive, and I like that the commands can be piped into each other. Thanks for watching.
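For the record, the histogram that eventually worked looked roughly like this. The flag names are from memory of the csvtk plot help text, so treat this as a sketch and confirm with csvtk plot hist -h:

```shell
# Histogram of the carat column, written to an image file via -o.
csvtk plot hist -f carat diamonds.csv -o carat_hist.png
```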