I know that many data scientists have a set toolkit for manipulating data. I want to ask you to take a second to set those tools aside and look at something a little unfamiliar. The tool is Miller. Miller, abbreviated as mlr, is a command-line utility for parsing CSV. It is similar to xsv. If all you need is some filtering, selecting, and sorting, then xsv may be enough.

Let's start simple: mlr cat. This passes the data through as is. Formatting can be added with a few flags. You can even add bars, but I don't normally do that, because the bars push the output wider than I need it to be.

I notice that mlr has many commands that are analogous to common UNIX commands. Along with mlr cat, there is mlr head, mlr tail, mlr cut, and others. Column selection is done with mlr cut. If I were using dplyr, this would be just like the select function. Or if I were using SQL, this would be like the columns I list right after the SELECT keyword. The cut command requires a field flag, -f, followed by the field names. The order in which columns are printed can be forced with the -o flag. Fields can be dropped with the -x flag.

Miller also supports filtering. This is a big difference between xsv and Miller. Where xsv forces regex for filtering, Miller uses logical operators. If you are coming from standard data frame manipulation libraries, then this feature will feel familiar. When using filter, the expression is wrapped in single quotes, and the fields used in the filter start with a dollar sign.

Miller uses the put command to make new columns. This is analogous to pandas' assign method or dplyr's mutate function.

Now, let's talk about chaining with the then keyword, which strings several commands together in a single mlr call. It is common for me to want to sort the data, then take the top rows.

Miller has great support for descriptive statistics and even allows for group aggregations. In my mind, this is one of Miller's killer features.
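To make the commands covered so far concrete, here is a minimal sketch. The file example.csv and its columns are made up for illustration, and the formatting flags shown (--icsv, --opprint, --barred) are my assumption about which flags the walkthrough uses for pretty-printing and bars.

```shell
# Hypothetical sample data; the file name and columns are invented for this sketch.
cat > example.csv <<'EOF'
color,shape,quantity,price
red,square,10,2.50
blue,circle,25,1.00
red,circle,40,3.00
green,square,5,4.00
blue,square,30,2.00
EOF

# Pass the data through unchanged (CSV in, CSV out).
mlr --csv cat example.csv

# Read CSV and print aligned columns; --barred draws borders around them.
mlr --icsv --opprint cat example.csv
mlr --icsv --opprint --barred cat example.csv

# Select columns. Without -o, the output keeps the input's column order.
mlr --icsv --opprint cut -f quantity,color example.csv
# With -o, columns are printed in the order given to -f.
mlr --icsv --opprint cut -o -f quantity,color example.csv
# With -x, the named columns are dropped instead of kept.
mlr --icsv --opprint cut -x -f price example.csv

# Filter rows with a logical expression; field names start with $.
mlr --icsv --opprint filter '$color == "red" && $quantity > 20' example.csv

# Add a new column computed from existing ones.
mlr --icsv --opprint put '$total = $quantity * $price' example.csv

# Chain commands with then: sort by quantity descending, then keep the top 2 rows.
mlr --icsv --opprint sort -nr quantity then head -n 2 example.csv
```

The single quotes around the filter and put expressions keep the shell from expanding the $ field references before Miller sees them.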
I can use Miller to calculate group counts, min and max, and even various percentiles using mlr stats1. The stats1 command is for statistics that require only the column of interest to calculate. There is also mlr stats2 for calculations involving two or more columns. If you come across a calculation that is not provided by Miller, then you can use Miller's domain-specific language to build a custom one, using what the author of Miller calls out-of-stream variables. The reshape command offers a clean interface for pivoting data between long and wide format.

I have just scratched the surface with Miller. It is a powerful and fast tool. I highly recommend that any data scientist or data engineer check it out. I was surprised by the functionality it has to offer. Thanks for listening. See you next time.
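As a companion to the statistics and reshaping part of the walkthrough, here is a minimal sketch of those commands. The file example.csv, its columns, and the choice of statistics are hypothetical, and the weighted total is just one assumed example of a calculation built with out-of-stream variables.

```shell
# Hypothetical sample data; the file name and columns are invented for this sketch.
cat > example.csv <<'EOF'
color,shape,quantity,price
red,square,10,2.50
blue,circle,25,1.00
red,circle,40,3.00
green,square,5,4.00
blue,square,30,2.00
EOF

# Per-group count, min, max, and median of one column.
mlr --icsv --opprint stats1 -a count,min,max,p50 -f quantity -g color example.csv

# A statistic over a pair of columns: correlation of quantity and price.
mlr --icsv --opprint stats2 -a corr -f quantity,price example.csv

# Custom aggregation with an out-of-stream variable (@total), which persists
# across records. -q suppresses the per-record output; the end block emits
# one row per color once all records have been read.
mlr --icsv --opprint put -q '
  @total[$color] += $quantity * $price;
  end { emit @total, "color" }
' example.csv

# Pivot wide to long: quantity and price become rows of (measure, value).
mlr --icsv --opprint reshape wide-to-long -i quantity,price -o measure,value example.csv

# Pivot back from long to wide by chaining the two reshapes with then.
mlr --icsv --opprint reshape wide-to-long -i quantity,price -o measure,value \
  then reshape long-to-wide -s measure,value example.csv
```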