
Part 1: GNU Parallel script processing and execution

Uploaded by OleTange on Jun 21, 2010

GNU Parallel version 20100620 http://www.gnu.org/software/parallel/ is a shell tool for executing jobs in parallel locally or using remote machines. A job is typically a single command or a small script that has to be run for each of the lines in the input. The typical input is a list of files, a list of hosts, a list of users, a list of URLs, or a list of tables.

If you use xargs today, you will find GNU parallel very easy to use, as it is written to have the same options as xargs. If you write loops in shell, GNU parallel may be able to replace most of them and make them run faster by running several jobs in parallel. If you use ppss or pexec, GNU parallel will often make the command easier to read.
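
As a rough illustration of the paragraph above, the sketch below writes the same task three ways: a shell loop, xargs, and GNU parallel. The task (compressing `*.log` files) and the function names are hypothetical and chosen only for this example; it assumes GNU parallel, find, xargs, and gzip are installed.

```shell
#!/usr/bin/env bash
# Sketch: three ways to gzip every *.log file under a directory.
# The function names and the *.log pattern are hypothetical.

# 1. Plain shell loop: one gzip at a time.
compress_loop() {
  local dir=$1 f
  for f in "$dir"/*.log; do
    gzip -- "$f"
  done
}

# 2. xargs: one file per gzip invocation, sequential unless -P is given.
compress_xargs() {
  find "$1" -name '*.log' -print0 | xargs -0 -n 1 gzip --
}

# 3. GNU parallel: same pipeline shape, but several gzip jobs run at once.
#    -0 reads NUL-separated input; {} is replaced by each file name.
compress_parallel() {
  find "$1" -name '*.log' -print0 | parallel -0 gzip -- {}
}
```

Calling `compress_parallel /path/to/logs` would then replace a call to `compress_loop /path/to/logs`; how much faster it runs depends on the number of cores and on disk speed.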

GNU parallel makes sure output from the commands is the same output as you would get had you run the commands sequentially. This makes it possible to use output from GNU parallel as input for other programs.
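
A small sketch of what this looks like in practice. By default GNU parallel keeps each job's output together, so lines from different jobs are not mixed; to also get the jobs' output in the same order as the input, the `-k` (`--keep-order`) option is used. The function name and the sleep times are made up for the demonstration.

```shell
#!/usr/bin/env bash
# Sketch: the second job finishes first, but -k prints results in input order.
# ordered_demo is a hypothetical name used only for this example.
ordered_demo() {
  printf '%s\n' 2 1 | parallel -k 'sleep {}; echo done {}'
}
# ordered_demo prints:
#   done 2
#   done 1
```

Without `-k`, the same pipeline would be expected to print `done 1` first, because that job completes earlier.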

For each line of input, GNU parallel executes the command with the line as its arguments. If no command is given, the line of input itself is executed. Several lines are run in parallel. GNU parallel can often be used as a substitute for xargs or cat | bash.
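
The two modes described above can be sketched as follows. The function names and input lines are hypothetical; `-k` is added only so that the output order is predictable.

```shell
#!/usr/bin/env bash
# Sketch of the two modes; function names are hypothetical.

# With a command: each input line is appended to the command as its argument.
line_as_argument() {
  printf '%s\n' alpha beta | parallel -k echo item:
}
# prints:
#   item: alpha
#   item: beta

# Without a command: each input line is itself run as a command,
# much like piping the lines into bash, but with several running at once.
line_as_command() {
  printf '%s\n' 'echo first' 'echo second' | parallel -k
}
# prints:
#   first
#   second
```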

Uploader Comments (OleTange)

  • What shell are you using? Curious with respect to the file renaming syntax using {} and {.}.bz2

  • @thomasknauth the {}, {.}, {3}, and {3.} are all GNU Parallel specific. It is not dependent on the shell.
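
To make the replacement strings from this exchange concrete, here is a minimal sketch that only echoes what GNU parallel substitutes. The file names are hypothetical, and the `:::` syntax for passing arguments on the command line assumes a reasonably recent GNU Parallel release.

```shell
#!/usr/bin/env bash
# Sketch: what GNU parallel substitutes for {} , {.} , {3} and {3.}
# File names and function names are hypothetical.

# {} is the whole argument; {.} is the argument with its extension removed.
replacement_demo() {
  parallel -k echo '{}' '{.}.bz2' ::: report.txt data/archive.tar
}
# prints:
#   report.txt report.bz2
#   data/archive.tar data/archive.bz2

# With -N 3, three arguments go to each job; {3} is the third one
# and {3.} is the third one without its extension.
positional_demo() {
  parallel -N 3 echo '{3}' '{3.}' ::: a b c.txt
}
# prints:
#   c.txt c
```

Because parallel performs the substitution itself, the quoting above is only there to stop the shell from touching the braces; the result should be the same in bash, zsh, or dash.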

All Comments (27)

  • I'm using the program and I love it! Is there a way I can use my distributed filesystem to access data for the tasks instead of passing it the way it does now? I use GlusterFS and I just want to process in my mount.

  • This is lovely stuff, thanks for sharing... think I'll try this on my data mining grep script.

    Is your time script available for borrowing?

  • @kalinin76 test -z checks if a string's length is zero.

    In this case, the string being tested is probably user-configurable and can be empty.

    This does not look like a mistake, and they also didn't mean test -d because mkdir does nothing if you try to create a directory that already exists.

  • Thank you for creating what looks like an excellent tool, and for providing a fantastic tutorial. I'm looking forward to trying it out.

  • @kalinin76 it says: check if that directory exists, or make that directory, with no error if it already exists (the -p).

    Same as test -z /usr/local/bin && mkdir /usr/local/bin

  • @macemoneta

    I think the focus is on the part where he used multiple machines. Anyways, xargs -P would mean that porting to parallel is even easier.

  • Yesterday, I started to convert my scripts to use parallel -- it is a great tool and so far conversion is smooth. Thank you very much!
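
For readers following the test -z and mkdir -p discussion in the comments above, the behaviour of the two commands can be checked with a short sketch. The `prefix` variable is hypothetical and stands in for whatever configurable string the script under discussion tested; a scratch directory is used so that nothing under /usr/local is touched.

```shell
#!/usr/bin/env bash
# Sketch: test -z looks only at the string, never at the filesystem.
prefix=""                       # hypothetical, possibly user-configurable value
if test -z "$prefix"; then
  # The string is empty, so fall back to a scratch directory for this demo.
  prefix=$(mktemp -d)
fi

# mkdir -p creates missing parent directories and does not fail when the
# directory already exists, so no test -d guard is needed before it.
mkdir -p "$prefix/bin"
mkdir -p "$prefix/bin"          # second call succeeds and changes nothing
```

Note that `test -z /usr/local/bin` is false, because that string is not empty, regardless of whether the directory exists.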
