In our work, we often need to generate and run lots of similar commands at once. There are lots of ways to do this; I just want to write about the one I use most often: GNU Parallel.

basic syntax

the whole input, e.g. the file name: {}

remove extension: {.}

parallel echo {.} ::: A/B.C

Output: A/B

removes the path: {/}

parallel echo {/} ::: A/B.C

Output: B.C

keep only path: {//}

parallel echo {//} ::: A/B.C

Output: A

remove path and extension: {/.}

parallel echo {/.} ::: A/B.C

Output: B
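To see all of these side by side, here is a small sketch that prints every replacement string for the same input in one go. The variable name summary is just for illustration; the quotes keep the whole template together as one argument for Parallel.

```shell
# Print {} {.} {/} {//} {/.} for one input, in that order.
summary=$(parallel echo '{} {.} {/} {//} {/.}' ::: A/B.C)
echo "$summary"
# Output: A/B.C A/B B.C A B
```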

To indicate that everything that follows is an input argument given on the command line: :::

e.g. "parallel gzip ::: *" gzips all files in the current working directory, one gzip job per file, while "parallel gzip *" won't do that: without ":::", Parallel treats the file names as part of the command and waits for arguments on standard input.
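As a quick check of what ::: does, the sketch below feeds the same three arguments once after ::: and once on standard input, which is where Parallel reads from when ::: is absent. The names from_args and from_stdin are made up for this example, and -k (keep order) is only there so that the two outputs can be compared.

```shell
# Arguments given on the command line after :::
from_args=$(parallel -k echo ::: a b c)
# The same arguments, one per line on standard input
from_stdin=$(printf '%s\n' a b c | parallel -k echo)
echo "$from_args"
echo "$from_stdin"
```

Both variables should end up holding the same three lines.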

my examples

1. generate simulation commands and run them 10 at a time

For example, I need to run a lot of simulations with Rscript, 10 jobs at a time.

With multiple input sources, the argument from each input source can be accessed with {number}, and Parallel generates every combination of the inputs. Putting echo in front prints each command instead of running it, so we can save the commands to a file named cmd (here 5 x 5 x 5 x 4 x 5 = 2500 lines):

parallel echo Rscript test.r --s {1} --b1 {2} --b2 {3} --h {4} --cor {5} ::: 20 40 60 80 100 ::: 20 40 60 80 100 ::: 20 40 60 80 100 ::: 0.05 0.1 0.15 0.2 ::: 0.2 0.4 0.6 0.8 1 > cmd

Then we run all the jobs in the file with Parallel. -j 10 allows 10 jobs at a time; if there are more jobs than that, Parallel keeps a queue until all jobs have run.

parallel -j 10 < cmd
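A scaled-down sketch of the same two steps, assuming a throwaway directory from mktemp and harmless echo commands in place of Rscript (both are stand-ins, not part of my real setup). Two input sources with two values each give 2 x 2 = 4 commands.

```shell
tmp=$(mktemp -d)
# Step 1: write one command per combination of {1} and {2} to a file.
parallel -k echo echo s={1} h={2} ::: 20 40 ::: 0.05 0.1 > "$tmp/cmd"
cat "$tmp/cmd"
# Step 2: run the saved commands, at most 2 at a time.
# -k prints the results in the order of the input lines.
result=$(parallel -k -j 2 < "$tmp/cmd")
echo "$result"
rm -r "$tmp"
```

Looking at the cmd file before running it is a cheap way to catch a wrong flag or a missing input source before thousands of jobs start.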

2. print first field to new file

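parallel "awk '{print \$1}' {} > {}.xxx" ::: random10*

The whole command is quoted so that the redirection > is run by Parallel once per file; unquoted, the calling shell would send all output to a single file literally named {}.xxx.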
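Here is a small self-contained check of this pattern. The directory and the file random10_a are throwaway stand-ins I made up for the example. The whole command is passed to Parallel as one quoted string, so the > redirection runs once per input file.

```shell
tmp=$(mktemp -d)
printf 'a 1\nb 2\n' > "$tmp/random10_a"
# \$1 stops the calling shell from expanding $1 inside the double quotes.
parallel "awk '{print \$1}' {} > {}.xxx" ::: "$tmp"/random10*
first_fields=$(cat "$tmp/random10_a.xxx")
echo "$first_fields"
rm -r "$tmp"
```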

helpful posts on useful examples:

https://gist.github.com/Brainiarc7/7af2ab5e88ef238da2d9f36b4be203c0

https://github.com/LangilleLab/microbiome_helper/wiki/Quick-Introduction-to-GNU-Parallel