Splitting command line args with GNU parallel

4

49

Using GNU parallel: http://www.gnu.org/software/parallel/

I have a program that takes two arguments, e.g.

$ ./prog file1 file2
$ ./prog file2 file3
...
$ ./prog file23456 file23457

I'm using a script that generates the file name pairs. The problem is that each pair arrives as a single string rather than as two separate arguments, like:

$ ./prog "file1 file2"

GNU parallel seems to have a slew of tricks up its sleeve, so I wonder if there's one for splitting text around separators:

$ generate_file_pairs | parallel ./prog ?  
  # where ? is text under consideration, like "file1 file2"

The easy workaround is to split the args manually in prog, but I'd like to know if it's possible in GNU parallel.

Baptlsta answered 6/6, 2011 at 16:45 Comment(0)
88

You are probably looking for --colsep.

generate_file_pairs | parallel --colsep ' ' ./prog {1} {2}  

Read man parallel for more, and watch the intro video if you have not already done so: http://www.youtube.com/watch?v=OpaiGYxkSuQ
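A minimal dry-run sketch of the --colsep approach. The generate_file_pairs function below is a hypothetical stand-in for the asker's script (assumed to print one space-separated pair per line), and echo is prepended so that nothing is actually executed. The -k flag is only there to keep the output in input order.

```shell
# Hypothetical stand-in for the asker's generator script:
# prints one space-separated pair of file names per line.
generate_file_pairs() {
  printf '%s\n' 'file1 file2' 'file2 file3' 'file3 file4'
}

# Dry run: --colsep ' ' splits every input line on spaces, so {1} and {2}
# become separate arguments. echo prints the command instead of running it,
# and -k keeps the output in the same order as the input.
dry_run_pairs() {
  generate_file_pairs | parallel -k --colsep ' ' echo ./prog {1} {2}
}
```

Calling dry_run_pairs should print ./prog file1 file2, ./prog file2 file3 and ./prog file3 file4 on separate lines. Dropping the echo runs ./prog for real.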

Coaction answered 6/6, 2011 at 21:25 Comment(4)
As I read the initial question, it looks like "generate_file_pairs" would output with the quotation marks. --colsep will not remove the quotation marks, correct? Assuming the quotation marks surround the text, is there a way to trim them with parallel? For example, the following doesn't work: echo '"file1 file2"' | parallel --colsep ' ' ./prog {1} {2} - Mustard
From version 20140722: echo '"file1 file2"' | parallel --colsep ' ' echo '{=1 s/^"//=}-{=2 s/"$//=}' - Coaction
@OleTange Is there some discussion or documentation that covers the default separator behavior? - Bronchopneumonia
The default separator is \n. It separates on newline and nothing else. - Coaction
3

Quite late to the party here, but I bump into this problem fairly often and have found a nice, easy solution.

Before passing the arg list to parallel, just replace all the spaces with newlines. I've found tr to be the fastest for this kind of thing.

Not working

echo "1 2 3 4 5"  | parallel echo --
-- 1 2 3 4 5

Working

echo "1 2 3 4 5" | tr ' ' '\n' | parallel echo --
-- 1
-- 2
-- 3
-- 4
-- 5

Protip: before actually running the parallel command, I do two things to check that the arguments have been split correctly.

  1. Prepend echo to the command that parallel runs. Every command line that would eventually be executed is printed instead, so you can check it first.
  2. Add a marker to the echo (the -- in the examples above) to confirm that parallel is actually splitting the input.

> Note: this works best with small to medium argument lists. If the argument list is very large, it is probably best to just use a for loop to echo each argument to parallel.
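One way to carry the tr trick back to the original two-argument problem, sketched as a dry run. It assumes the pairs arrive as one space-separated stream of words, and it relies on parallel's -N 2 option, which reads two input lines per job and exposes them as {1} and {2}. The helper names and the RUN: marker are made up for illustration.

```shell
# Turn every space into a newline so parallel sees one word per input line.
split_words() {
  tr ' ' '\n'
}

# Dry run with a marker: -N 2 takes two input lines per job and makes them
# available as {1} and {2}; -k keeps the output in input order.
dry_run_from_words() {
  split_words | parallel -k -N 2 echo RUN: ./prog {1} {2}
}

# The splitting step on its own needs nothing but tr:
echo "file1 file2 file3 file4" | split_words
```

Piping the same string into dry_run_from_words should print RUN: ./prog file1 file2 and RUN: ./prog file3 file4, so the words are consumed as consecutive, non-overlapping pairs.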

Kilar answered 24/3, 2021 at 2:13 Comment(1)
Thank you from over 2 years later!! In xargs, I could just do "-n1" to split into newlines on every space, "-nX" to split once every X spaces. For some reason, parallel doesn't work the same in this regard. None of the answers could help me get "hi there" on separate lines, and this is so much better of a solution than scouring man pages to come up with a 20x larger command that will break next time I update bash or parallel. - Sorgo
2

You are looking for parallel's -n option:

./generate_file_pairs | parallel -n 2 ./prog {}

Excerpt from the GNU parallel documentation:

-n max-args
    Use at most max-args arguments per command line. Fewer than max-args 
    arguments will be used if the size (see the -s option) is exceeded, 
    unless the -x option is given, in which case GNU parallel will exit.
Reservoir answered 6/6, 2011 at 18:5 Comment(1)
This won't do the splitting, e.g.: echo hi there | parallel -n 2 echo {2} x {1} => x hi there (there is no {2} in this case). Using --colsep: echo hi there | parallel -n 2 --colsep ' ' echo {2} x {1} => there x hi - Crapulous
2

The GNU parallel manual says:

If no command is given, the line of input is executed ... GNU parallel can often be used as a substitute for xargs or cat | bash.

So try:

generate command | parallel

Try to understand the output of this:

for i in {1..5};do echo "echo $i";done | parallel
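Applied to the question, this suggests turning the generator's output into complete command lines, so that the shell parallel starts for each line does the word splitting. A sketch, with generate_file_pairs as a hypothetical stand-in for the asker's script and echo kept in front of ./prog as a dry run. It only suits file names without spaces or shell metacharacters, since each line is interpreted by a shell.

```shell
# Hypothetical stand-in for the asker's generator script.
generate_file_pairs() {
  printf '%s\n' 'file1 file2' 'file2 file3'
}

# Prefix every pair so that each line is a complete command line.
# Remove the leading "echo " from the sed replacement to run ./prog for real.
build_commands() {
  generate_file_pairs | sed 's|^|echo ./prog |'
}

# Inspect the generated command lines first...
build_commands

# ...then hand them to GNU parallel, which runs each line as a command:
# build_commands | parallel -k
```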
Hitchcock answered 14/12, 2016 at 8:28 Comment(0)
