Bash: Logging stdout from multiple xargs parallel processes to separate log files

I am processing a text file with multiple parallel processes spawned by xargs. I also need to capture the stdout from each process into a separate log file. Below is an example where the output from each process is interleaved into a single file -- not what I want.

Ideally, each logfile should be numbered by the file line number, that is, logfile-1, logfile-2, etc.

cat inputfile.txt | xargs -n 1 -P 8 ./myScript.sh | tee logfile

It would be nice to avoid an external wrapper script if possible, but if there is a way to wrap myScript with a here document, that would work.

Aholla answered 2/10, 2014 at 18:29 Comment(1)
Inside myScript.sh do an exec > logfile-$$ or some such? Basically the script controls its logging rather than xargs attempting it. - Decanter
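
The suggestion in this comment could look roughly like the sketch below. The script body, the sample input, and the temporary directory are hypothetical stand-ins, not the asker's real files. The script is run through `sh` so the demo does not depend on execute permissions. Note that `$$` is the process ID, so the logs are named by PID rather than by input line number.

```shell
#!/bin/sh
# Hypothetical demo directory and sample input (assumptions for illustration).
workdir=$(mktemp -d)
printf '%s\n' alpha beta gamma > "$workdir/inputfile.txt"

# Hypothetical stand-in for myScript.sh: it redirects its own stdout,
# so xargs does not need to know anything about logging.
cat > "$workdir/myScript.sh" <<'EOF'
exec > "logfile-$$"      # from here on, stdout goes to logfile-<PID>
echo "processing $1"
EOF

# One process per input line, up to 8 at a time; each writes its own log.
( cd "$workdir" && xargs -n 1 -P 8 sh ./myScript.sh < inputfile.txt )

ls "$workdir"
```

This keeps the xargs command line unchanged, at the cost of file names that do not map back to line numbers.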

Try this:

nl inputfile.txt | xargs -n 2 -P 8 sh -c './myScript.sh "$1" > logfile-$0'

This assumes each argument in inputfile.txt is on its own line and contains no spaces. The nl command numbers each line, which pairs each argument with a unique number. The xargs command takes two arguments at a time, first the line number and then the corresponding line from inputfile.txt, and passes them to sh. Because they follow the sh -c command string, sh sees the line number as $0 and the line as $1, and uses them to generate the output file name and the argument to myScript.sh respectively.
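
To see the pairing in action, here is a small self-contained sketch. The sample input and the stand-in for myScript.sh are assumptions made up for the demo, and the script is invoked through `sh` so it runs without execute permissions.

```shell
#!/bin/sh
# Hypothetical demo directory and sample input (assumptions for illustration).
workdir=$(mktemp -d)
printf '%s\n' alpha beta gamma > "$workdir/inputfile.txt"

# Hypothetical stand-in for myScript.sh: prints one line per argument.
cat > "$workdir/myScript.sh" <<'EOF'
echo "processing $1"
EOF

(
    cd "$workdir" || exit 1
    # nl emits "<number> <line>"; xargs -n 2 hands each pair to sh -c,
    # where the number becomes $0 and the line becomes $1.
    nl inputfile.txt |
        xargs -n 2 -P 8 sh -c 'sh ./myScript.sh "$1" > "logfile-$0"'
)

# logfile-2 should hold the output for the second input line.
cat "$workdir/logfile-2"
```

By default nl numbers only non-empty lines, which matches xargs skipping blank lines, so the number and the argument stay paired.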

Hanaper answered 2/10, 2014 at 21:42 Comment(3)
"$1", rather than bare $1, but very much the right idea. - Allometry
It probably doesn't make a difference since xargs splits arguments at spaces, but I suppose inputfile.txt might have quoted arguments. - Hanaper
@Ross, Clever solution. I also want the output on screen in addition to the log files. It seems to work with a small change: nl inputfile.txt | xargs -n 2 -P 8 sh -c './myScript.sh "$1" | tee logfile-$0' - Aholla

You could use GNU Parallel instead and its -k option to keep the output in order, in a single log file:

cat input | parallel -k ./myScript.sh > file.log

You can add -j 8 after parallel to run 8 jobs at a time, but by default it already runs one job per CPU core.
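
If separate log files are wanted, as in the question, GNU Parallel's `{#}` replacement string (the job sequence number) can go into the file name. The following is a sketch that is not part of the original answer. It assumes GNU Parallel is installed, and the sample input and stand-in script are hypothetical.

```shell
#!/bin/sh
# Hypothetical demo directory and sample input (assumptions for illustration).
workdir=$(mktemp -d)
printf '%s\n' alpha beta gamma > "$workdir/inputfile.txt"

# Hypothetical stand-in for myScript.sh.
cat > "$workdir/myScript.sh" <<'EOF'
echo "processing $1"
EOF

# Only run when the GNU implementation is available; other tools named
# "parallel" use a different syntax.
if parallel --version 2>/dev/null | grep -q 'GNU parallel'; then
    # {} is the input line, {#} the job number (1 for the first line, ...).
    ( cd "$workdir" &&
      parallel -j 8 'sh ./myScript.sh {} > logfile-{#}' < inputfile.txt )
    ls "$workdir"
else
    echo "GNU Parallel not found; skipping demo" >&2
fi
```

Jobs are numbered in input order, so logfile-N corresponds to the Nth input line regardless of which job finishes first.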

Vivianne answered 2/10, 2014 at 21:50 Comment(3)
Have you looked at the source to GNU parallel? Makes rat nests look like models of good organization. - Allometry
@CharlesDuffy I have actually and I agree it is tough to read, but my experience is that it works a treat every time I use it. I also find it pretty hard reading the Linux kernel too... :-) - Vivianne
@CharlesDuffy Improvements are always welcome, as long as it does not break current functionality. - Anglian
