Yesterday it was suggested to me that using command substitution in bash causes an unnecessary subshell to be spawned. The advice was specific to this use case:
# Extra subshell spawned
foo=$(command; echo $?)
# No extra subshell
command
foo=$?
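To make the comparison concrete, here is a minimal sketch of both forms. It uses `false` as a stand-in for `command`, and the variable names (`status_direct`, `status_subst`, `counter`, `inner`) are only illustrative. It also shows one observable effect of the substitution's subshell environment: assignments made inside `$(...)` do not reach the calling shell.

```shell
# Form without substitution: run the command, then read $? directly.
false
status_direct=$?

# Form with substitution: the status is printed inside $(...) and captured as text.
status_subst=$(false; echo $?)

echo "direct=$status_direct subst=$status_subst"

# An assignment inside the substitution happens in a subshell environment,
# so the outer variable keeps its value.
counter=1
inner=$(counter=2; echo "$counter")
echo "inner=$inner outer=$counter"
```

Both forms yield the same status here; the difference is only in how the value is obtained.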
As best I can figure, this appears to be correct for this use case. However, a quick search to verify it turns up reams of confusing and contradictory advice. Popular wisdom seems to say that ALL uses of command substitution spawn a subshell. For example:
The command substitution expands to the output of commands. These commands are executed in a subshell, and their stdout data is what the substitution syntax expands to. (source)
This seems simple enough unless you keep digging, in which case you'll start finding claims that this is not the case.
Command substitution does not necessarily invoke a subshell, and in most cases won't. The only thing it guarantees is out-of-order evaluation: it simply evaluates the expressions inside the substitution first, then evaluates the surrounding statement using the results of the substitution. (source)
This seems reasonable, but is it true? This answer to a subshell-related question tipped me off that man bash has this to note:
Each command in a pipeline is executed as a separate process (i.e., in a subshell).
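The effect described in that sentence can be observed without any tracing tools. A minimal sketch, assuming a POSIX-style shell (the variable name `seen` is just illustrative): an assignment made in the first element of a pipeline is lost once the pipeline finishes, because that element ran in its own subshell environment.

```shell
seen=no

# The brace group is the first element of a pipeline, so it runs in a subshell;
# the assignment to "seen" happens there, not in the calling shell.
{ seen=yes; echo "from the pipeline"; } | cat >/dev/null

echo "seen=$seen"
```

The first element is used deliberately: some shells run the last element of a pipeline in the current shell, so a test on the last element would be less portable.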
This brings me to the main question. What, exactly, will cause command substitution to spawn a subshell that would not have been spawned anyway to execute the same commands in isolation?
Please consider the following cases and explain which ones incur the overhead of an extra subshell:
# Case #1
command1
var=$(command1)
# Case #2
command1 | command2
var=$(command1 | command2)
# Case #3
command1 | command2 ; var=$?
var=$(command1 | command2 ; echo $?)
Does each of these pairs incur the same number of subshells to execute? Is there a difference between POSIX and bash implementations? Are there other cases where using command substitution would spawn a subshell where running the same set of commands in isolation would not?
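One way to probe cases like these is sketched below. It assumes bash 4.0 or later, where `BASHPID` reports the process actually running the code (unlike `$$`, which keeps the original shell's PID inside subshells). The variable names are illustrative. Note that this only shows whether a construct ran in a different process from the main shell; it does not count forks, for which a system call tracer such as strace on Linux would be needed.

```shell
# BASHPID is bash-specific (4.0+); skip the probe in other shells.
if [ -n "${BASHPID:-}" ]; then
    main_pid=$BASHPID

    # Simple command inside a substitution (compare Case #1).
    subst_pid=$(echo "$BASHPID")

    # Pipeline inside a substitution (compare Case #2); the PID printed
    # is that of the process running the first pipeline element.
    pipe_pid=$(echo "$BASHPID" | cat)

    echo "main shell:              $main_pid"
    echo "inside \$(cmd):           $subst_pid"
    echo "inside \$(cmd | cat):     $pipe_pid"
else
    echo "BASHPID is not available in this shell; nothing to probe."
fi
```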
bash is implemented. However, I would note that subshell != process; a subshell (in the sense of a new scope for variables) is not required to spawn a new process to run it. (This is the third point made in the accepted answer to your linked question.) – Monocarpic
( ... ) explicitly creates a sub-shell; the commands within the parentheses must be executed in a sub-shell (meaning that any changes made to variables etc. must not affect the main shell). That used to be done by forking and letting the child execute the contents of the sub-shell script while the parent waits for it to complete. A shell might avoid that if it has good enough scoping abilities. – Courtly
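The scoping guarantee for `( ... )` described in that comment can be checked with a short sketch (the variable name `setting` is illustrative). It demonstrates only the isolation of variables, not whether a given shell forks a process to achieve it.

```shell
setting=original

# Changes made inside the parentheses must not affect the main shell.
( setting=changed; echo "inside:  $setting" )

echo "outside: $setting"
```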