Local variables in bash: local vs subshell

As far as I know there are two ways to create local variables in a bash function: create a subshell or declare every variable as local.

For example:

# using local
function foo
{
  local count
  for count in $(seq 10)
  do
    echo $count
  done
}

or

# using subshell
function foo
{
  (
    for count in $(seq 10)
    do
      echo $count
    done
  )
}

Obviously, the subshell version is simpler to write because you don't have to declare every variable local (not to mention the (environment) variables created or exported by tools like getopts). But I imagine that creating a subshell has some overhead.

So what is the better approach? What are the pros/cons?

Samovar answered 7/1, 2011 at 12:10 Comment(1)
"But I could imagine that creating a subshell has an overhead." Run the time command over 1000 test runs and measure the overhead; I think it's small to nonexistent. - Isfahan

Creating a sub-shell involves a fork(), so it definitely has overhead compared with a local variable. While sub-shells are cheap — you don't worry about their cost when you need one — they are not free.
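One way to see the fork is to compare process IDs. This is only a sketch: it assumes bash 4 or later (for BASHPID), and the function names and the temp file are made up for illustration.

```bash
#!/usr/bin/env bash
# Illustrative sketch; requires bash 4+ for BASHPID.

pid_in_place()    { echo "$BASHPID"; }   # body runs in the calling shell
pid_in_subshell() ( echo "$BASHPID" )    # body runs in a forked child

# Write to a temp file instead of using $(...), because command
# substitution would itself fork and hide the difference.
tmp=$(mktemp)
pid_in_place    > "$tmp"; read -r place_pid < "$tmp"
pid_in_subshell > "$tmp"; read -r sub_pid   < "$tmp"
rm -f "$tmp"

echo "plain function ran in PID:    $place_pid"
echo "subshell function ran in PID: $sub_pid"
```

The two PIDs should differ, since the subshell function body runs in a separate process.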

If your script is going to be heavily used and performance really matters (so you'll have hundreds of users all running it at the same time, many times a day), then you might worry about the performance cost of the sub-shell. OTOH, if you run it once a month and the script as a whole runs for under 10 seconds, you probably wouldn't.
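If you want numbers for your own machine, a rough micro-benchmark along these lines may help. It is a sketch, not a rigorous measurement: the function names and the iteration count are arbitrary choices, and the results will vary by system.

```bash
#!/usr/bin/env bash
# Hypothetical micro-benchmark; names and iteration count are illustrative.

with_local() {
  local count
  for count in 1 2 3; do :; done
}

with_subshell() (
  for count in 1 2 3; do :; done
)

iterations=1000

echo "local:"
time for ((i = 0; i < iterations; i++)); do with_local; done

echo "subshell:"
time for ((i = 0; i < iterations; i++)); do with_subshell; done
```

The timings are printed by bash's time keyword on stderr. Neither variant leaves count set in the calling shell afterwards.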

However, in terms of clarity, it is much better to be explicit and declare the variables — it reduces the risk of the script breaking because someone comes along and says "this sub-shell clearly isn't needed" (and it really isn't; I'd want to remove the sub-shells from your functions).

Look at the evolution of Perl scripts. They started off as a free-for-all with variables coming into existence on demand. They have gradually become more rigorous, with normal style now being to predeclare all variables. To some extent, shells have followed a similar path — but not as rigorously as Perl. Awk is also an interesting case study; its functions use global variables unless they are arguments to the function, which leads to functions being written with 3 active arguments (say) and 5 inactive arguments that effectively define local variables. It is slightly eccentric, though it 'works'.

Championship answered 7/1, 2011 at 15:41 Comment(0)

Making sure that every function always declares all of its variables as local is quite difficult.

I find this very error-prone and prefer to always use subshell functions:

f() (
 echo "we are a subshell"
)

No need to declare local variables - but also no way to change global variables. Which is GOOD in my opinion!
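A small sketch of that isolation, using made-up names (counter, bump_forgot_local, bump_in_subshell):

```bash
#!/usr/bin/env bash
# Hypothetical names, for illustration only.

counter=0

bump_forgot_local() {
  counter=$((counter + 1))     # no 'local': modifies the global
}

bump_in_subshell() (
  counter=$((counter + 100))   # modifies only the subshell's copy
  echo "inside subshell: $counter"
)

bump_forgot_local
echo "after plain function: $counter"      # prints 1

bump_in_subshell                           # prints "inside subshell: 101"
echo "after subshell function: $counter"   # still prints 1
```

Because the subshell cannot write back to the caller's variables, any result has to leave through stdout (for example result=$(f)) or through the exit status.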

One additional consequence is that you always need to check the return/exit code of such functions and act accordingly, because you cannot exit your script from within a subshell function!

f() (
   echo "Trying to exit"
   exit 1
)

f
echo "Did not exit"

This will NOT exit your script. You need to do it this way:

f() (
   echo "Trying to exit"
   exit 1
)

f || exit $?
echo "Did not exit"

This will exit the script.

Survey answered 2/6, 2016 at 11:10 Comment(1)
You can avoid checking every return/exit code by using set -e. Bash will then exit your script when a command fails, including when a subshell returns a non-zero exit code. - Snake
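A sketch of the set -e behaviour described in that comment. The inner script is run in a child bash (an arrangement chosen only for this example) so that the exit can be observed from outside:

```bash
#!/usr/bin/env bash
# Runs a small script in a child bash so this shell survives the exit.

status=0
output=$(bash -c '
  set -e
  f() (
    echo "Trying to exit"
    exit 1
  )
  f
  echo "Did not exit"
') || status=$?

echo "output: $output"   # prints "output: Trying to exit"
echo "status: $status"   # prints "status: 1"
```

Note that set -e is ignored for commands tested by if or while, or on the left-hand side of && or ||, so a call like f || true would not abort the script.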
