How to modify a global variable within a function in bash?
K

11

164

I'm working with this:

GNU bash, version 4.1.2(1)-release (x86_64-redhat-linux-gnu)

I have a script like below:

#!/bin/bash

e=2

function test1() {
  e=4
  echo "hello"
}

test1 
echo "$e"

Which returns:

hello
4

But if I assign the result of the function to a variable, the global variable e is not modified:

#!/bin/bash

e=2

function test1() {
  e=4
  echo "hello"
}

ret=$(test1)

echo "$ret"
echo "$e"

Returns:

hello
2

I've heard of the use of eval in this case, so I did this in test1:

eval 'e=4'

But the same result.

Could you explain why it is not modified? How can I save the output of the test1 function in ret and modify the global variable as well?

Kotto answered 9/5, 2014 at 12:44 Comment(1)
Do you need to return hello? You could just echo $e to return it. Or echo everything you want and then parse the result? - Earthshine
L
157

When you use a command substitution (i.e., the $(...) construct), you are creating a subshell. Subshells inherit variables from their parent shells, but this only works one way: A subshell cannot modify the environment of its parent shell.
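As a minimal sketch of this one-way inheritance (the names counter and bump are made up for illustration and do not come from the question):

```bash
counter=1

# Hypothetical helper: increments the global and prints the new value.
bump() {
  counter=$((counter + 1))
  echo "inside: $counter"
}

out=$(bump)        # command substitution: bump runs in a subshell
echo "$out"        # inside: 2
echo "$counter"    # 1, the parent's copy is untouched

bump >/dev/null    # direct call: bump runs in the current shell
echo "$counter"    # 2
```

The function body is identical in both calls; only the way it is invoked decides whether the assignment survives.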

Your variable e is set within a subshell, but not the parent shell. There are two ways to pass values from a subshell to its parent. First, you can output something to stdout, then capture it with a command substitution:

myfunc() {
    echo "Hello"
}

var="$(myfunc)"

echo "$var"

The above outputs:

Hello

For a numerical value in the range of 0 through 255, you can use return to pass the number as the exit status:

mysecondfunc() {
    echo "Hello"
    return 4
}

var="$(mysecondfunc)"
num_var=$?

echo "$var - num is $num_var"

This outputs:

Hello - num is 4
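
One caveat with the exit-status route, shown as a hedged sketch (the function name big is hypothetical): bash keeps only the low 8 bits of the value given to return, so numbers above 255 wrap around instead of arriving intact.

```bash
# Hypothetical function that tries to return a value outside 0..255.
big() {
  return 1000
}

status=0
big || status=$?     # capture the exit status of the call
echo "$status"       # 232 in bash, because 1000 % 256 == 232
```

For anything larger than 255, or for strings, the value has to travel through stdout or a variable instead.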
Liquidambar answered 9/5, 2014 at 12:56 Comment(3)
Thanks for the point, but I have to return a string array, and within the function I have to add elements to two global string arrays. - Kotto
You realise that if you just run the function without assigning it to a variable, all the global variables within it will update. Instead of returning a string array, why not just update the string array in the function and then assign it to another variable after the function has finished? - Earthshine
@JohnDoe: You can't return a "string array" from a function. All you can do is print a string. However, you can do something like this: setarray() { declare -ag "$1=(a b c)"; } - Champaign
L
64

This needs bash 4.1 if you use {fd}, and bash 4.3 if you use local -n.

The rest should work in bash 3.x, I hope. I am not completely sure because of printf %q, which might be a bash 4 feature.

Summary

Your example can be modified as follows to achieve the desired effect:

# Add following 4 lines:
_passback() { while [ 1 -lt $# ]; do printf '%q=%q;' "$1" "${!1}"; shift; done; return $1; }
passback() { _passback "$@" "$?"; }
_capture() { { out="$("${@:2}" 3<&-; "$2_" >&3)"; ret=$?; printf "%q=%q;" "$1" "$out"; } 3>&1; echo "(exit $ret)"; }
capture() { eval "$(_capture "$@")"; }

e=2

# Add following line, called "Annotation"
function test1_() { passback e; }
function test1() {
  e=4
  echo "hello"
}

# Change following line to:
capture ret test1 

echo "$ret"
echo "$e"

prints as desired:

hello
4

Note that this solution:

  • Works for e=1000, too.
  • Preserves $? if you need $?

The only bad side effects are:

  • It needs a modern bash.
  • It forks considerably more often.
  • It needs the annotation (named after your function, with an added _).
  • It sacrifices file descriptor 3.
    • You can change it to another FD if you need that.
      • In _capture just replace all occurrences of 3 with another (higher) number.

The following (which is quite long, sorry for that) hopefully explains how to adapt this recipe to other scripts, too.

The problem

d() { let x++; date +%Y%m%d-%H%M%S; }

x=0
d1=$(d)
d2=$(d)
d3=$(d)
d4=$(d)
echo $x $d1 $d2 $d3 $d4

outputs

0 20171129-123521 20171129-123521 20171129-123521 20171129-123521

while the wanted output is

4 20171129-123521 20171129-123521 20171129-123521 20171129-123521

The cause of the problem

Shell variables (or, generally speaking, the environment) are passed from parent processes to child processes, but not vice versa.

If you capture output, the command usually runs in a subshell, so passing variables back is difficult.

Some even tell you that it is impossible to fix. This is wrong, but it is a long-known problem that is difficult to solve.

There are several ways to solve it; which one is best depends on your needs.

Here is a step-by-step guide on how to do it.

Passing back variables into the parent shell

There is a way to pass variables back to a parent shell. However, this is a dangerous path, because it uses eval. If done improperly, you risk many evil things. But if done properly, this is perfectly safe, provided that there is no bug in bash.

_passback() { while [ 0 -lt $# ]; do printf '%q=%q;' "$1" "${!1}"; shift; done; }

d() { let x++; d=$(date +%Y%m%d-%H%M%S); _passback x d; }

x=0
eval `d`
d1=$d
eval `d`
d2=$d
eval `d`
d3=$d
eval `d`
d4=$d
echo $x $d1 $d2 $d3 $d4

prints

4 20171129-124945 20171129-124945 20171129-124945 20171129-124945

Note that this works for dangerous things, too:

danger() { danger="$*"; _passback danger; }
eval `danger '; /bin/echo *'`
echo "$danger"

prints

; /bin/echo *

This is due to printf '%q', which quotes everything in such a way that you can safely reuse it in a shell context.
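
A small sketch of that quoting round trip, independent of the functions above (the variable names value, copy and assignment are only for illustration):

```bash
# A value full of characters that would be dangerous in a naive eval.
value='; /bin/echo * $(date) "quoted" \ back'

# Build "name=value;" with %q, the same way _passback does.
assignment=$(printf '%q=%q;' copy "$value")

# Evaluating the quoted assignment recreates the string verbatim;
# nothing inside it is executed or expanded.
eval "$assignment"

if [ "$copy" = "$value" ]; then
  echo "round trip ok"
fi
```

The safety rests entirely on every value passing through %q before it reaches eval.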

But this is a pain in the a..

This not only looks ugly, it is also a lot to type, so it is error-prone. Just one single mistake and you are doomed, right?

Well, we are at shell level, so you can improve it. Just think about the interface you want to see, and then implement it.

Augment how the shell processes things

Let's take a step back and think about an API which lets us easily express what we want to do.

Well, what do we want to do with the d() function?

We want to capture the output into a variable. OK, then let's implement an API for exactly this:

# This needs a modern bash 4.3 (check "help declare" to see whether "-n" is
# present; we get rid of it below anyway).
: capture VARIABLE command args..
capture()
{
  local -n output="$1"
  shift
  output="$("$@")"
}

Now, instead of writing

d1=$(d)

we can write

capture d1 d

Well, this looks like we haven't changed much: again, the variables are not passed back from d into the parent shell, and we need to type a bit more.

However, now we can throw the full power of the shell at it, as it is nicely wrapped in a function.

Think about an easy to reuse interface

A second thing is that we want to be DRY (Don't Repeat Yourself). So we definitely do not want to type something like

x=0
capture1 x d1 d
capture1 x d2 d
capture1 x d3 d
capture1 x d4 d
echo $x $d1 $d2 $d3 $d4

The x here is not only redundant, it is also error-prone to always repeat it in the correct context. What if you use it 1000 times in a script and then add a variable? You definitely do not want to alter all 1000 locations where a call to d is involved.

So leave out the x, so we can write:

_passback() { while [ 0 -lt $# ]; do printf '%q=%q;' "$1" "${!1}"; shift; done; }

d() { let x++; output=$(date +%Y%m%d-%H%M%S); _passback output x; }

xcapture() { local -n output="$1"; eval "$("${@:2}")"; }

x=0
xcapture d1 d
xcapture d2 d
xcapture d3 d
xcapture d4 d
echo $x $d1 $d2 $d3 $d4

outputs

4 20171129-132414 20171129-132414 20171129-132414 20171129-132414

This already looks very good. (But there is still the local -n, which does not work in older, still common bash 3.x.)
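
If local -n is the only obstacle, one possible workaround, shown here as a sketch rather than as part of the recipe, is to assign through printf -v, which as far as I know has been available since bash 3.1. The function tick is a hypothetical stand-in for d() that produces predictable output:

```bash
_passback() { while [ 0 -lt $# ]; do printf '%q=%q;' "$1" "${!1}"; shift; done; }

# Hypothetical stand-in for d(): bumps a counter and reports through "output".
tick() { x=$((x + 1)); output="tick $x"; _passback output x; }

# Same idea as xcapture above, but without a nameref:
# eval the passed-back assignments, then copy "output" into the named variable.
xcapture() { eval "$("${@:2}")"; printf -v "$1" '%s' "$output"; }

x=0
xcapture t1 tick
xcapture t2 tick
echo "$x / $t1 / $t2"   # 2 / tick 1 / tick 2
```

This variant still burns the global variable output, just like the nameref version does.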

Avoid changing d()

The last solution has some big flaws:

  • d() needs to be altered
  • It needs to use some internal details of xcapture to pass the output.
    • Note that this shadows (burns) one variable named output, so we can never pass this one back.
  • It needs to cooperate with _passback

Can we get rid of this, too?

Of course, we can! We are in a shell, so there is everything we need to get this done.

If you look a bit closer at the call to eval, you can see that we have 100% control at this location. "Inside" the eval we are in a subshell, so we can do everything we want without fear of doing something bad to the parent shell.

Yeah, nice, so let's add another wrapper, now directly inside the eval:

_passback() { while [ 0 -lt $# ]; do printf '%q=%q;' "$1" "${!1}"; shift; done; }
# !DO NOT USE!
_xcapture() { "${@:2}" > >(printf "%q=%q;" "$1" "$(cat)"); _passback x; }  # !DO NOT USE!
# !DO NOT USE!
xcapture() { eval "$(_xcapture "$@")"; }

d() { let x++; date +%Y%m%d-%H%M%S; }

x=0
xcapture d1 d
xcapture d2 d
xcapture d3 d
xcapture d4 d
echo $x $d1 $d2 $d3 $d4

prints

4 20171129-132414 20171129-132414 20171129-132414 20171129-132414                                                    

However, this again has some major drawbacks:

  • The !DO NOT USE! markers are there because there is a very bad race condition in this, which you cannot see easily:
    • The >(printf ..) is a background job, so it might still be executing while the _passback x is running.
    • You can see this yourself if you add a sleep 1; before printf or _passback. _xcapture a d; echo then outputs x or a first, respectively.
  • The _passback x should not be part of _xcapture, because this makes it difficult to reuse the recipe.
  • Also, we have an unneeded fork here (the $(cat)), but as this solution is !DO NOT USE! I took the shortest route.

However, this shows that we can do it without modifying d() (and without local -n)!

Please note that we do not necessarily need _xcapture at all, as we could have written everything right in the eval.

However, doing this usually isn't very readable. And if you come back to your script in a few years, you probably want to be able to read it again without much trouble.

Fix the race

Now let's fix the race condition.

The trick could be to wait until printf has closed its STDOUT, and then output x.

There are many ways to achieve this:

  • You cannot use shell pipes, because pipes run in different processes.
  • One can use temporary files,
  • or something like a lock file or a fifo, which allows waiting for the lock or fifo,
  • or different channels to output the information, and then assemble the output in the correct sequence.

Following the last path could look like this (note that it does the printf last because this works better here):

_passback() { while [ 0 -lt $# ]; do printf '%q=%q;' "$1" "${!1}"; shift; done; }

_xcapture() { { printf "%q=%q;" "$1" "$("${@:2}" 3<&-; _passback x >&3)"; } 3>&1; }

xcapture() { eval "$(_xcapture "$@")"; }

d() { let x++; date +%Y%m%d-%H%M%S; }

x=0
xcapture d1 d
xcapture d2 d
xcapture d3 d
xcapture d4 d
echo $x $d1 $d2 $d3 $d4

outputs

4 20171129-144845 20171129-144845 20171129-144845 20171129-144845

Why is this correct?

  • _passback x directly talks to STDOUT.
  • However, as STDOUT needs to be captured in the inner command, we first "save" it into FD3 (you can use others, of course) with 3>&1 and then reuse it with >&3.
  • The $("${@:2}" 3<&-; _passback x >&3) finishes after the _passback, when the subshell closes STDOUT.
  • So the printf cannot happen before the _passback, regardless of how long _passback takes.
  • Note that the printf command is not executed before the complete command line is assembled, so we cannot see artefacts from printf, regardless of how printf is implemented.

Hence first _passback executes, then the printf.

This resolves the race, sacrificing one fixed file descriptor 3. You can, of course, choose another file descriptor if FD3 is not free in your shell script.

Please also note the 3<&-, which prevents FD3 from being passed to the function.
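
The descriptor trick can be looked at in isolation. The following is only an illustrative sketch (demo, captured and result are names invented here): it saves the outer stdout in FD 3 and then writes to it from inside a command substitution, so one line bypasses the capture while the other is captured.

```bash
demo() {
  local captured
  {
    # Inside $(...), stdout is the capture pipe; FD 3 is still the outer stdout.
    captured=$(echo "to capture"; echo "side channel" >&3)
  } 3>&1
  echo "captured=$captured"
}

result=$(demo)
printf '%s\n' "$result"
# side channel
# captured=to capture
```

The line sent to FD 3 always appears first, because the command substitution has to finish before the final echo can run.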

Make it more generic

_xcapture contains parts which belong to d(), which is bad from a reusability perspective. How to solve this?

Well, do it the desperate way by introducing one more thing: an additional function, named after the original function with _ appended, which must return the right things.

This function is called after the real function and can augment things. This way, it can be read as a kind of annotation, so it is very readable:

_passback() { while [ 0 -lt $# ]; do printf '%q=%q;' "$1" "${!1}"; shift; done; }
_capture() { { printf "%q=%q;" "$1" "$("${@:2}" 3<&-; "$2_" >&3)"; } 3>&1; }
capture() { eval "$(_capture "$@")"; }

d_() { _passback x; }
d() { let x++; date +%Y%m%d-%H%M%S; }

x=0
capture d1 d
capture d2 d
capture d3 d
capture d4 d
echo $x $d1 $d2 $d3 $d4

still prints

4 20171129-151954 20171129-151954 20171129-151954 20171129-151954

Allow access to the return-code

There is only one bit missing:

v=$(fn) sets $? to what fn returned. So you probably want this, too. It needs some bigger tweaking, though:

# This is all the interface you need.
# Remember, that this burns FD=3!
_passback() { while [ 1 -lt $# ]; do printf '%q=%q;' "$1" "${!1}"; shift; done; return $1; }
passback() { _passback "$@" "$?"; }
_capture() { { out="$("${@:2}" 3<&-; "$2_" >&3)"; ret=$?; printf "%q=%q;" "$1" "$out"; } 3>&1; echo "(exit $ret)"; }
capture() { eval "$(_capture "$@")"; }

# Here is your function, annotated with the side effects it has.
fails_() { passback x y; }
fails() { x=$1; y=69; echo FAIL; return 23; }

# And now the code which uses it all
x=0
y=0
capture wtf fails 42
echo $? $x $y $wtf

prints

23 42 69 FAIL

There is still a lot of room for improvement

  • _passback() can be eliminated with passback() { set -- "$@" "$?"; while [ 1 -lt $# ]; do printf '%q=%q;' "$1" "${!1}"; shift; done; return $1; }

  • _capture() can be eliminated with capture() { eval "$({ out="$("${@:2}" 3<&-; "$2_" >&3)"; ret=$?; printf "%q=%q;" "$1" "$out"; } 3>&1; echo "(exit $ret)")"; }

  • The solution pollutes a file descriptor (here 3) by using it internally. You need to keep that in mind if you happen to pass FDs.
    Note that bash 4.1 and above has {fd} to use some unused FD.
    (Perhaps I will add a solution here when I get around to it.)
    Note that this is why I put it in separate functions like _capture: stuffing all of this into one line is possible, but makes it increasingly hard to read and understand.

  • Perhaps you want to capture STDERR of the called function, too. Or you even want to pass more than one file descriptor in and out, from and to variables.
    I have no solution yet; however, here is a way to catch more than one FD, so we can probably pass back the variables this way, too.
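
For the {fd} idea mentioned in the list above, one possible shape is sketched below. This is an assumption about how it could look, not the author's solution: it needs bash 4.1 or later, lets bash pick a free descriptor instead of hard-coding 3, and unlike the original it does not close that descriptor for the called function.

```bash
_passback() { while [ 1 -lt $# ]; do printf '%q=%q;' "$1" "${!1}"; shift; done; return $1; }
passback() { _passback "$@" "$?"; }

# Variant of _capture that lets bash allocate the descriptor (bash 4.1+).
_capture() {
  local fd out ret
  exec {fd}>&1                       # fd now holds a free descriptor number
  out=$("${@:2}"; "$2_" >&"$fd")     # output is captured, the annotation writes to fd
  ret=$?
  printf '%q=%q;' "$1" "$out"
  exec {fd}>&-                       # close the descriptor again
  echo "(exit $ret)"
}
capture() { eval "$(_capture "$@")"; }

fails_() { passback x y; }
fails() { x=$1; y=69; echo FAIL; return 23; }

x=0; y=0; status=0
capture wtf fails 42 || status=$?
echo "$status $x $y $wtf"   # 23 42 69 FAIL
```

Because _capture already runs inside a command substitution, the exec only affects that subshell and the caller's descriptors stay as they were.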

Also do not forget:

This must call a shell function, not an external command.

There is no easy way to pass environment variables out of external commands. (With LD_PRELOAD= it should be possible, though!) But this then is something completely different.

Last words

This is not the only possible solution. It is one example of a solution.

As always you have many ways to express things in the shell. So feel free to improve and find something better.

The solution presented here is quite far from being perfect:

  • It has barely been tested, so please forgive typos.
  • There is a lot of room for improvement, see above.
  • It uses many features from modern bash, so it is probably hard to port to other shells.
  • And there might be some quirks I haven't thought about.

However I think it is quite easy to use:

  • Add just 4 lines of "library".
  • Add just 1 line of "annotation" for your shell function.
  • Sacrifices just one file descriptor temporarily.
  • And each step should be easy to understand even years later.
Lichtenfeld answered 29/11, 2017 at 15:27 Comment(2)
you are awesome - Flybynight
never in my life have I seen such an extensive reply taken from so many angles. I bow to you @Lichtenfeld - Forest
W
16

Maybe you can use a file: write to the file inside the function, and read from the file after it. I have changed e to an array. In this example blanks are used as the separator when reading back the array.

#!/bin/bash

declare -a e
e[0]="first"
e[1]="secondddd"

function test1 () {
 e[2]="third"
 e[1]="second"
 echo "${e[@]}" > /tmp/tempout
 echo hi
}

ret=$(test1)

echo "$ret"

read -r -a e < /tmp/tempout
echo "${e[@]}"
echo "${e[0]}"
echo "${e[1]}"
echo "${e[2]}"

Output:

hi
first second third
first
second
third
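
Splitting on blanks breaks as soon as an element itself contains a space. A possible variation, sketched here under the assumption that a file from mktemp is acceptable, writes the elements with printf '%q' and rebuilds the array with eval (state_file is a name chosen for this example):

```bash
state_file=$(mktemp)
e=("first item" "secondddd")

test1() {
  e[1]="second item"
  e[2]="third"
  # Quote every element so spaces and globs survive (assumes a non-empty array).
  printf '%q ' "${e[@]}" > "$state_file"
  echo hi
}

ret=$(test1)

# Rebuild the array in the parent shell from the quoted words.
eval "e=( $(cat "$state_file") )"
rm -f -- "$state_file"

echo "$ret"          # hi
echo "${#e[@]}"      # 3
echo "${e[1]}"       # second item
```

The eval is only as safe as the file: anyone who can write to it can inject commands, which is why the sketch uses a private mktemp file rather than a fixed path.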
Worldweary answered 9/5, 2014 at 14:34 Comment(1)
Wow, a lot of info. I get: passback: command not found. :( - Monomorphic
V
15

You are executing test1 with

$(test1)

in a subshell (child shell), and child shells cannot modify anything in the parent.

You can find this in the bash manual.

Please check: things that result in a subshell.

Viborg answered 9/5, 2014 at 12:58 Comment(0)
P
12

I had a similar problem when I wanted to remove temporary files I had created automatically. The solution I came up with was not to use command substitution, but rather to pass the name of the variable that should receive the final result into the function. For example:

#!/usr/bin/env bash

# array that keeps track of tmp-files
remove_later=()

# function that manages tmp-files
new_tmp_file() {
  file=$(mktemp)
  remove_later+=( "$file" )
  # assign value (safe form of `eval "$1=$file"`); the '%s' keeps $file
  # from being interpreted as a printf format string
  printf -v "$1" -- '%s' "$file"
}

# function to remove all tmp-files
remove_tmp_files() { rm -- "${remove_later[@]}"; }

# define trap to remove all tmp-files upon EXIT
trap remove_tmp_files EXIT

# generate tmp-files
new_tmp_file tmpfile1
new_tmp_file tmpfile2

So, adapting this to the OP, it would be:

#!/usr/bin/env bash
    
e=2
    
function test1() {
  e=4
  printf -v "$1" -- '%s' "hello"
}
    
test1 ret
    
echo "$ret"
echo "$e"

Works and has no restrictions on the "return value".

Papacy answered 15/1, 2015 at 21:47 Comment(3)
This is an underappreciated solution. - Cacilie
A couple of modifications I would implement: (1) avoid the usage of eval and use printf -v "$1" -- "%s" "$file" instead. (2) define remove_later to be an array declare -a remove_later and use it accordingly. - Cacilie
Thank you very much @Cacilie! I fully agree with your proposals, especially the array solution, which is much cleaner. Do you want to make the changes? I don't get around to it currently... - Papacy
E
4

Assuming that local -n is available (bash 4.3 or later), the following script lets the function test1 modify a global variable:

#!/bin/bash

e=2

function test1() {
  local -n var=$1
  var=4
  echo "hello"
}

test1 e
echo "$e"

Which gives the following output:

hello
4
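
The same mechanism can cover the case from the comments where a function should hand back an array and also append to a global array. This is only a sketch: the array name seen and its contents are invented for the example, and it assumes bash 4.3 or later and that the caller's variable is not itself called result.

```bash
seen=()                      # global array the function appends to

test1() {
  local -n result=$1         # nameref to the caller's variable
  seen+=("call $(( ${#seen[@]} + 1 ))")
  result=("alpha" "beta gamma")
}

test1 ret                    # no command substitution, so no subshell
echo "${#ret[@]}"            # 2
echo "${ret[1]}"             # beta gamma
echo "${seen[0]}"            # call 1
```

Since nothing is captured from stdout, the function stays in the current shell and both arrays keep their new contents.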
Economic answered 11/3, 2021 at 8:31 Comment(0)
R
2

I'm not sure if this works on your terminal, but I found out that if the function doesn't print anything, so that you can call it directly instead of inside $(...), it behaves like a void function and can make global variable changes. Here's the code I used:

let ran1=$(( ((1<<63)-1)/3 ))
let ran2=$(( ((1<<63)-1)/5 ))
let c=0
function randomize {
    c=$(( ran1+ran2 ))
    ran2=$ran1
    ran1=$c
    c=$(( c > 0 ))
}

It's a simple randomizer for games that effectively modifies the needed variables.

Refractive answered 3/11, 2021 at 13:9 Comment(0)
O
1

It's because command substitution is performed in a subshell, so while the subshell inherits the variables, changes to them are lost when the subshell ends.

Reference:

Command substitution, commands grouped with parentheses, and asynchronous commands are invoked in a subshell environment that is a duplicate of the shell environment
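
To make the quoted rule concrete, here is a short sketch contrasting a parenthesised group, which runs in a subshell, with a brace group, which runs in the current shell (the variable names are illustrative only):

```bash
n=0

( n=1 )                 # parentheses: the assignment happens in a subshell
after_parens=$n         # still 0

{ n=2; }                # braces: the assignment happens in the current shell
after_braces=$n         # 2

echo "$after_parens $after_braces"   # 0 2
```

Command substitution behaves like the parenthesised case, which is why e keeps its old value in the question.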

Overtrade answered 9/5, 2014 at 12:51 Comment(2)
@JohnDoe I'm not sure it's possible. You might have to rethink your design of the script. - Overtrade
Oh, but I need to assign a global array within a function; if not, I'd have to repeat a lot of code (repeat the code of the function, 30 lines, 15 times, once per call). There is no other way, is there? - Kotto
O
1

A solution to this problem, without having to introduce complex functions and heavily modify the original one, is to store the value in a temporary file and read / write it when needed.

This approach helped me greatly when I had to mock a bash function called multiple times in a bats test case.

For example, you could have:

# Usage: read_value path_to_tmp_file
function read_value {
  cat "${1}"
}

# Usage: set_value path_to_tmp_file the_value
function set_value {
  echo "${2}" > "${1}"
}
#----

# Original code:

function test1() {
  e=4
  set_value "${tmp_file}" "${e}"
  echo "hello"
}


# Create the temp file
# Note that tmp_file is available in test1 as well
tmp_file=$(mktemp)

# Your logic
e=2
# Store the value
set_value "${tmp_file}" "${e}"

# Run test1
test1

# Read the value modified by test1
e=$(read_value "${tmp_file}")
echo "$e"

The drawback is that you might need multiple temp files for different variables. You might also need to issue a sync command to persist the contents to disk between a write and a read operation.

Oscitancy answered 23/4, 2020 at 13:15 Comment(0)
I
0

Rather than fight with bash to have one function modify a global variable and also return another value, make two functions: one for modifying the global variable and another to return a value.

#!/bin/bash

e=2

function test1() {
  echo "hello"
}

function test2() {
  e2=4
  echo $e2
}

ret=$(test1)
e=$(test2)

echo "$ret"
echo "$e"
Irritability answered 21/9, 2023 at 13:41 Comment(0)
V
-1

You can always use an alias:

alias next='printf "blah_%02d" $count;count=$((count+1))'
Vuillard answered 27/5, 2017 at 16:5 Comment(0)
