Edit 2023: added special characters and mapfile.
...
Some Bash tricks I use to set variables from commands
Sorry, this is a long answer. But as Bash is a shell whose main goal is to run other Unix commands and react to their result code and/or output (commands are often piped, filtered, etc.), storing command output in variables is something basic and fundamental.
Therefore, depending on
- compatibility (POSIX)
- kind of output (filter(s))
- number of variables to set (split or interpret)
- execution time (monitoring)
- error trapping
- repeatability of requests (see long-running background processes, further down)
- interactivity (considering user input while reading from another input file descriptor)
- parallelism (considering many inputs simultaneously, even interactively)
- handling of special characters (new in 2023)
- handling multiline fields in CSV files
- having to compute stats, rates, sums, or more, while reading data
- having to track/retrieve handles, then search for them later in the same stream (SMTP mail server logs)
- did I miss something?
You could look at the showCert
function, a complex sample parsing openssl
output to build one associative array (for the parsed SUBJECT
field) and one standard array (for the parsed alternative names),
storing dates as UNIX EPOCH (using a single fork to the date
command to convert both dates together), in How to determine SSL cert expiration date from a PEM certificate?
First: the simple, old (obsolete), and compatible way
myPi=`echo '4*a(1)' | bc -l`
echo $myPi
3.14159265358979323844
Compatible, second way
As nesting could become heavy, parentheses were implemented for this:
myPi=$(bc -l <<<'4*a(1)')
Using backticks in scripts is to be avoided today.
Nested sample:
SysStarted=$(date -d "$(ps ho lstart 1)" +%s)
echo $SysStarted
1480656334
bash features
Reading more than one variable (with Bashisms)
df -k /
Filesystem 1K-blocks Used Available Use% Mounted on
/dev/dm-0 999320 529020 401488 57% /
If I just want the used value:
array=($(df -k /))
you would see an array variable:
declare -p array
declare -a array='([0]="Filesystem" [1]="1K-blocks" [2]="Used" [3]="Available" [4]="Use%" [5]="Mounted" [6]="on" [7]="/dev/dm-0" [8]="999320" [9]="529020" [10]="401488" [11]="57%" [12]="/")'
Then:
echo ${array[9]}
529020
But I often use this:
{ read -r _;read -r filesystem size using avail prct mountpoint ; } < <(df -k /)
echo $using
529020
(The first read -r _
will just drop the header line.) Here, in only one command, you will populate 6 different variables (shown in alphabetical order):
declare -p avail filesystem mountpoint prct size using
declare -- avail="401488"
declare -- filesystem="/dev/dm-0"
declare -- mountpoint="/"
declare -- prct="57%"
declare -- size="999320"
declare -- using="529020"
Or
{ read -a head;varnames=(${head[@]//[K1% -]});
read ${varnames[@],,} ; } < <(LANG=C df -k /)
Then:
declare -p varnames ${varnames[@],,}
declare -a varnames=([0]="Filesystem" [1]="blocks" [2]="Used" [3]="Available" [4]="Use" [5]="Mounted" [6]="on")
declare -- filesystem="/dev/dm-0"
declare -- blocks="999320"
declare -- used="529020"
declare -- available="401488"
declare -- use="57%"
declare -- mounted="/"
declare -- on=""
Or even:
{ read _ ; read filesystem dsk[{6,2,9}] prct mountpoint ; } < <(df -k /)
declare -p mountpoint dsk
declare -- mountpoint="/"
declare -a dsk=([2]="529020" [6]="999320" [9]="401488")
(Note that Used and Blocks are switched there: read ... dsk[6] dsk[2] dsk[9] ...)
... This will work with associative arrays too: read _ disk[total] disk[used] ...
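For example, a minimal sketch based on the same df -k / output (the associative array name disk and the keys total, used, avail and mountpoint are arbitrary):
declare -A disk='()'
{ read -r _ ; read -r _ disk[total] disk[used] disk[avail] _ disk[mountpoint] ; } < <(df -k /)
declare -p disk
declare -A disk=([mountpoint]="/" [avail]="401488" [used]="529020" [total]="999320" )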
More complex sample, parsing free
output:
getFree() {
local -a hline sline;
{
read -ra hline;
sline=("${hline[@]::3}");
sline=("${sline[@]^}");
hline=("${hline[@]/%/]}") sline=("${sline[@]/%/]}");
read -r _ "${hline[@]/#/memInfos[}";
read -r _ "${sline[@]/#/memInfos[swap}"
} < <(LANG=C free --wide --kilo)
}
declare -A memInfos='()'
getFree
Then
declare -p memInfos
declare -A memInfos=([swapTotal]="104853" [cache]="246161" [free]="32518" [shared]="925" [available]="238936" [used]="88186" [total]="386928" [swapFree]="78639" [buffers]="20062" [swapUsed]="26214" )
So
for var in total used free shared buffers cache available; do
case $var in
tot*|use*|fre*) sval=${memInfos[swap${var^}]} ;;
*) sval='' ;;
esac
printf ' - %-12s %12s %12s\n' "$var" "${memInfos[$var]}" "$sval"
done
could produce something like:
- total 386928 104853
- used 88186 26214
- free 32518 78639
- shared 925
- buffers 20062
- cache 246161
- available 238936
Other related samples parsing xrandr
output: End of Firefox tab by bash in a size of x% of display size? or, at AskUbuntu.com, Parsing xrandr
output
Dedicated fd
using an unnamed fifo:
There is an elegant way! In this sample, I will read the /etc/passwd
file:
users=()
while IFS=: read -u $list user pass uid gid name home bin ;do
((uid>=500)) &&
printf -v users[uid] "%11d %7d %-20s %s\n" $uid $gid $user $home
done {list}</etc/passwd
Using this way (... read -u $list; ... {list}<inputfile
) leaves STDIN
free for other purposes, like user interaction (see the sketch after the output below).
Then
echo -n "${users[@]}"
1000 1000 user /home/user
...
65534 65534 nobody /nonexistent
and
echo ${!users[@]}
1000 ... 65534
echo -n "${users[1000]}"
1000 1000 user /home/user
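To illustrate STDIN staying free, here is a minimal sketch mixing the dedicated fd with interactive input (the prompt wording and the answer variable are only illustrative):
while IFS=: read -u $list user _ uid _ _ home _ ;do
    ((uid>=1000)) || continue
    read -rp "Show home of $user? (y/n) " answer    # this read uses STDIN (the terminal)
    [ "$answer" = "y" ] && printf '%s -> %s\n' "$user" "$home"
done {list}</etc/passwd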
This could be used with static files, or even with /dev/tcp/xx.xx.xx.xx/yyy
, where x
stands for the IP address or hostname and y
for the port number (see a sketch after the next block), or with the output of a command:
{
read -u $list -a head # read header in array `head`
varnames=(${head[@]//[K1% -]}) # drop illegal chars for variable names
while read -u $list ${varnames[@],,} ;do
((pct=available*100/(available+used),pct<10)) &&
printf "WARN: FS: %-20s on %-14s %3d <10 (Total: %11u, Use: %7s)\n" \
"${filesystem#*/mapper/}" "$mounted" $pct $blocks "$use"
done
} {list}< <(LANG=C df -k)
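For the /dev/tcp case mentioned above, a minimal sketch (example.com and port 80 are placeholders, and this assumes the host answers plain HTTP):
exec {web}<>/dev/tcp/example.com/80          # open a bidirectional fd to host:port
printf >&$web 'GET / HTTP/1.0\r\nHost: example.com\r\n\r\n'
while IFS= read -ru $web line ;do            # read the response until the server closes
    printf '%s\n' "$line"
done
exec {web}>&-                                # close the fd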
And of course with inline documents:
while IFS=\; read -u $list -a myvar ;do
echo ${myvar[2]}
done {list}<<"eof"
foo;bar;baz
alice;bob;charlie
$cherry;$strawberry;$memberberries
eof
Handling of special characters
A common problem is correctly handling filenames (for example) with
special characters, like old Latin encoding mixed with UTF-8, or worse (filenames containing newlines or tabs).
For this, the find
command could be run with -print0
to separate the filenames found with a null byte 0x00
.
To correctly handle this output with bash, you could:
while IFS='' read -r -d '' filename; do
size=$(stat -c %s "$filename")
printf ' %13d %q\n' $size "$filename"
done < <(
find . \( -type f -o -type d \) -print0
)
Handling of special characters by using mapfile
For a small number of entries, you could use mapfile
(or its synonym: readarray
) in order to create an array before
processing its elements:
mapfile -t -d '' entries < <( find . \( -type f -o -type d \) -print0)
for entry in "${entries[@]}";do
size=$(stat -c %s "$entry")
printf ' %13d %q\n' $size "$entry"
done
This could be used for splitting special procfs
entries which are null
separated, like the environ
file:
mapfile -d '' env_$$ </proc/$$/environ
declare -p ${!env_*}
Practical sample parsing CSV files:
As this answer is already long enough, for this paragraph
I will just let you refer to
this answer to How to parse a CSV file in Bash?
, where I read a file by using an unnamed fifo, with syntax like:
exec {FD}<"$file" # open unnamed fifo for read
IFS=',' read -ru $FD -a headline
while IFS=',' read -ru $FD -a row ;do ...
... But as the CSV format could hold multiline fields, things are a little more complex! Using the bash loadable CSV module, please have a look at Parsing CSV files under bash, using loadable module
On my website, you may find the same script, reading CSV
as an inline document.
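For a quick, simplified taste of that approach (assuming a simple CSV without quoted or multiline fields; sample.csv is a hypothetical file name):
file=sample.csv
exec {FD}<"$file"                       # open the file on a new fd
IFS=',' read -ru $FD -a headline        # first line: column names
while IFS=',' read -ru $FD -a row ;do
    for i in "${!headline[@]}" ;do
        printf '  %-12s: %s\n' "${headline[i]}" "${row[i]}"
    done
    echo
done
exec {FD}<&-                            # close the fd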
Sample function for populating some variables:
#!/bin/bash
declare free=0 total=0 used=0 mpnt='??'
getDiskStat() {
{
read _
read _ total used free _ mpnt
} < <(
df -k ${1:-/}
)
}
getDiskStat $1
echo "$mpnt: Tot:$total, used: $used, free: $free."
Note: the declare
line is not required, it's just there for readability.
About sudo cmd | grep ... | cut ...
shell=$(cat /etc/passwd | grep $USER | cut -d : -f 7)
echo $shell
/bin/bash
(Please avoid the useless use of cat
! So this is just one fork less:
shell=$(grep $USER </etc/passwd | cut -d : -f 7)
All pipes (|
) imply forks, where another process has to be run, accessing the disk, making library calls, and so on.
So using sed
for this sample will limit the subprocess to only one fork:
shell=$(sed </etc/passwd "s/^$USER:.*://p;d")
echo $shell
And with Bashisms:
But for many actions, mostly on small files, Bash could do the job itself:
while IFS=: read -a line ; do
[ "$line" = "$USER" ] && shell=${line[6]}
done </etc/passwd
echo $shell
/bin/bash
or
while IFS=: read loginname encpass uid gid fullname home shell;do
[ "$loginname" = "$USER" ] && break
done </etc/passwd
echo $shell $loginname ...
Going further about variable splitting...
Have a look at my answer to How do I split a string on a delimiter in Bash?
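For a quick taste, a minimal sketch of splitting on a delimiter (the sample string is arbitrary):
IFS=';' read -ra fields <<<'foo;bar;baz'
declare -p fields
declare -a fields=([0]="foo" [1]="bar" [2]="baz")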
Alternative: reducing forks by using backgrounded long-running tasks
In order to prevent multiple forks like
myPi=$(bc -l <<<'4*a(1)')
myRay=12
myCirc=$(bc -l <<<" 2 * $myPi * $myRay ")
or to obtain system start time and current shell start time, both as UNIX EPOCH, I could do two nested forks:
myStarted=$(date -d "$(ps ho lstart 1)" +%s)
mySessStart=$(date -d "$(ps ho lstart $$)" +%s)
This works fine, but running many forks is heavy and slow.
And commands like date
and bc
could do many operations, line by line!
See:
bc -l <<<$'3*4\n5*6'
12
30
date -f - +%s < <(ps ho lstart 1 $$)
1516030449
1517853288
So building my two variables: $myStarted
and $mySessStart
could be done in one operation:
{
read -r myStarted
read -r mySessStart
} < <(
date -f - +%s < <(
ps ho lstart 1 $$
)
)
This could be written on one line:
{ read -r myStarted;read -r mySessStart;}< <(date -f- +%s< <(ps ho lstart 1 $$))
Backgrounded tasks
But we could use a long-running background process to serve as many requests as we need, without having to initiate a new fork for each request.
You could have a look at how reducing forks makes Mandelbrot bash improve from more than eight hours to less than five seconds.
Under bash, there is a built-in keyword for this: coproc
:
coproc bc -l
echo 4*3 >&${COPROC[1]}
read -u $COPROC answer
echo $answer
12
echo >&${COPROC[1]} 'pi=4*a(1)'
ray=42.0
printf >&${COPROC[1]} '2*pi*%s\n' $ray
read -u $COPROC answer
echo $answer
263.89378290154263202896
printf >&${COPROC[1]} 'pi*%s^2\n' $ray
read -u $COPROC answer
echo $answer
5541.76944093239527260816
As bc
is ready, running in the background, and I/O is ready too, there is no delay, nothing to load, open or close, before or after the operation. Only the operation itself! This becomes a lot quicker than having to fork to bc
for each operation!
The little extra (little but powerful!): while bc
stays running, it will hold all its registers. So variables or functions could be defined at an initialisation step, as a first write to ${COPROC[1]}
, just after starting the task (... or even at any time).
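A minimal sketch of such an initialisation, restarting a fresh coproc for clarity (the function name circ is arbitrary):
coproc bc -l
echo >&${COPROC[1]} 'pi=4*a(1)'
echo >&${COPROC[1]} 'define circ(r) { return (2*pi*r); }'
echo >&${COPROC[1]} 'circ(42.0)'
read -u $COPROC answer
echo $answer
263.89378290154263202896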
Into a function newConnector
You may find my newConnector
function on GitHub.com or on my own site. (Note: on GitHub there are two files; on my site, function and demo are bundled into one unique file which could be sourced for use or just run for demo.)
Sample:
source shell_connector.sh
tty
/dev/pts/20
ps --tty pts/20 fw
PID TTY STAT TIME COMMAND
29019 pts/20 Ss 0:00 bash
30745 pts/20 R+ 0:00 \_ ps --tty pts/20 fw
newConnector /usr/bin/bc "-l" '3*4' 12
ps --tty pts/20 fw
PID TTY STAT TIME COMMAND
29019 pts/20 Ss 0:00 bash
30944 pts/20 S 0:00 \_ /usr/bin/bc -l
30952 pts/20 R+ 0:00 \_ ps --tty pts/20 fw
declare -p PI
bash: declare: PI: not found
myBc '4*a(1)' PI
declare -p PI
declare -- PI="3.14159265358979323844"
The function myBc
lets you use the background task with simple syntax.
Then for date:
newConnector /bin/date '-f - +%s' @0 0
myDate '2000-01-01'
946681200
myDate "$(ps ho lstart 1)" boottime
myDate now now
read utm idl </proc/uptime
myBc "$now-$boottime" uptime
printf "%s\n" ${utm%%.*} $uptime
42134906
42134906
ps --tty pts/20 fw
PID TTY STAT TIME COMMAND
29019 pts/20 Ss 0:00 bash
30944 pts/20 S 0:00 \_ /usr/bin/bc -l
32615 pts/20 S 0:00 \_ /bin/date -f - +%s
3162 pts/20 R+ 0:00 \_ ps --tty pts/20 fw
From there, if you want to end one of the background processes, you just have to close its fd
:
eval "exec $DATEOUT>&-"
eval "exec $DATEIN>&-"
ps --tty pts/20 fw
PID TTY STAT TIME COMMAND
4936 pts/20 Ss 0:00 bash
5256 pts/20 S 0:00 \_ /usr/bin/bc -l
6358 pts/20 R+ 0:00 \_ ps --tty pts/20 fw
which is not needed, because all fds
are closed when the main process finishes.