Arrays in a POSIX compliant shell

4

27

According to this reference sheet on hyperpolyglot.org, the following syntax can be used to set an array.

i=(1 2 3)

But I get an error with dash, which is the default /bin/sh on Ubuntu and should be POSIX compliant.

# Trying the syntax with dash in my terminal
> dash -i
$ i=(1 2 3)
dash: 1: Syntax error: "(" unexpected
$ exit

# Working fine with bash
> bash -i
$ i=(1 2 3)
$ echo ${i[@]}
1 2 3
$ exit

Is the reference sheet misleading or erroneous?
If so, what would be the correct, POSIX-compliant way to define an array or a list?

Sorites answered 13/2, 2016 at 22:14 Comment(4)
There are no arrays in POSIX. If you look closely, that is in reference to a literal. That entire section of hyperpolyglot.org is just flat wrong (probably done by M$). – Torpedoman
I do not understand what the sheet means by literal in this context; the section of the table is called resizable arrays. But even if it was not about arrays, it should execute correctly according to the reference sheet. But you are right, the important part is that there is no concept of arrays. – Sorites
Thanks for that clarification. I was considering relying on this sheet; I will look for another one then. – Sorites
Ouch, that's a terrible "reference". It has the non-POSIX function keyword, magic $RANDOM and echo -n, recommends trap exit ERR rather than a more useful trap 'exit 1' ERR, and is extremely reckless with quoting. Not recommended. – Hel
34

POSIX does not specify arrays, so if you are restricted to POSIX shell features, you cannot use arrays.

I'm afraid your reference is mistaken. Sadly, not everything you find on the internet is correct.

Linette answered 13/2, 2016 at 22:16 Comment(4)
I read that too, but I needed a confirmation. The sheet seemed so detailed and precise that I was confused. – Sorites
@Sorites Vendors of swampland in Florida also offer detailed and precise survey reports. – Linette
Yes, but a common confusion is when a shell like bash is running in POSIX mode, e.g. with --posix invocation, or #!/bin/sh shebang, then it will quietly understand this non-POSIXism. – Lilybelle
This is only partially true. POSIX does specify the list of arguments, which can be used as $@ which can be adjusted with shift and set as noted in my answer. – Projectionist
27

As said by rici, dash doesn't have array support. However, there are workarounds if what you're looking to do is write a loop.

A for loop has no array to iterate over, but you can do the splitting with a while loop and the read builtin. Since the dash read builtin also doesn't support custom delimiters (it has no -d option as in bash), you have to work around that too.

Here's a sample script:

myArray="a b c d"

echo "$myArray" | tr ' ' '\n' | while read -r item; do
  # use "$item"; -r stops read from mangling backslashes
  echo "$item"
done

Some deeper explanation on that:

  • tr ' ' '\n' does a single-character replacement: every space becomes a newline, which is the default delimiter for the read builtin.

  • read returns a failing exit code when it reaches the end of its input, which ends the loop once everything has been processed.

  • Since echo prints a newline after its input, the last "element" of your array is terminated like the others and gets processed too.

This would be equivalent to the bash code:

myArray=(a b c d)

for item in "${myArray[@]}"; do
  echo "$item"
done

If you want to retrieve the n-th element (let's say the 2nd for the purpose of the example):

myArray="a b c d"

echo "$myArray" | cut -d ' ' -f2 # change -f2 to -fn
Impeller answered 14/6, 2017 at 19:38 Comment(5)
Thanks! The last code snippet was just what I was looking for :) – Mignonne
The issue with this is that it tries to store several separate strings inside one single string. As soon as you want to store strings that contain whitespace characters, you will fail to retrieve them properly. You would have to use some safe delimiter other than space, and then implement your own parsing of the string. – Isooctane
If the array contains file paths, a space may appear in any file name, but you can use another symbol, e.g. the pipe |. So instead of myArray="a b c d" you can try myArray="a|b|c|d" and then change tr ' ' '\n' to tr '|' '\n'. I tested it and it works fine. – Lindquist
shellcheck seems to suggest that it would be beneficial to have while read -r item; do to prevent the mangling of any backslash characters. – Louisiana
Why are you using the external command tr for this? You're already losing spacing, why not just use for item in $myArray (note the lack of quotes)? If you want to preserve spaces, put it in a function and locally change $IFS to the desired delimiter (like local IFS='|') ... or use $@ (see my answer). – Projectionist
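
The IFS-based splitting suggested in the last comment can be sketched roughly as follows. This is only an illustration: print_items is a hypothetical helper name, and because local is not part of POSIX, the sketch uses a subshell function body to keep the IFS change contained instead.

```shell
#!/bin/sh
# print_items is a hypothetical helper. Its body is a subshell (note the
# parentheses), so the changes to IFS and the -f option do not leak out.
print_items() (
    IFS='|'    # split on the pipe character only, so items may contain spaces
    set -f     # disable globbing, since $1 is expanded unquoted below
    for item in $1; do
        printf 'item: %s\n' "$item"
    done
)

print_items 'first item|second item|third'
```

Items can then contain spaces, but not the delimiter itself or (if the list is later read line by line) newlines.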
20

It is true that the POSIX sh shell does not have named arrays in the same sense that bash and other shells have, but there is a list that sh shells (as well as bash and others) could use, and that's the list of positional parameters.

This list usually contains the arguments passed to the current script or shell function, but you can set its values with the set built-in command:

#!/bin/sh

set -- this is "a list" of "several strings"

In the above script, the positional parameters $1, $2, ..., are set to the five strings shown. The -- is used to make sure that you don't unexpectedly set a shell option (which the set command is also able to do). This is only ever an issue if the first argument starts with a - though.
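
As a quick check of the above, the following sketch inspects the list after setting it and shows what -- protects against. The variable names count, third and first are only for illustration.

```shell
#!/bin/sh
set -- this is "a list" of "several strings"
count=$#     # number of positional parameters
third=$3     # quoted words stay together as one item
printf 'count=%s third=%s\n' "$count" "$third"

# Without the "--", the leading "-e" here would be taken as a shell option
# instead of becoming the first list item.
set -- -e "some file"
first=$1
printf 'first=%s\n' "$first"
```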

To e.g. loop over these strings, you can use

for string in "$@"; do
    printf 'Got the string "%s"\n' "$string"
done

or the shorter

for string do
    printf 'Got the string "%s"\n' "$string"
done

or just

printf 'Got the string "%s"\n' "$@"

set is also useful for expanding globs into lists of pathnames:

#!/bin/sh

set -- "$HOME"/*/

# "visible directory" below really means "visible directory, or visible 
# symbolic link to a directory".

if [ ! -d "$1" ]; then
    echo 'You do not have any visible directories in your home directory'
else
    printf 'There are %d visible directories in your home directory\n' "$#"

    echo 'These are:'
    printf '\t%s\n' "$@"
fi
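
The same pattern can be wrapped in a function, since a function has its own positional parameters and set -- inside it leaves the caller's list alone. A minimal sketch, assuming a hypothetical helper count_matches that counts the .txt files in a given directory:

```shell
#!/bin/sh
# count_matches DIR: print how many names in DIR match *.txt.
# (hypothetical helper; the .txt pattern is just an example)
count_matches() {
    set -- "$1"/*.txt    # replaces the function's own $@ only
    # An unmatched pattern is left in place as a literal string, so make
    # sure the first result really exists (or is a dangling symlink).
    if [ -e "$1" ] || [ -L "$1" ]; then
        printf '%s\n' "$#"
    else
        printf '0\n'
    fi
}

count_matches "$HOME"
```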

The shift built-in command can be used to shift off the first positional parameter from the list.

#!/bin/sh

# pathnames
set -- path/name/1 path/name/2 some/other/pathname

# insert "--exclude=" in front of each
for pathname do
    shift
    set -- "$@" --exclude="$pathname"
done

# call some command with our list of command line options
some_command "$@"

Isooctane answered 16/12, 2019 at 9:38 Comment(3)
Is there something like an unshift in POSIX? Can you use $@ as a stack somehow? – Libertylibia
@Libertylibia To add at the start: set -- "$item" "$@". To add at the end: set -- "$@" "$item". You can't easily delete the last element of $@, but you can delete the first with shift. A stack could be implemented by pushing/popping the first element. – Isooctane
Deleting the last element of $@ is only mildly cumbersome. I've added an answer that demonstrates shift, unshift, push, pop (which performs that deletion), and other array functions for $@. – Projectionist
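
The stack idea from the comments (push and pop at the front of the list) might look like the sketch below. The variable name top is only illustrative.

```shell
#!/bin/sh
set --                       # start with an empty stack
set -- "first" "$@"          # push
set -- "second item" "$@"    # push
top=$1                       # peek at the most recently pushed item
shift                        # pop it
printf 'popped: %s, remaining: %s\n' "$top" "$#"
```

Pushing and popping at the front keeps both operations to a single builtin call, whereas removing the last item needs a loop.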
4

You can use the argument list $@ as an array in POSIX shells

It's trivial to initialize, shift, unshift, and push:

# initialize $@ containing a string, a variable's value, and a glob's matches
set -- "item 1" "$variable" *.wav
# shift (remove first item, accepts a numeric argument to remove more)
shift
# unshift (prepend new first item)
set -- "new item" "$@"
# push (append new last item)
set -- "$@" "new item"

Here's a pop implementation:

# pop (remove last item, store it in $last)
i=0 len=$#
for last in "$@"; do 
  if [ $((i+=1)) = 1 ]; then set --; fi  # increment $i. first run: empty $@
  if [ $i = $len ]; then break; fi       # stop before processing the last item
  set -- "$@" "$last"                    # add $last back to $@
done
echo "$last has been removed from ($*)"

("$*" joins the contents of $@ with the first character of $IFS, which is a space by default.)

Iterate through the $@ array and modify some of its contents:

i=0
for a in "$@"; do 
  if [ $((i+=1)) = 1 ]; then set --; fi  # increment $i. first run: empty $@
  a="${a%.*}.mp3"       # example tweak to $a: change extension to .mp3
  set -- "$@" "$a"      # add $a back to $@
done

Refer to items in the $@ array:

echo "$1 is the first item"
echo "$# is the length of the array"
echo "all items in the array (properly quoted): $@"
echo "all items in the array (in a string): $*"
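[ "$n" -ge 1 ] 2>/dev/null && eval "echo \"item number $n in the array is \${$n}\""

(eval is dangerous, so I've ensured $n is a number before running it. The braces in \${$n} are needed because $10 and up would otherwise be read as ${1} followed by more digits.)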
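
If you would rather avoid eval altogether, one possible alternative is a small function that shifts its own copy of the list. This is a sketch only: nth_item and nth_index are hypothetical names, and since POSIX sh has no local, nth_index is a global variable.

```shell
#!/bin/sh
# nth_item N ITEM...: print the Nth item (1-based) of the remaining arguments.
nth_item() {
    [ "$#" -ge 1 ] || return 1
    nth_index=$1
    shift
    # Reject anything that is not an integer between 1 and the item count.
    [ "$nth_index" -ge 1 ] 2>/dev/null || return 1
    [ "$nth_index" -le "$#" ] || return 1
    shift "$((nth_index - 1))"
    printf '%s\n' "$1"
}

set -- alpha "beta gamma" delta
second=$(nth_item 2 "$@")
printf 'second item: %s\n' "$second"
```

Because the function works on its own $@, the caller's list is left untouched.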

There are a few ways to set $last to the final item of a list without popping it:
with a function:

last_item() { shift $(($# - 1)) 2>/dev/null && printf %s "$1"; }
last="$(last_item "$@")"

... or with an eval (safe since $# is always a number; the braces are needed once the list has ten or more items):

eval "last=\${$#}"

... or with a loop:

for last in "$@"; do true; done

⚠️ Warning: Functions have their own $@ arrays. You'll have to pass the list to the function, like my_function "$@" if it only needs to read it, or else use set -- $(my_function "$@") if you want to manipulate $@ and don't expect whitespace or glob characters in item values.

If you need to handle spaces in item values, it becomes much more cumbersome:

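# ensure my_function() returns each list item on its own line
# (the loop reads from a here-document rather than a pipe, because a piped
# loop may run in a subshell, where changes to $@ would be lost)
i=1
while IFS= read -r line; do
  if [ "$i" = 1 ]; then i=0; set --; fi
  set -- "$@" "$line"
done <<EOF
$(my_function "$@")
EOF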

This still won't work with newlines in your items. You'd have to escape them to another character (but not null) and then escape them back later. See "Iterate through the $@ array and modify some of its contents" above. You can either iterate through the array in a for loop and then run the function, then modify the variables in a while IFS= read line loop, or just do it all in a for loop without a function.

Projectionist answered 13/2, 2023 at 21:54 Comment(3)
for i in "$@"; do ...; done can be shortened into for i do ...; done. This relieves the user from remembering to quote $@. – Isooctane
@Isooctane – Most of the quotes you added to my answer were unnecessary (numbers never need to be quoted; if you change $IFS to include a digit, you're not going to like the fallout). You removed quotes from an instance that needed them. I have reverted most of your changes. Yes, for i do …; done is shorter and POSIX-compliant, but I do not consider it to be intuitive. – Projectionist
Quotes are not needed on the right-hand side of assignments since the shell won't perform splitting or globbing there. If you remove quoting on expansions just because they contain digits, you'd better add that you assume IFS can never contain digits or just explicitly reset IFS to its default value. You gain very little by refusing to quote those expansions. – Isooctane
