When do I set IFS to a newline in Bash?
I thought setting IFS to $'\n' would help me in reading an entire file into an array, as in:

IFS=$'\n' read -r -a array < file

However, the above command only reads the first line of the file into the first element of the array, and nothing else.

Even this reads only the first line into the array:

string=$'one\ntwo\nthree'
IFS=$'\n' read -r -a array <<< "$string"

I came across other posts on this site that talk about either using mapfile -t or a read loop to read a file into an array.

Now my question is: when do I use IFS=$'\n' at all?

Dexedrine answered 6/2, 2017 at 5:5 Comment(1)
Related: What is the exact meaning of IFS=$'\n'? - Balladist
Your second try almost works, but you have to tell read not to stop at the first newline (the default behaviour) and to read until some other delimiter instead, for example a NUL byte, which is what an empty delimiter (-d '') stands for:

$ IFS=$'\n' read -a arr -d '' <<< $'a b c\nd e f\ng h i'
$ declare -p arr
declare -a arr='([0]="a b c" [1]="d e f" [2]="g h i")'
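
To make the effect of -d '' concrete, here is a minimal sketch using the same three-line input. One thing to be aware of: the input contains no NUL byte, so read reaches the end of the input before it finds its delimiter and returns a non-zero status even though the array is filled, which matters if your script runs under set -e. The variable names are made up for illustration.

```shell
#!/usr/bin/env bash
input=$'a b c\nd e f\ng h i'

# Without -d '': read stops at the first newline, so only one element is set.
IFS=$'\n' read -r -a first_only <<< "$input"
echo "${#first_only[@]}"    # 1

# With -d '': read consumes everything up to a NUL byte (or end of input),
# then splits what it read on newlines.
status=0
IFS=$'\n' read -r -d '' -a all_lines <<< "$input" || status=$?
echo "${#all_lines[@]}"     # 3
echo "$status"              # non-zero: end of input came before any NUL byte
```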

But as you pointed out, mapfile/readarray is the way to go if you have it (requires Bash 4.0 or newer):

$ mapfile -t arr <<< $'a b c\nd e f\ng h i'
$ declare -p arr
declare -a arr='([0]="a b c" [1]="d e f" [2]="g h i")'

The -t option removes the newlines from each element.
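
If you want to see what -t changes, a quick sketch is to compare element lengths with and without it (the array names are only for illustration):

```shell
#!/usr/bin/env bash
input=$'a b c\nd e f'

mapfile with_newlines <<< "$input"    # no -t: each element keeps its newline
mapfile -t stripped <<< "$input"      # -t: the trailing newline is removed

echo "${#with_newlines[0]}"    # 6 ("a b c" plus the newline)
echo "${#stripped[0]}"         # 5
```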

As for when you'd want to use IFS=$'\n':

  • As just shown, if you want to read a file into an array, one line per element, your Bash is older than 4.0, and you don't want to use a loop.
  • Some people promote using an IFS without a space to avoid unexpected side effects from word splitting; the proper approach in my opinion, though, is to understand word splitting and make sure to avoid it with proper quoting as desired.
  • I've seen IFS=$'\n' used in tab completion scripts, for example the one for cd in bash-completion: this script fiddles with paths and replaces colons with newlines, to then split them up using that IFS.
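
The colon-to-newline technique from the last bullet can be sketched roughly like this. It is not the actual bash-completion code; the function name split_path, the array name dirs and the sample value are made up for illustration, and the sketch assumes pathname expansion is enabled when the function is called.

```shell
#!/usr/bin/env bash
# Hypothetical helper: split a colon-separated list into the array "dirs".
split_path() {
    local IFS=$'\n'             # local: the caller's IFS is left untouched
    local list=${1//:/$'\n'}    # replace every colon with a newline
    set -f                      # no pathname expansion while splitting
    dirs=( $list )              # unquoted on purpose: split on newlines only
    set +f                      # assumes globbing was enabled beforehand
}

split_path "/usr/local/my dir:/usr/bin:/opt/tools"
printf '<%s>\n' "${dirs[@]}"
```

Because IFS contains only a newline while the list is split, the space in the first directory name does not break it into two elements.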
Frumpish answered 6/2, 2017 at 5:30 Comment(0)
You are a bit confused as to what IFS is. IFS is the Internal Field Separator: Bash uses it to split the results of unquoted expansions into words, and read uses it to split the line it has read into fields. The default value is space, tab, newline ($' \t\n').

By reassigning IFS=$'\n', you are removing the space and tab and telling bash to split words on newline characters only (your thinking is correct). That has the effect of allowing a line containing spaces to end up in a single array element without quoting.
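
A small sketch of that difference, assuming IFS still has its default value when the snippet starts (the variable names are only for illustration):

```shell
#!/usr/bin/env bash
text=$'first line\nsecond line'

# Default IFS (space, tab, newline): the text splits into four words.
default_split=( $text )
echo "${#default_split[@]}"    # 4

# Newline-only IFS: two words, and the spaces inside each line survive.
old_ifs=$IFS
IFS=$'\n'
newline_split=( $text )
IFS=$old_ifs                   # put the previous value back
echo "${#newline_split[@]}"    # 2
```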

Where your implementation fails is in your read -r -a array < file. The -a option assigns the words of one line to sequential array indexes, but read only ever consumes a single line per call (everything up to the first newline). With IFS set to a newline only, that line contains nothing left to split on, so the whole line becomes one word. Since you call read only once, only one array element is filled.
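
You can watch read consume one line per call with a sketch like the following, where two read commands share the same input (the array names first and second are just for illustration):

```shell
#!/usr/bin/env bash
string=$'one\ntwo\nthree'

# Both read commands take their input from the same here-string;
# each call consumes exactly one line of it.
{
    IFS=$'\n' read -r -a first
    IFS=$'\n' read -r -a second
} <<< "$string"

echo "${#first[@]} ${first[0]}"      # 1 one
echo "${#second[@]} ${second[0]}"    # 1 two
```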

You can either do:

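while IFS=$'\n' read -r line; do
    array+=( "$line" )
done < "$filename"

(the quotes around "$line" are what matter here: the IFS=$'\n' prefix applies only to the read command itself, not to the loop body, so an unquoted $line would still be split on spaces and tabs; with the quotes in place, a plain IFS= read -r line works just as well)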

Or, with IFS set to a newline for the rest of the script, you could let word splitting do the work:

IFS=$'\n'
array=( $(<filename) )
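
Two side effects of this approach are worth knowing about: the IFS assignment stays in effect for everything that follows, and each resulting word is still subject to pathname expansion, so a line consisting of * would be replaced by file names. Empty lines are also dropped, because consecutive newlines count as a single separator. Here is a hedged sketch that confines both side effects to a function; the function name lines_to_array and the temporary file are assumptions made for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read the lines of file "$1" into the array "array".
lines_to_array() {
    local IFS=$'\n'      # restored automatically when the function returns
    local restore_glob=
    # Remember whether globbing was enabled so it can be switched back on.
    [[ $- == *f* ]] || restore_glob=1
    set -f               # keep lines like "*" literal
    array=( $(<"$1") )
    [[ -z $restore_glob ]] || set +f
}

tmpfile=$(mktemp)
printf '%s\n' 'one two' '*' 'three' > "$tmpfile"
lines_to_array "$tmpfile"
rm -f "$tmpfile"
printf '<%s>\n' "${array[@]}"
```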

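or finally, on Bash 4.0 or newer, you could skip IFS altogether and use readarray, which splits on newlines by itself and never consults IFS (the -t option strips the trailing newline from each element):

readarray -t array <filename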

Try them and let me know if you have questions.

Bullfighter answered 6/2, 2017 at 5:36 Comment(3)
Shouldn't array=$( $(<filename) ) be array=( $(<filename) )? - Frumpish
Thank you both - Benjamin & David. Your answers have helped me immensely. - Dexedrine
Glad we could help. Between both answers, you should have a good feel for IFS now. It may take a bit longer to get comfortable with the terms word-splitting and expansion and how bash applies them -- that's normal, it will sink in over time. Don't forget man bash, it has everything there, it just takes a while to figure out how to read it comfortably. When you can find what you need with a simple man bash and /searchterm, you are well on your way to mastering shell. Good luck with your scripting :). - Bullfighter
