How can I store the "find" command results as an array in Bash
S

8

169

I am trying to save the result from find as arrays. Here is my code:

#!/bin/bash

echo "input : "
read input

echo "searching file with this pattern '${input}' under present directory"
array=`find . -name ${input}`

len=${#array[*]}
echo "found : ${len}"

i=0

while [ $i -lt $len ]
do
echo ${array[$i]}
let i++
done

I have 2 .txt files under the current directory, so I expect ${len} to be 2. However, it prints 1, because the whole output of find is stored as a single element. How can I fix this?

P.S
I found several solutions on Stack Overflow to similar problems. However, they are a little bit different, so I can't apply them to my case. I need to store the results in a variable before the loop. Thanks again.

Shick answered 29/4, 2014 at 6:7 Comment(0)
T
203

Update 2020 for Linux Users:

If you have an up-to-date version of bash (4.4-alpha or better), as you probably do if you are on Linux, then you should be using Benjamin W.'s answer.

If you are on Mac OS, which (last I checked) still used bash 3.2, or are otherwise using an older bash, then continue on to the next section.

Answer for bash 4.3 or earlier

Here is one solution for getting the output of find into a bash array:

array=()
while IFS=  read -r -d $'\0'; do
    array+=("$REPLY")
done < <(find . -name "${input}" -print0)

This is tricky because, in general, file names can have spaces, newlines, and other script-hostile characters. The only way to use find and have the file names safely separated from each other is to use -print0, which prints the file names separated with a null character. This would not be much of an inconvenience if bash's readarray/mapfile builtins supported null-separated strings but, in bash 4.3 and earlier, they don't. Bash's read does, and that leads us to the loop above.

[This answer was originally written in 2014. If you have a recent version of bash, please see the update below.]

How it works

  1. The first line creates an empty array: array=()

  2. Every time that the read statement is executed, a null-separated file name is read from standard input. The -r option tells read to leave backslash characters alone. The -d $'\0' tells read that the input will be null-separated. Since we omit the name to read, the shell puts the input into the default name: REPLY.

  3. The array+=("$REPLY") statement appends the new file name to the array array.

  4. The final line combines redirection and process substitution to provide the output of find to the standard input of the while loop.
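
The steps above can be tried end to end with a small self-contained sketch. The temporary directory and the three file names are made up for the demonstration, and -d '' is used as a shorter spelling of -d $'\0':

```shell
#!/usr/bin/env bash
# Hypothetical test data: one plain name, one with a space, one with a newline.
demo_dir=$(mktemp -d)
touch "$demo_dir/plain.txt" \
      "$demo_dir/with space.txt" \
      "$demo_dir/with"$'\n'"newline.txt"

input='*.txt'

# Same loop as in the answer: read one NUL-terminated name at a time.
array=()
while IFS= read -r -d ''; do
    array+=("$REPLY")
done < <(find "$demo_dir" -name "$input" -print0)

echo "found: ${#array[@]}"
# %q prints each element shell-quoted, so spaces and newlines stay visible.
printf '%q\n' "${array[@]}"

rm -rf "$demo_dir"
```

Each file should end up as exactly one array element, including the one whose name contains a newline.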

Why use process substitution?

If we didn't use process substitution, the loop could be written as:

array=()
find . -name "${input}" -print0 >tmpfile
while IFS=  read -r -d $'\0'; do
    array+=("$REPLY")
done <tmpfile
rm -f tmpfile

In the above, the output of find is stored in a temporary file and that file is used as standard input to the while loop. The idea of process substitution is to make such temporary files unnecessary. So, instead of having the while loop get its stdin from tmpfile, we can have it get its stdin from <(find . -name "${input}" -print0).

Process substitution is widely useful. In many places where a command wants to read from a file, you can specify process substitution, <(...), instead of a file name. There is an analogous form, >(...), that can be used in place of a file name where the command wants to write to the file.

Like arrays, process substitution is a feature of bash and other advanced shells. It is not part of the POSIX standard.
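
A related reason to prefer process substitution over a plain pipe: by default bash runs each part of a pipeline in a subshell, so an array filled inside a piped while loop is gone once the pipeline ends. Here is a minimal sketch of the difference, with printf standing in for find -print0 and made-up array names:

```shell
#!/usr/bin/env bash
# Fed through a pipe: the loop runs in a subshell, so "piped" stays empty
# in the calling shell (assuming lastpipe is not enabled).
piped=()
printf 'a\0b\0' | while IFS= read -r -d ''; do
    piped+=("$REPLY")
done

# Fed through process substitution: the loop runs in the current shell.
substituted=()
while IFS= read -r -d ''; do
    substituted+=("$REPLY")
done < <(printf 'a\0b\0')

echo "pipe: ${#piped[@]} element(s)"
echo "process substitution: ${#substituted[@]} element(s)"
```

With default settings in a script, the piped version should report no elements while the process substitution version keeps both.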

Alternative: lastpipe

If desired, lastpipe can be used instead of process substitution (hat tip: Caesar):

set +m
shopt -s lastpipe
array=()
find . -name "${input}" -print0 | while IFS=  read -r -d $'\0'; do array+=("$REPLY"); done; declare -p array

shopt -s lastpipe tells bash to run the last command in the pipeline in the current shell (not in a subshell). This way, the array remains in existence after the pipeline completes. Because lastpipe only takes effect if job control is turned off, we run set +m. (In a script, as opposed to the command line, job control is off by default.)

Additional notes

The following command creates a shell variable, not a shell array:

array=`find . -name "${input}"`

If you wanted to create an array, you would need to put parens around the output of find. So, naively, one could:

array=(`find . -name "${input}"`)  # don't do this

The problem is that the shell performs word splitting on the results of find so that the elements of the array are not guaranteed to be what you want.
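
To see the word-splitting problem concretely, here is a small sketch that compares the two approaches on a single file whose name contains a space. The directory and file name are invented for the demonstration, and it assumes the temporary directory path itself contains no whitespace:

```shell
#!/usr/bin/env bash
demo_dir=$(mktemp -d)
touch "$demo_dir/two words.txt"

# Unquoted command substitution: the single path is split at the space.
split=( $(find "$demo_dir" -name '*.txt') )

# NUL-delimited loop: the path stays in one piece.
safe=()
while IFS= read -r -d ''; do
    safe+=("$REPLY")
done < <(find "$demo_dir" -name '*.txt' -print0)

echo "word-split: ${#split[@]} element(s)"
echo "NUL-delimited: ${#safe[@]} element(s)"

rm -rf "$demo_dir"
```

The word-split array ends up with two broken fragments for the one file, while the NUL-delimited array holds the single correct path.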

Update 2019

Starting with version 4.4-alpha, bash's mapfile (also known as readarray) supports a -d option, so the above loop is no longer necessary. Instead, one can use:

mapfile -d $'\0' array < <(find . -name "${input}" -print0)

For more information on this, please see (and upvote) Benjamin W.'s answer.

Tubuliflorous answered 29/4, 2014 at 6:38 Comment(31)
Awesome! Thanks. But could you explain little more about last line? I mean redirection part. At first, I write "<<" and it show syntax error on "(". After several tries, I just copy your code to mine and it works. what is the meaning of '<' in last line?Shick
@JuneyoungOh Glad it helped. I added a section of process substitution.Tubuliflorous
I think the IFS= part before read is useless. From man read: "...the first word is assigned to the first name, the second word to the second name, and so on, with leftover words and their intervening separators assigned to the last name...The characters in IFS are used to split the line into words", and "If no names are supplied, the line read is assigned to the variable REPLY". Since there's no argument supplied, the whole line is assigned to the only variable REPLY. The word separator specified by IFS is not useful here, because the line is not at all split into words.Taeniafuge
@Taeniafuge That is a good observation but incomplete. While it is true that we don't split into multiple words, we still need IFS= to avoid removal of whitespace from the beginnings or ends of the input lines. You can test this easily by comparing the output of read var <<<' abc '; echo ">$var<" with the output of IFS= read var <<<' abc '; echo ">$var<". In the former case, the spaces before and after abc are removed. In the latter, they aren't. File names that begin or end with whitespace may be unusual but, if they exist, we want them processed correctly.Tubuliflorous
Hi, after i execute your code i get message syntax error near unexpected token <' done < <(find aaa/ -not -newermt "$last_build_timestamp_v" -type f -print0)'Kidwell
@PrzemysławSienkiewicz I would need to see more details on what you are doing to be sure but, on many systems, the default shell doesn't support either process substitution or arrays. One may need to explicitly invoke the script with bash like bash scriptname. If that doesn't solve it, then a good next step is to run your script through shellcheck.net. (Just cut-and-paste your script into shellcheck's window and wait a moment for a response.)Tubuliflorous
How do you adjust this such that the array entries are not the full file path but only the file name?Propaedeutic
@Propaedeutic Use -printf like: while IFS= read -r -d $'\0'; do array+=("$REPLY"); done < <(find . -name "${input}" -printf '%f\0')Tubuliflorous
@Tubuliflorous I actually tried this a number of times. It does not seem to add the elements into the array when using printf '%f\0'Propaedeutic
@zuks It worked fine for me. I just copied-and-pasted from my comment to the command line and it still works for me. Did you have the variable input set to a useful value? To make sure that all the variables are initialized, try input='*'; array=(); while IFS= read -r -d $'\0'; do array+=("$REPLY"); done < <(find . -name "${input}" -printf '%f\0')Tubuliflorous
A note: the simpler '' can be used instead of $'\0': n=0; while IFS= read -r -d '' line || [ "$line" ]; do echo "$((++n)):$line"; done < <(printf 'first\nstill first\0second\0third')Underworld
How do I pipe the find output to cut then have those results added to the array? I would like to use something like find /DIRECTORY -maxdepth 1 -type d | cut -f5 -d'/' in the done line.Caviness
@Caviness Yes, you can put any useful command in the process substitution as long as it produces NUL-separated output. After switching from newline to NUL-separation, your command, if you are using GNU tools, becomes find 1/2/3/4 -maxdepth 1 -type d -printf '%p\0' | cut -zf5 -d'/'.Tubuliflorous
@theeagle I assume that you intended to write BLAH=$(find . -name '*.php'). As discussed in the answer, that approach will work in limited cases but it won't work in general with all filenames and it doesn't produce, as the OP expected, an array.Tubuliflorous
Might be worth to mention the shopt -s lastpipe trick to make find | while read … work. (Note that that may not work in interactive mode, only in a script file, for reasons that are beyond me.)Physoclistous
@Physoclistous Interesting idea. Thanks! I added lastpipe to the answer. (By the way, in interactive mode, job control is on by default and job control disables lastpipe. This can be overcome by running set +m to disable job control.)Tubuliflorous
The 2019 version doesn't work for me with -print0. It works if I use mapfile -d$'\n' with -print.Jinx
@maharvey67 Do you use GNU find?Escadrille
@Tubuliflorous i have a function which uses the find query to list all the files in the sub directory . find $1 -type f -follow works what i need . I then tried to save the results with array=() and then mapfile -d$'\0' array < <(find $1 -type f -follow -print0) . But echo ${array[*]} does not give me anything , where i am getting it wrong ? I am using GNU bash, version 4.4.20(1)-release (x86_64-pc-linux-gnu)Leuko
@Leuko That's my fault: there was a space missing in the command. Try: mapfile -d $'\0' array < <(find "$1" -type f -follow -print0) with a space between -d and $'\0'. Separately, in case the directory name $1 might contain whitespace or other difficult characters, it's best to put it in double-quotes like "$1".Tubuliflorous
@Tubuliflorous ,even mapfile -d $'\0' array < <(find "$1" -type f -follow -print0) does not work . I can confirm that find "$1" -type f -follow -print0 does work however . It gives me list of all the files in the directories and sub-directories.Leuko
@Leuko That's frustrating. I just cut-and-pasted your command and it works for me. I'm using bash 5.0.3. To make sure that the command is truly running under bash, please try bash -c 'mapfile -d "" array < <(find . -type f -follow -print0); declare -p array'Tubuliflorous
@Leuko Are you running the command in a script? If so, what is the shebang line and how are you executing the script?Tubuliflorous
@Tubuliflorous I have created a script called temp.sh . The contents are #!/bin/bash list_allFiles(){ mapfile -d $'\0' array < <(find "$1" -type f -follow -print0) } echo ${array[0]} dir=/home/abhishek/Desktop/CountryNames list_allFiles "$dir" And i am running the script with the command bash temp.sh in the terminal .Leuko
@Leuko Thanks! That makes it clear: the command echo ${array[0]} is executed after the function list_allFiles is defined but before it is executed. Move the echo command to the line after the line list_allFiles "$dir".Tubuliflorous
@BenjaminW.Yes it is GNU find. -d$'\0' (no space) does not work, but -d $'\0' (with a space) does work with -print0. -d $'\0' seems equivalent to -d '' perhaps because \0 looks the same as an empty string in C. The -d$'\n' works with -print, with or without a space between. Also... this is a very useful technique, thanks so much!Jinx
Why are there two spaces after IFS=?Volding
@Tubuliflorous Is there something wrong with the following approach? https://mcmap.net/q/56503/-how-can-i-store-the-quot-find-quot-command-results-as-an-array-in-bashVillosity
@Villosity Newline characters are legal in file names. The approach in that link works as long as no file name contains a newline character. If the file names are all names that you created and you know that they don't have newlines, then that approach may be 'good enough.' That kind of approach, however, is often not popular on SO where programmers typically place great value on code that is known to work always.Tubuliflorous
@Tubuliflorous Thanks for the feedback! Very appreciated. Do you know of any solution that works both using Bash and Z Shell?Villosity
@Villosity You're welcome. I don't know of such a solution but I'm not the right person to answer that: my knowledge of Z Shell is limited.Tubuliflorous
E
123

Bash 4.4 introduced a -d option to readarray/mapfile, so this can now be solved with

readarray -d '' array < <(find . -name "$input" -print0)

for a method that works with arbitrary filenames including blanks, newlines, and globbing characters. This requires that your find supports -print0, as for example GNU find does.

From the manual (omitting other options):

mapfile [-d delim] [array]

-d
The first character of delim is used to terminate each input line, rather than newline. If delim is the empty string, mapfile will terminate a line when it reads a NUL character.

And readarray is just a synonym of mapfile.
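
For scripts that may run under either an old or a new bash, one possible sketch is to check BASH_VERSINFO and fall back to the read loop when -d is unavailable. The demo directory and file names are hypothetical, and the version test assumes -d is present from 4.4 onward, as stated above:

```shell
#!/usr/bin/env bash
# Hypothetical test data for the demonstration.
demo_dir=$(mktemp -d)
touch "$demo_dir/one.txt" "$demo_dir/two words.txt"
input='*.txt'

if (( BASH_VERSINFO[0] > 4 || (BASH_VERSINFO[0] == 4 && BASH_VERSINFO[1] >= 4) )); then
    # bash 4.4 or later: readarray understands -d ''.
    readarray -d '' array < <(find "$demo_dir" -name "$input" -print0)
else
    # Older bash (for example 3.2): fall back to the read loop.
    array=()
    while IFS= read -r -d ''; do
        array+=("$REPLY")
    done < <(find "$demo_dir" -name "$input" -print0)
fi

echo "found: ${#array[@]}"
rm -rf "$demo_dir"
```

Either branch should leave the same two elements in array, so the rest of the script does not need to care which one ran.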

Escadrille answered 6/2, 2019 at 19:53 Comment(3)
This is great, I've already given it a +1. There's one caveat though -- if the command inside the process substitution fails, the exit code of the overall command is still 0. Is there a good way to have the exit code propagated to the outer command?Sino
@Sino Inspired by this answer, you could print the exit status as part of the process substitution: readarray -d '' array < <(find . -name "$input" -print0; printf "$?"), and then examine the last array element: echo "${array[-1]}".Escadrille
^^ This is brilliant!Sino
V
43

The following appears to work for both Bash and Z Shell on macOS.

#! /bin/sh

IFS=$'\n'
paths=($(find . -name "foo"))
unset IFS

printf "%s\n" "${paths[@]}"
Villosity answered 19/9, 2020 at 12:52 Comment(6)
This works with files having spaces and other special characters, fails with the (admittedly rare) case of files having a linebreak in their name. You can create one for a test with printf "%b" "file name with spaces, a star * ...\012and a second line\0" | xargs -0 touch Bandore
maybe I'm missing something here, but this seems like the much clearer, easier solution for 99% of casesMacrae
Definitely works great for zsh on macOS Big Sur :) thanks! - but I also know my fileset has no names with newlines, because who does that? I have never seen one in the wild and I made the files so I know its not an issue.Neolithic
Newlines are an issue in case the script may operate on files that are supplied by a potentially malicious user. For a hypothetical example, if your system ran something like detect-malware "${paths[@]}", a virus could be smuggled past this defense by including a newline in its name.Bolometer
See Bash Pitfalls #1 (for f in $(ls *.mp3)).Sweet
Thank you so much. Every other answer I read on multiple pages wasn't working for me. It turns out that ${paths[0]} is not the same thing as $paths[0]Film
D
20

If you are using bash 4 or later, you can replace your use of find with

shopt -s globstar nullglob
array=( **/*"$input"* )

The ** pattern enabled by globstar matches 0 or more directories, allowing the pattern to match to an arbitrary depth in the current directory. Without the nullglob option, the pattern (after parameter expansion) is treated literally, so with no matches you would have an array with a single string rather than an empty array.

Add the dotglob option to the first line as well if you want to traverse hidden directories (like .ssh) and match hidden files (like .bashrc) as well.
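
As a rough illustration of how the three options interact, the sketch below builds a throwaway directory tree (all names are made up) and globs it with and without dotglob. It needs bash 4 or later because of globstar:

```shell
#!/usr/bin/env bash
# Hypothetical tree: one top-level file, one nested file, one file in a hidden directory.
demo_dir=$(mktemp -d)
mkdir -p "$demo_dir/sub/.hidden"
touch "$demo_dir/a.txt" "$demo_dir/sub/b c.txt" "$demo_dir/sub/.hidden/d.txt"

shopt -s globstar nullglob
visible=( "$demo_dir"/**/*.txt )     # hidden directories are skipped
none=( "$demo_dir"/**/*.nomatch )    # nullglob: no match gives an empty array

shopt -s dotglob
all=( "$demo_dir"/**/*.txt )         # now .hidden is traversed as well

echo "without dotglob: ${#visible[@]}"
echo "no match: ${#none[@]}"
echo "with dotglob: ${#all[@]}"

rm -rf "$demo_dir"
```

The file name containing a space is handled without any extra quoting effort, because glob results are never word-split.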

Deference answered 29/4, 2014 at 17:58 Comment(5)
Maybe nullglob too…Dinger
Yeah, I always forget that.Deference
Note that this will not include the hidden files and directories, unless dotglob is set (this may or may not be wanted, but it's worth mentioning too).Measureless
This looks very useful, unless you actually need find's more interesting file matching features which aren't name glob based (for example, find by type, date, etc).Riant
Indeed. find still has its uses (unless you are using zsh, in which case I think just about anything find can do you can do with some unreadable set of glob qualifiers :) )Deference
M
13

you can try something like

array=(`find . -type f | sort -r | head -2`)
and, in order to print the array values, you can try something like echo "${array[*]}"
Monkhood answered 6/8, 2015 at 6:2 Comment(1)
Breaks if there are filenames with spaces or glob characters.Measureless
N
2

None of these solutions suited me because I didn't feel like learning readarray and mapfile. Here is what I came up with.

#!/bin/bash

echo "input : "
read input

echo "searching file with this pattern '${input}' under present directory"
# The only change is here. Append to array for each non-empty line.
array=()
while IFS= read -r line; do
    [[ ! -z "$line" ]] && array+=("$line")
# No semicolon after done: the here-string must be attached to the loop itself.
done <<< "$(find . -name "${input}" -print)"

len=${#array[@]}
echo "found : ${len}"

i=0

while [ $i -lt $len ]
do
echo ${array[$i]}
let i++
done
Napalm answered 27/10, 2021 at 15:30 Comment(1)
I like this one. But shellcheck asked me to remove the semicolon in this line done; <<<Jiggermast
R
-1

You could do like this:

#!/bin/bash
echo "input : "
read input

echo "searching file with this pattern '${input}' under present directory"
array=(`find . -name '*'${input}'*'`)

for i in "${array[@]}"
do :
    echo $i
done
Remediosremedy answered 29/4, 2014 at 7:7 Comment(1)
Thanks. a lot. But as @anishsane pointed, empty spaces in filename should be considered in my program. Anyway Thanks!Shick
R
-2

In bash, $(<any_shell_cmd>) helps to run a command and capture the output. Passing this to IFS with \n as delimiter helps to convert that to an array.

IFS='\n' read -r -a txt_files <<< $(find /path/to/dir -name "*.txt")
Recall answered 26/1, 2018 at 9:43 Comment(1)
This will get only the first file of the results of find into the array.Escadrille
