Read n lines at a time using Bash
Asked Answered
M

17

70

I read the help read page, but it still doesn't quite make sense to me, and I don't know which option to use.

How can I read N lines at a time using Bash?

Many answered 29/11, 2011 at 16:50 Comment(0)
B
30

This is harder than it looks. The problem is how to keep the file handle.

The solution is to create another, new file handle that works like stdin (file handle 0) but is independent, and then read from that as you need.

#!/bin/bash

# Create dummy input
for i in $(seq 1 10) ; do echo $i >> input-file.txt ; done

# Create new file handle 5
exec 5< input-file.txt

# Now you can use "<&5" to read from this file
while read line1 <&5 ; do
        read line2 <&5
        read line3 <&5
        read line4 <&5

        echo "Four lines: $line1 $line2 $line3 $line4"
done

# Close file handle 5
exec 5<&-
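
As asked in the comments below: if the input comes from a command rather than a file, process substitution avoids creating a temp file or a fifo entirely (a minimal sketch, with seq standing in for the real command):

# Read a command's output through file handle 5
exec 5< <(seq 1 10)
read line1 <&5
read line2 <&5
echo "Two lines: $line1 $line2"
exec 5<&-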
Bangs answered 30/11, 2011 at 9:48 Comment(8)
Maybe I misunderstand the problem, but just calling read repeatedly worked fine for me even without the input trickery.Meunier
Reading this again, I also don't know why I explained how to create a new file handle instead of using stdio :-/Bangs
Thanks for this, I used this as the basis for my solution. This works, but the additional file handle is not required. See below for a solution that is similar, but doesn't require the second handle.Deathful
You don't need to create the file, if you are getting the output from a command. You can do: mkfifo fifo; exec 5<>fifo; my-output-that-will-be-read >&5; read line1 <&5; read line2 <&5; ...Witty
@LuísGuilherme Any idea how to do this without a fifo? A fifo is just another entry in my filesystem which I have to clean up; it just makes debugging harder and wastes disk space.Bangs
Aaron, I've added an answer below. Basically, using zsh's read, or bash's readarray or mapfileWitty
Much simpler! cat input-file.txt | xargs -L 10 echo https://mcmap.net/q/277575/-read-n-lines-at-a-time-using-bashStanford
@Stanford That doesn't give you the 10 lines in 10 different BASH variables.Bangs
S
58

With Bash≥4 you can use mapfile like so:

while mapfile -t -n 10 ary && ((${#ary[@]})); do
    printf '%s\n' "${ary[@]}"
    printf -- '--- SNIP ---\n'
done < file

That's to read 10 lines at a time.
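
The ((${#ary[@]})) guard is what ends the loop: mapfile itself returns success even at end of input, so you stop once it comes back with an empty array. The chunk size is also easy to parametrize (a minimal sketch with the count in a variable):

n=25
while mapfile -t -n "$n" ary && ((${#ary[@]})); do
    printf '%s\n' "${ary[@]}"
    printf -- '--- SNIP ---\n'
done < file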

Squid answered 21/12, 2016 at 17:18 Comment(8)
This seems to be the most efficient and elegant approach and hence should have been the accepted answer. It would be great if an explanation for && ((${#ary[@]})) were added.Indissoluble
@Indissoluble && ((${#ary[@]})) essentially means 'keep going while there are lines to read' (the array comes back with a nonzero number of elements)Emaemaciate
Agree, good-looking code. I did some performance-testing in case anyone is interested(Intel(R) Core(TM) i3-7100U CPU @ 2.40GHz; 12G RAM). Any Suggestions to do improve? $ seq 1 10000000 >input.txt $ ls -lh input.txt -rwxrwxrwx 1 dima dima 76M Feb 17 23:09 input.txt $ time ./sqlite_stdin_batch_transactions.sh 1000 <input.txt >/dev/null Using batch size 1000 real 0m44.791s user 0m24.922s sys 0m19.578s @codeforester, @anubhava, @EmaemaciatePlaytime
This is the least user-friendly answer: ary && ((${#ary[@]})) is by no means simple, nor is the printf portion, whereas even novice bash developer can understand https://mcmap.net/q/277575/-read-n-lines-at-a-time-using-bash, https://mcmap.net/q/277575/-read-n-lines-at-a-time-using-bash or https://mcmap.net/q/277575/-read-n-lines-at-a-time-using-bashSimplicity
@Oliver: thanks for your comment and your downvote. Unfortunately, programming is not simple. The first answer you link is difficult to scale: what if you want to read 120 lines at a time? What if you want to read N lines at a time (where N is a variable)? The second answer you link is just broken, in case your stream/file contains single quotes. Maybe my method isn't user-friendly, but at least it is correct and scales without any problems.Squid
@Simplicity I won't edit it. Next time, think twice before you downvote, especially in areas you don't know very well... only downvote answers you KNOW are broken.Squid
@Squid I understand you want to "punish" Oliver, but by doing so you also "punish" other readers who don't understand the cryptic part of your solution.Pyrrhonism
While I don't agree with @Oliver's comment, I still commend you for explaining why you down-voted. This way people can decide whether or not they agree with your comment.Phytology
D
51

While the selected answer works, there is really no need for the separate file handle. Just using the read command on the original handle will function fine.

Here are two examples, one with a string, one with a file:

# Create a dummy file
echo -e "1\n2\n3\n4" > testfile.txt

# Loop through and read two lines at a time
while read -r ONE; do
    read -r TWO
    echo "ONE: $ONE TWO: $TWO"
done < testfile.txt

# Create a dummy variable
STR=$(echo -e "1\n2\n3\n4")

# Loop through and read two lines at a time
while read -r ONE; do
    read -r TWO
    echo "ONE: $ONE TWO: $TWO"
done <<< "$STR"

Running the above as a script would output (the same output for both loops):

ONE: 1 TWO: 2
ONE: 3 TWO: 4
ONE: 1 TWO: 2
ONE: 3 TWO: 4
Deathful answered 21/7, 2015 at 0:40 Comment(1)
Much simpler! cat input-file.txt | xargs -L 10 echo https://mcmap.net/q/277575/-read-n-lines-at-a-time-using-bashStanford
G
34

Simplest method - pretty self-explanatory. It is similar to the method provided by @Fmstrat, except the second read statement is before the do.

while read first_line; read second_line
do
    echo "$first_line" "$second_line"
done

You can use this by piping multiline input to it:

seq 1 10 | while read first_line; read second_line 
do
    echo "$first_line" "$second_line"
done

output:

1 2
3 4
5 6
7 8
9 10
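
The same pattern extends to any fixed count by adding more read commands before the do (a sketch for three lines at a time; a trailing partial group is dropped, as with the two-line version):

seq 1 9 | while read -r l1; read -r l2; read -r l3
do
    echo "$l1 $l2 $l3"
done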
Geiger answered 21/12, 2016 at 16:53 Comment(5)
The syntax seems weird, but help while actually mentions commands: while COMMANDS; do COMMANDS; done with the explanation Expand and execute COMMANDS as long as the final command in the 'while' COMMANDS has an exit status of zero.Tiptoe
seq -w 1 100 | while for i in {0..9}; do read $i; done; do echo "$0...$9"; doneThacher
@Mike-Furlender Just the solution I was looking for :+1:, thanks bro. Can I replace the ; with && instead: seq 1 10 | while read first_line && read second_line ... ?Seamy
@Seamy Yup, I believe so.Geiger
@Mike-Furlender Thanks.Seamy
S
24

That is much simpler! :)

cat input-file.txt | xargs -L 10 ./do_something.sh

or

cat input-file.txt | xargs -L 10 echo
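
One caveat noted in the comments below: xargs applies quote and backslash processing, so lines containing such characters break it. With GNU xargs you can instead treat each newline-terminated line as one argument (a sketch, assuming GNU xargs; do_something.sh is the placeholder from above):

# -d '\n' disables quote processing; -n 10 passes 10 lines per invocation
xargs -d '\n' -n 10 ./do_something.sh < input-file.txt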
Stanford answered 19/11, 2018 at 14:0 Comment(4)
the latter example can be abbreviated to <input-file.txt xargs -L 10Christo
in Alpine Linux, it would be: xargs -n 10 instead. I like this way because you can use it with variables, like echo "$var" | xargs ... (not just files).Skirting
This approach has problems if the input lines have characters meaningful to the shell. E.g. text with a single apostrophe will become an unterminated quote and halt processing.Sprocket
Simple, elegant way of chunking lines, exactly what I needed. Thank you!Sarawak
J
11

I don't think there is a way to do it natively in bash, but one can create a convenient function for doing so:

#
# Reads N lines from input, keeping further lines in the input.
#
# Arguments:
#   $1: number N of lines to read.
#
# Return code:
#   0 if at least one line was read.
#   1 if input is empty.
#
function readlines () {
    local N="$1"
    local i line
    local rc="1"

    # Read at most N lines
    for i in $(seq 1 "$N")
    do
        # Try reading a single line
        if IFS= read -r line
        then
            # Output the line
            echo "$line"
            rc="0"
        else
            break
        fi
    done

    # Return 1 if no lines were read
    return $rc
}

With this one can easily loop over N-line chunks of the data by doing something like

while chunk=$(readlines 10)
do
    echo "$chunk" | ... # Whatever processing
done

In this loop $chunk will contain 10 input lines at each iteration, except for the last one, which will contain the last lines of input; that might be fewer than 10, but always more than 0.
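
To read from a file (as asked in the comments below), redirect it into the loop instead of piping (a usage sketch):

while chunk=$(readlines 10)
do
    printf '%s\n' "$chunk"
done < input-file.txt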

Jerkwater answered 30/5, 2013 at 12:56 Comment(2)
I really like this solution, but how do you call it? How do you pass it a file name?Almeida
The while loop above will read from standard input. If you want to input a file to it just cat it to stdin.Jerkwater
H
3

I came up with something very similar to @albarji's answer, but more concise.

read_n() { for i in $(seq "$1"); do read -r || return; echo "$REPLY"; done; }

while lines="$(read_n 5)"; do
    echo "========= 5 lines below ============"
    echo "$lines"
done < input-file.txt

The read_n function will read $1 lines from stdin (use redirection to make it read from a file, just like the built-in read command). Because the exit code from read is maintained, you can use read_n in a loop as the above example demonstrates.
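
As the comments below note, a final chunk of fewer than $1 lines is dropped, because read_n passes on read's failing exit status even after printing some lines. A variant that succeeds whenever at least one line was emitted (a hedged sketch along the same lines):

read_n() {
    local i ok=1
    for i in $(seq "$1"); do
        IFS= read -r || break
        echo "$REPLY"
        ok=0   # at least one line was printed
    done
    return $ok
}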

Hydromancy answered 6/8, 2014 at 19:41 Comment(2)
Note that this does not output the last 1 to 4 lines, if the number of lines is not a multiple of 5.Axenic
Fails at reading the last lines, or the whole file if it's less than 5 lines long.Raskin
S
2

Depending on what you're trying to do, you can just store the previous lines.

LINE_COUNT=0
PREVLINE1=""
PREVLINE2=""
while read -r LINE; do
    LINE_COUNT=$((LINE_COUNT + 1))
    if [[ $LINE_COUNT == 3 ]]; then
        LINE_COUNT=0
        # do whatever you want to do with the 3 lines
        # (they are in $PREVLINE2, $PREVLINE1 and $LINE)
    fi
    PREVLINE2="$PREVLINE1"
    PREVLINE1="$LINE"
done < "$FILE_IN"
Suzannsuzanna answered 19/7, 2012 at 19:31 Comment(0)
E
1

Just use a for loop:

for i in $(seq 1 "$N") ; do read -r line ; lines+=$line$'\n' ; done
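
To provide input (as asked in the comments below), redirect a file into the loop; a minimal sketch, where input-file.txt is a placeholder and N=4:

N=4
lines=""
for i in $(seq 1 "$N"); do
    IFS= read -r line && lines+=$line$'\n'
done < input-file.txt
printf '%s' "$lines"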

In bash version 4, you can also use the mapfile command.

Elison answered 29/11, 2011 at 16:53 Comment(3)
yes, there is no explicit 'read N lines at a time' in bash; you have to construct it by concatenating values. @Elison's on the right track here, but needs to provide a source of input and manage more than one pass of input (resetting lines every N reads). Awk is really designed for such issues (but doesn't have an explicit 'read N lines' function either); you manage the N lines with the NR variable in awk. Sed could work too, but would be really ugly, and would require a lot of messing around to make N generic instead of hardcoded. Good luck to all.Adrianaadriane
I tried something like: for i in $(seq 1 4);do read line <101127_2_aa_1.fastq;lines+=$line$'\n';done but it didn't seem to work...Many
Actually, how do I provide input?Many
C
1

I know you asked about bash, but I am amazed that this works with zsh

#!/usr/bin/env zsh    
cat 3-lines.txt | read -d\4 my_var my_other_var my_third_var

Unfortunately, this doesn't work with bash, at least the versions I tried.

The "magic" here is the -d\4 (this doesn't work in bash), that sets the line delimiter to be the EOT character, which will be found at the end of your cat. or any command that produces output.

If you want to read an array of N items, bash has readarray and mapfile, which can read N lines from a file and save each line into one position of an array.

EDIT

After some tries, I just found out that this works with bash:

$ read -d# a b
Hello
World
#
$ echo $a $b
Hello World
$

However, I could not make { cat /tmp/file ; echo '#'; } | read -d# a b work :(
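
The pipeline fails because read runs in a subshell (every element of a pipeline does), so a and b vanish when it exits. Keeping read in the current shell with process substitution does work (a sketch):

read -d# a b < <(cat /tmp/file; echo '#')
echo "$a $b"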

Corey answered 3/6, 2016 at 0:59 Comment(0)
H
1

The echo simulates a file with two lines of input; use head -2 before paste if needed:

IFS=\; read A B < <(echo -en "X1 X2\nY1 Y2\n" | paste -s -d\;)

If you want to read lines in a loop and create pairs, and the lines contain only a single word each, use:

while read NAME VALUE; do 
    echo "$NAME=$VALUE"; 
done < <(echo -en "M\n1\nN\n2\nO\n3\n" | xargs -L2 echo)
Hazel answered 6/4, 2017 at 8:18 Comment(0)
F
0

Here's an alternative way of doing it:

# This will open the file so the user can start typing lines
cat > file
# When finished, Ctrl+D will close it
cat file | while read -r line; do
  # do some stuff here
  :
done
Fart answered 28/5, 2017 at 17:15 Comment(0)
S
0

Awk is a fun way to do this:

~$ cat test.txt 
tom
[email protected]
jack
[email protected]
marry
[email protected]
gogo
[email protected]
~$ cat test.txt | awk 'BEGIN{c=1}{a=c;if(a==2){print b" "$0;c=1} if(a==1){b=$0;c=2}}'
tom [email protected]
jack [email protected]
marry [email protected]
gogo [email protected]
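
For comparison, a more compact awk idiom produces the same pairing (a sketch):

awk 'NR % 2 { printf "%s ", $0; next } 1' test.txt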
Serrano answered 23/10, 2018 at 13:26 Comment(0)
S
0

After having looked at all the answers, I think the following is the simplest, i.e., more scripters will understand it than any other solution, but only for a small number of items:

while read -r var1 && read -r var2; do 
    echo "$var1" "$var2"
done < yourfile.txt

The multi-command approach is also excellent, but it is lesser-known syntax, although still intuitive:

while read -r var1; read -r var2; do 
    echo "$var1" "$var2"
done < yourfile.txt

It has the advantage that you don't need line continuations for a larger number of items:

while 
    read -r var1
    read -r var2
    ...
    read -r varN
do 
    echo "$var1" "$var2"
done < yourfile.txt

The xargs answer posted is also nice in theory, but in practice it is not so obvious how to process the combined lines. For example, one solution I came up with using this technique is:

while read -r var1 var2; do 
    echo "$var1" "$var2"
done <<< $(cat yourfile.txt | xargs -L 2 )

but again this uses the lesser-known <<< operator. However, this approach has the advantage that if your script was initially

while read -r var1; do 
    echo "$var1"
done < yourfile.txt

then extending it for multiple lines is somewhat natural:

while read -r var1 var2; do 
    echo "$var1" "$var2"
done <<< $(cat yourfile.txt | xargs -L 2 )

The straightforward solution

while read -r var1; do
    read -r var2
    echo "$var1" "$var2"
done < yourfile.txt

is the only other one that I would consider among the many given, for its simplicity, but syntactically it is not as expressive; compared to the && version or the multi-command version it does not feel quite right.

Simplicity answered 27/5, 2019 at 14:8 Comment(2)
@Squid If I have to spend more than a few seconds analysing an expression to understand its meaning, it is not one I want to use or encourage using, that's just my opinion among all the other opinions. Scalability, generality, performance, all have big costs in complexity, so it's important to keep in mind when they are necessary vs nice to have. Maintainability is important too, it is here and now.Simplicity
Your first two examples are not equivalent. To avoid cat with xargs: xargs [OPTIONS] < filename. Anyway, xargs is known to be a bad solution: it will break with quotes, so it's certainly something you don't want to use.Squid
S
0

To read every second line from a file (lines 2, 4, 6, 8, and so on), you can try this:

awk '!(NR % 2)' fileName
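
The same idea generalizes to every Nth line by passing N in as an awk variable (a sketch):

awk -v n=3 '!(NR % n)' fileName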

Soakage answered 24/5, 2020 at 12:24 Comment(0)
A
0

You can also group lines with awk:

$ seq -s ' ' 23 > file

$ cat file
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23

$ awk '(NR % 6 == 1) {print; for(i=1; i<6 && getline ; i++) { print }; printf "\n"}' RS=' ' ORS=' ' file
1 2 3 4 5 6 
7 8 9 10 11 12 
13 14 15 16 17 18 
19 20 21 22 23
 
Airlia answered 18/7, 2021 at 4:25 Comment(0)
R
0

Another option is to use the curly brace command grouping in bash.

{ read -r line1; read -r line2; } < test-file.txt

One thing to keep in mind, though, is that if you have set -e in effect, this will abort your script when the file being read has fewer lines than the number of variables you're attempting to fill, because the final read returns a non-zero status. One solution is simply to add || true to the end of the above line.
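
A sketch with the workaround applied, assuming errexit is on:

set -e
{ read -r line1; read -r line2; } < test-file.txt || true
echo "$line1 / $line2"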

Revamp answered 13/5, 2022 at 6:1 Comment(0)
