How do I use Head and Tail to print specific lines of a file
D

4

22

I want to output, say, lines 5-10 of a file, with the line numbers passed in as arguments.

How could I use head and tail to do this?

The arguments are filename = $1, firstline = $2 and lastline = $3.

Running it should look like this:

./lines.sh filename firstline lastline
Duologue answered 1/4, 2013 at 17:6 Comment(0)
T
16

Aside from the answers given by fedorqui and Kent, you can also use a single sed command:

#!/bin/sh
filename=$1
firstline=$2
lastline=$3

# Basics of sed:
#   1. sed commands have a matching part and a command part.
#   2. The matching part matches lines, generally by number or regular expression.
#   3. The command part executes a command on that line, possibly changing its text.
#
# By default, sed will print everything in its buffer to standard output.  
# The -n option turns this off, so it only prints what you tell it to.
#
# The -e option gives sed a command or set of commands (separated by semicolons).
# Below, we use two commands:
#
# ${firstline},${lastline}p
#   This matches lines firstline to lastline, inclusive
#   The command 'p' tells sed to print the line to standard output
#
# ${lastline}q
#   This matches line ${lastline}.  It tells sed to quit.  This command 
#   is run after the print command, so sed quits after printing the last line.
#   
sed -ne "${firstline},${lastline}p;${lastline}q" < "${filename}"

Or, to avoid any external utilities, if you're using a recent version of bash (or zsh):

#!/bin/bash

filename=$1
firstline=$2
lastline=$3

i=0
exec < "${filename}"     # redirect file into our stdin
while IFS= read -r ; do  # read each line, unmodified, into the REPLY variable
  i=$(( $i + 1 ))  # maintain line count

  if [ "$i" -ge "${firstline}" ] ; then
    if [ "$i" -gt "${lastline}" ] ; then
      break
    else
      printf '%s\n' "${REPLY}"
    fi
  fi
done
Trigeminal answered 1/4, 2013 at 17:37 Comment(5)
@Duologue All Unix commands have a standard input (stdin) and a standard output (stdout). The commands sed, awk, and many others (including head and tail) read their input from stdin, process it, and write the results to stdout. By default, stdin/stdout are attached to the terminal. Using redirection (e.g. COMMAND < FILE1 > FILE2), you can set stdin for COMMAND to read from FILE1 and stdout to write to FILE2. This is fundamental to Unix; you might want to check out a tutorial (ceri.memphis.edu/computer/docs/unix/bshell.htm) or a book to get a better understanding.Trigeminal
@Duologue Also, both awk and sed offer a great deal of power for text processing. If you're going to spend much time in Unix, these two commands (plus grep) will be very, very useful. They're worth learning, although they're easier to learn in little bits at a time. You might find sed & awk to be a useful book.Trigeminal
Thanks, great advice. There are a few things I'm unsure of, such as how sed matches lines with code as simple as ${firstline},${lastline}p. What is actually going on there?Duologue
@Duologue The ${firstline},${lastline} match only applies to p (the print command); ${lastline}q will only match ${lastline}, so we only quit after the last line.Trigeminal
@Duologue You should really read a tutorial on the Unix shell. You're conflating two distinct concepts (arguments and redirection). The <${filename} tells the shell to connect the file whose name is in the shell variable filename to the stdin of the sed process. sed receives its arguments on the command line (-e and "${firstline},${lastline}p;${lastline}q" are arguments). These arguments tell sed how to process what it reads from stdin, and what to write to stdout.Trigeminal
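To make the distinction between arguments and redirection concrete, here is a small sketch. The temporary sample file and the variable names are made up for the demonstration; all three forms should select the same two lines.

```shell
#!/bin/sh
# Build a small sample file (name and contents are assumptions for this demo).
sample=$(mktemp)
printf '%s\n' one two three four five > "$sample"

# 1. Redirection: the shell opens the file and attaches it to sed's stdin.
#    sed itself only sees the arguments -n and '2,3p'.
from_redirect=$(sed -n '2,3p' < "$sample")

# 2. Argument: sed receives the file name and opens the file itself.
from_argument=$(sed -n '2,3p' "$sample")

# 3. Pipe: sed's stdin is fed by another process instead of a file.
from_pipe=$(cat "$sample" | sed -n '2,3p')

printf '%s\n' "$from_redirect"
rm -f "$sample"
```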
D
42
head -n XX # <-- print first XX lines
tail -n YY # <-- print last YY lines

If you want lines 20 to 30, that means you want 11 lines, starting at line 20 and finishing at line 30:

head -n 30 file | tail -n 11
# 
# first 30 lines
#                 last 11 lines from those previous 30

That is, you first take the first 30 lines and then select the last 11 of them (30 - 20 + 1 = 11).

So in your code it would be:

head -n "$3" "$1" | tail -n "$(( $3 - $2 + 1 ))"

Or, with named variables (firstline = $2, lastline = $3, filename = $1):

head -n "$lastline" "$filename" | tail -n "$(( lastline - firstline + 1 ))"
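
One caveat with counting back from the end: if the file has fewer than lastline lines, head returns fewer lines than expected and tail -n with a count then selects the wrong ones. A minimal sketch that avoids this by using the tail -n +N form, which starts at an absolute line number. The function name print_lines is made up for illustration.

```shell
#!/bin/sh
# print_lines FILE FIRST LAST: print lines FIRST..LAST (inclusive) of FILE.
# print_lines is a hypothetical helper name, not part of the original answer.
print_lines() {
    # head keeps lines 1..LAST; tail -n +FIRST then drops lines 1..FIRST-1,
    # so a file shorter than LAST simply yields fewer lines.
    head -n "$3" "$1" | tail -n "+$2"
}

# Inside lines.sh this would be called as: print_lines "$1" "$2" "$3"
```

For an 8-line file and the range 5 to 10, this prints lines 5 to 8, whereas the count-based pipeline would print lines 3 to 8.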
Denticulate answered 1/4, 2013 at 17:12 Comment(5)
You could use tail -n +$2 instead.Lecompte
@AurélienOoms I'm not sure exactly what you mean here. In which case would you use this? After head, I assume.Denticulate
I meant head -n $3 $1 | tail -n +$2.Lecompte
Combining your answer with @dado's.Lecompte
@fedorqui'SOstopharming' You can also do it like this: tail -n +${firstline} $filename | head -n $(($lastline -$firstline + 1))Sigler
P
8

Try this one-liner:

awk -vs="$begin" -ve="$end" 'NR>=s&&NR<=e' "$f"

In the line above:

$begin is your $2
$end is your $3
$f is your $1
Padishah answered 1/4, 2013 at 17:11 Comment(5)
I prefer keeping it simple by using head and tail.Duologue
@Duologue OK, you are free to choose the solution that suits you best. We have different definitions of "simple". a) You need two processes with head and tail; awk is only one process. b) awk could also quit processing after reaching your $3 (I didn't do that in my answer), which matters if you have a monster file. c) awk could validate your $2 and $3, e.g. if $2 > $3, not process the file at all. You would have to write extra script code for your head/tail version. awk is simpler, isn't it?Padishah
OK, could you break down your one-liner and explain it?Duologue
-vs="$begin" sets the awk-internal variable s to the value stored in the shell variable $begin. The same goes for -ve. The main part (the awk script) is 'NR>=s&&NR<=e', which just means: print the line if its line number is between s and e. 'NR==5', for example, would print only line 5. You could also use the shell variables directly, but then you can't use single quotes, since the shell does not expand $foo inside them: awk "NR>=$begin&&NR<=$end" filename. But you could get into trouble with backslash-escaping in scripts.Misunderstood
@JustinSane It works (expanding shell variables inside the awk code), but it is not good practice. What if the $begin variable needs to be used 30 times? What if the variable is later renamed to $start? There are also the special-character cases.Padishah
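The early exit and the range check mentioned in the comments above could look roughly like this. It is a sketch, not the answerer's code: the function name awk_lines and the choice of exit status 1 for a reversed range are assumptions.

```shell
#!/bin/sh
# awk_lines FILE FIRST LAST (hypothetical name): print lines FIRST..LAST of FILE.
awk_lines() {
    awk -v s="$2" -v e="$3" '
        BEGIN  { if (s > e) exit 1 }  # reversed range: fail before reading input
        NR > e { exit }               # past the last wanted line: stop reading
        NR >= s                       # inside the range: print (default action)
    ' "$1"
}

# Example call: awk_lines yourfile.txt 5 10
```

Because awk stops at the first line past the range, a very large file is only read up to lastline + 1.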
W
5

Save this as "script.sh":

#!/bin/sh

filename="$1"
firstline=$2
lastline=$3
linestoprint=$(($lastline-$firstline+1))

tail -n "+$firstline" "$filename" | head -n "$linestoprint"

There is NO ERROR HANDLING (for simplicity), so you have to call the script as follows:

./script.sh yourfile.txt firstline lastline

$ ./script.sh yourfile.txt 5 10

If you need only line "10" from yourfile.txt:

$ ./script.sh yourfile.txt 10 10

Please make sure that: (firstline > 0) AND (lastline > 0) AND (firstline <= lastline)
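
If you do want the script to enforce those conditions itself, a check along the following lines is one option. This is only a sketch: the helper name valid_range is made up, and it also rejects leading zeros, since the shell's arithmetic expansion would read a value like 010 as octal.

```shell
#!/bin/sh
# valid_range FIRST LAST (hypothetical helper): succeed only if both values
# are positive decimal integers and FIRST <= LAST.
valid_range() {
    for n in "$1" "$2"; do
        case $n in
            # empty, leading zero (including 0 itself), or any non-digit
            ''|0*|*[!0-9]*) return 1 ;;
        esac
    done
    [ "$1" -le "$2" ]
}

# In script.sh this could guard the pipeline, for example:
# valid_range "$firstline" "$lastline" || { echo "usage: $0 file first last" >&2; exit 1; }
```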

Wilkins answered 25/1, 2014 at 19:2 Comment(1)
The + in tail -n +$firstline is the key.Nopar
