Delete third-to-last line of file using sed or awk

I have several text files with different numbers of lines, and I need to delete the third-to-last line in each of them. Here is a sample file:

bear
horse
window
potato
berry
cup

Expected result for this file:

bear
horse
window
berry
cup

Can we delete the third-to-last line of a file:
a. not based on any string/pattern.
b. based only on a condition that it has to be the third-to-last line

I have trouble indexing my files starting from the last line. I tried this, taken from another SO question about the second-to-last line:

> sed -i 'N;$!P;D' output1.txt
Collision answered 24/9, 2020 at 12:29 Comment(1)
sed is the wrong tool for anything other than s/old/new/. If you're using any sed constructs other than s, g, and p (with -n) then you should be using awk instead. - Sherronsherry
D
6

With a tac + awk solution, you could try the following. Just set the awk variable line to the number of the line (counted from the bottom) that you want to skip.

tac Input_file | awk -v line="3" 'line==FNR{next} 1' | tac

Explanation: tac prints Input_file in reverse (from the last line to the first) and pipes it to awk. When the current line number FNR equals line, that line is skipped with next; the trailing 1 prints every other line. The second tac restores the original order.

2nd solution: With awk + wc solution, kindly try following.

awk -v lines="$(wc -l < Input_file)" -v skipLine="3" 'FNR!=(lines-skipLine+1)' Input_file

Explanation: The awk variable lines holds the total number of lines in Input_file, and skipLine holds the position, counted from the bottom, of the line to skip. The main program prints every line whose number FNR is not equal to lines-skipLine+1.
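
To check the arithmetic on the sample from the question: with 6 lines and skipLine=3, the line dropped is number 6-3+1 = 4, which is potato. A minimal sketch follows; the function name delete_from_end and the temporary file are illustrative assumptions, not part of the command above.

```shell
# Recreate the six-line sample from the question in a temporary file.
sample=$(mktemp)
printf '%s\n' bear horse window potato berry cup > "$sample"

# delete_from_end FILE N: print FILE without its Nth line counted from the end.
# (Hypothetical wrapper around the awk + wc command.)
delete_from_end() {
    awk -v lines="$(wc -l < "$1")" -v skipLine="$2" \
        'FNR != (lines - skipLine + 1)' "$1"
}

delete_from_end "$sample" 3
```

Note that wc -l counts newline characters, so this assumes the file ends with a newline; a file whose last line is unterminated would be counted one short.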

3rd solution: Adding solution as per Ed sir's comment here.

awk -v line=3 '{a[NR]=$0} END{for (i=1;i<=NR;i++) if (i != (NR-line+1)) print a[i]}' Input_file

Explanation: A detailed explanation of the 3rd solution.

awk -v line=3 '             ##Start the awk program and set the variable line to 3 (position, counted from the bottom, of the line to skip).
{
  a[NR]=$0                  ##Store the current line in array a under index NR.
}
END{                        ##Start the END block, which runs after all input is read.
  for(i=1;i<=NR;i++){       ##Loop over every stored line number.
    if(i != (NR-line+1)){   ##NR-line+1 is the number of the line to skip; print all others.
      print a[i]            ##Print the stored line with index i.
    }
  }
}
' Input_file                ##Name of the input file.
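
If holding the whole file in an array is a concern for large inputs, a variation (not from the answer above) keeps only the last n lines in a circular buffer. This is a sketch under that assumption; the function name drop_nth_from_end is chosen for illustration.

```shell
# drop_nth_from_end N [FILE...]: print the input without its Nth line from the end.
# Hypothetical helper; reads standard input when no FILE is given.
drop_nth_from_end() {
    n=$1
    shift
    awk -v n="$n" '
        # Line NR-n sits in slot NR % n. It can no longer be the Nth line
        # from the end, so print it before the slot is overwritten.
        NR > n { print buf[NR % n] }
        { buf[NR % n] = $0 }
        END {
            # The buffer now holds the last n lines; the oldest of them is
            # the line to drop. Inputs shorter than n lines are printed whole.
            start = (NR >= n) ? NR - n + 2 : 1
            for (i = start; i <= NR; i++) print buf[i % n]
        }
    ' "$@"
}
```

For example, drop_nth_from_end 3 Input_file would print the file without its third-to-last line. With several files, NR keeps counting across them, so they are treated as one concatenated input.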
Dalston answered 24/9, 2020 at 12:43 Comment(1)
and chances are using an awk variable is overkill and the OP would be happy with NR!=3. - Sherronsherry
M
8

With ed

ed -s ip.txt <<< $'$-2d\nw'

# thanks Shawn for a more portable solution
printf '%s\n' '$-2d' w | ed -s ip.txt

This edits the file in place. $ refers to the last line, and you can give it a negative relative offset, so $-2 refers to the third-to-last line. The w command then writes the changes.

See ed: Line addressing for more details.

Miltie answered 24/9, 2020 at 13:6 Comment(2)
I love seeing people besides me using ed in answers. It will be a thing again! (Though I'd use printf '%s\n' '$-2d' w | ed -s ip.txt instead of relying on bashisms) - Odious
@Odious all those posts have made an impression on me, I think this is the first I've tried it out for a SO answer.. I'll likely try to learn more about it and write a blog post - Miltie
B
5

This might work for you (GNU sed):

sed '1N;N;$!P;D' file

Open a window of 3 lines in the file then print/delete the first line of the window until the end of the file.

At the end of the file, do not print the first line in the window i.e. the 3rd line from the end of the file. Instead, delete it, and repeat the sed cycle. This will try to append a line after the end of file, which will cause sed to bail out, printing the remaining lines in the window.
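
The three-line window described above can be written out one command per line. This is only a sketch: the wrapper name drop_third_to_last is hypothetical, and it relies on the GNU sed behaviour of printing the pattern space when N finds no next line.

```shell
# drop_third_to_last [FILE...]: hypothetical wrapper around the GNU sed one-liner.
# Reads standard input when no FILE is given.
drop_third_to_last() {
    sed '
        # First cycle only: append line 2, so the window starts two lines deep.
        1N
        # Append the next line; the window now holds three lines.
        N
        # Unless the last line has been read, print the first line of the window.
        $!P
        # Drop the first line of the window and restart without reading input.
        D
    ' "$@"
}

printf '%s\n' bear horse window potato berry cup | drop_third_to_last
```

On the sample from the question, the final window is potato, berry, cup; P is skipped for it, D drops potato, and the remaining two lines are printed when N hits the end of the input.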

A generic solution for deleting the nth line from the end (where n is 2 or more; here n is the 3 in the s command) is:

sed ':a;N;s/[^\n]*/&/3;Ta;$!P;D' file 

Of course you could use:

tac file | sed 3d | tac

But then you would be reading the file 3 times.

Banded answered 24/9, 2020 at 15:34 Comment(1)
For a large input, tac | sed '3d' | tac will be faster than sed '1N;N;$!P;D' file (tested), while the command group of head and tail (see: https://mcmap.net/q/1594405/-delete-third-to-last-line-of-file-using-sed-or-awk) will be several times faster still, because it doesn't process the lines one by one but seeks to the line position and dumps the content. Similarly, tac is fast like cat, and sed '3d' simply passes lines through, whereas the sed that folds lines into a window reads and writes the same lines again for the whole input. - Yahrzeit
Y
2

To delete the 3rd-to-last line of a file, you can use head and tail:

{ head -n -3 file; tail -2 file; }

With a large input file, when performance matters, this is very fast because it doesn't read and write line by line. Also, keep the semicolons and the spaces next to the braces as they are; see command grouping.
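
Since the question asks about several files, the command group can be wrapped in a loop that rewrites each file. A minimal sketch, assuming GNU head (a negative count for -n is a GNU extension); the helper name delete_third_to_last is hypothetical.

```shell
# delete_third_to_last FILE...: rewrite each FILE without its third-to-last line.
# Hypothetical helper; files with fewer than three lines are left unchanged.
delete_third_to_last() {
    for f in "$@"; do
        tmp=$(mktemp) || return 1
        # Write to a temporary file first: redirecting straight onto "$f"
        # would truncate it before head and tail could read it.
        if { head -n -3 "$f"; tail -n 2 "$f"; } > "$tmp"; then
            # Copy the content back so the original file keeps its permissions.
            cat "$tmp" > "$f"
        fi
        rm -f "$tmp"
    done
}
```

It could then be called as delete_third_to_last output1.txt output2.txt, or with a glob such as *.txt.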


Or use sed with tac:

tac file | sed '3d' | tac

Or use awk with tac:

tac file | awk 'NR!=3' | tac
Yahrzeit answered 24/9, 2020 at 15:12 Comment(0)
