How do I extract lines from a file using their line number on unix?

Using sed or similar how would you extract lines from a file? If I wanted lines 1, 5, 1010, 20503 from a file, how would I get these 4 lines?

What if I have a fairly large number of lines I need to extract? If I had a file with 100 lines, each representing a line number that I wanted to extract from another file, how would I do that?

Precessional answered 6/1, 2010 at 23:6 Comment(0)
R
17

Something like "sed -n '1p;5p;1010p;20503p' file". Run "man sed" for details.

For your second question, I'd transform the file of line numbers into a bunch of sed(1) commands that print the lines I wanted.
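A minimal sketch of that idea, assuming the line numbers sit one per line in a file. The helper name extract_lines and the demo file names are made up for illustration; they are not from the answer. Each number N is rewritten into the sed command "Np", the result is saved as a script, and sed -f runs it against the data:

```shell
#!/bin/sh
# extract_lines NUMFILE DATAFILE   (hypothetical helper name)
# Prints the lines of DATAFILE whose numbers are listed, one per line, in NUMFILE.
extract_lines() {
    script=$(mktemp) || return 1
    # Turn each purely numeric line N into the sed command "Np";
    # blank or non-numeric lines are dropped.
    sed -n 's/^[0-9][0-9]*$/&p/p' "$1" > "$script"
    # -n suppresses the default output; -f reads the generated commands from the file.
    sed -n -f "$script" "$2"
    rm -f "$script"
}

# Demo with throwaway files (names are illustrative only).
demo_dir=$(mktemp -d)
printf '%s\n' alpha beta gamma delta epsilon > "$demo_dir/data_file"
printf '%s\n' 2 4 > "$demo_dir/line_num_file"
extract_lines "$demo_dir/line_num_file" "$demo_dir/data_file"   # prints beta, then delta
rm -rf "$demo_dir"
```

Lines come out in the order they appear in the data file, not in the order the numbers are listed.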

Roee answered 6/1, 2010 at 23:9 Comment(2)
+1, the thing to look up for the second part of the answer is "sed -f". - Lodi
sed -n '1p;5p;1010p;20503p' inputFile.txt > outputFile.txt - Nobby
A
6

With awk it's as simple as:

awk 'NR==1 || NR==5 || NR==1010 || NR==20503' "file"
Airsick answered 6/1, 2010 at 23:10 Comment(2)
@michael, nonsense, awk can do that too. - Barby
@ennuikiller, yes, I was mostly commenting on the +1 for using awk in this context. @ghostdog74, so can perl, python, pure bash, etc. The right tool for the job is a matter of opinion. - Lodi
B
3

@OP, you can do this more easily and efficiently with awk. For your first question:

awk 'NR~/^(1|5|1010|20503)$/{print}' file

For your second question:

awk 'FNR==NR{a[$1];next}(FNR in a){print}' file_with_linenr file
Barby answered 7/1, 2010 at 0:41 Comment(1)
The second command is a bit obfuscated. To explain: FNR==NR is true only while reading file_with_linenr, not file. For those lines, the first field (the line number) is added as a key to the set a, and execution skips to the next line of input. Thus when reading file, only the (FNR in a) rule applies, and it prints the line if its number was put in a while parsing file_with_linenr. - Solicit
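To see the two-file pattern in action, here is a small sketch with throwaway files. The function name pick_lines and the sample contents are assumptions for illustration; the awk program is the one from the answer with a longer array name and comments:

```shell
#!/bin/sh
# pick_lines NUMFILE DATAFILE   (hypothetical helper name)
pick_lines() {
    awk 'FNR == NR { wanted[$1]; next }   # first file: remember each line number
         (FNR in wanted)                  # second file: print lines whose number was remembered
        ' "$1" "$2"
}

# Demo with throwaway files (names and contents are illustrative only).
demo_dir=$(mktemp -d)
printf '%s\n' alpha beta gamma delta epsilon > "$demo_dir/file"
printf '%s\n' 4 2 > "$demo_dir/file_with_linenr"
pick_lines "$demo_dir/file_with_linenr" "$demo_dir/file"   # prints beta, then delta
rm -rf "$demo_dir"
```

The numbers can be listed in any order; the output follows the order of the data file because awk reads it front to back once.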
A
1

This ain't pretty and it could exceed command length limits under some circumstances*:

sed -n "$(while read a; do echo "${a}p;"; done < line_num_file)" data_file

Or its much slower but more attractive, and possibly more well-behaved, sibling:

while read a; do echo "${a}p;"; done < line_num_file | xargs -I{} sed -n \{\} data_file

A variation:

xargs -a line_num_file -I{} sed -n \{\}p\; data_file

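You can speed up the xargs versions a little bit by adding the -P option with some large argument like, say, 83 or maybe 419 or even 1177, but 10 seems as good as any. Note that with -P the sed processes run in parallel, so the lines may come out in a different order than the line numbers were listed.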

*xargs --show-limits </dev/null can be instructive

Alvin answered 7/1, 2010 at 5:21 Comment(0)
F
0

I'd investigate Perl, since it has the regexp facilities of sed plus the programming model surrounding it to allow you to read a file line by line, count the lines and extract according to what you want (including from a file of line numbers).

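my %wanted = map { $_ => 1 } (1, 5, 1010, 20503);  # line numbers to extract
my $row = 1;
while (<STDIN>) {
   # the line is captured in $_; print it if $row is in the wanted set.
   print if $wanted{$row};
   $row++;
}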
Feel answered 6/1, 2010 at 23:8 Comment(3)
And you can use perl -e 'perlcode here' from the command prompt. Perl also has a range operator .. as in 3..12, which lets you create a list of numbers where needed. - Complimentary
You should be using $., which automagically contains the current line number. - Intendance
Anybody interested in Perl command line techniques might want to look at Minimal Perl, from Manning... manning.com/maher - Hekking
S
0

In Perl:

perl -ne 'print if $. =~ m/^(1|5|1010|20503)$/' file
Sher answered 17/3, 2010 at 19:27 Comment(0)
