Suppressing summary information in `wc -l` output

Asked 29/12, 2016 at 18:32 Answered 4/8, 2023 at 10:4

I use the command wc -l to count the number of lines in my text files (I also want to sort everything through a pipe), like this:

wc -l $directory-path/*.txt | sort -rn

The output includes a "total" line, which is the sum of the lines of all files:

10 total
5 ./directory/1.txt
3 ./directory/2.txt
2 ./directory/3.txt

Is there any way to suppress this summary line? Or even better, to change the way the summary line is worded? For example, instead of "10", the word "lines" and instead of "total" the word "file".

Mars answered 29/12, 2016 at 18:32 Comment(4)

The man page for wc doesn't mention any such functionality. You can whip up a script (or probably use pipes and awk) to change the appearance of the output. – Karilla 29/12, 2016 at 18:36

Pipe it to tail +2 to skip the first line. – Chilson 29/12, 2016 at 18:43

@Barmar: That's unreliable. It only prints the total line if there's more than one file. And at least on my system, the total line is printed last -- as POSIX specifically requires. ipo: Do you really get the output you show, with the 10 total line at the top? – Archerfish 29/12, 2016 at 20:16

Based on your comments, I think you're seeing 10 total at the top because you're sorting the output. You need to mention that in the question. Show us the exact command you're running, and its exact output. And $directory-path is not a valid variable name. – Archerfish 29/12, 2016 at 22:18

Yet another `sed` solution!

1. short and quick

Since the total comes on the last line, $d is the sed command that deletes the last line.

wc -l $directory-path/*.txt | sed '$d'
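One caveat with $d: it deletes the last line unconditionally, while wc only prints a total when it is given more than one file. A minimal sketch of a guard, assuming the files are passed as arguments to a hypothetical helper named count_lines (the name is illustrative and not from this answer):

```shell
#!/bin/sh
# count_lines is a hypothetical helper name, not part of wc or sed.
# wc prints a "total" line only when it gets more than one file operand,
# so the last line is deleted only in that case.
count_lines() {
    if [ "$#" -gt 1 ]; then
        wc -l "$@" | sed '$d'
    else
        wc -l "$@"
    fi
}
```

It could be called as count_lines ./directory/*.txt | sort -rn, so the total is already gone before sorting.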

2. with header line addition:

wc -l $directory-path/*.txt | sed '$d;1ilines total'

Unfortunately, there is no alignment.

3. With alignment: formatting left column at 11 char width.

wc -l $directory-path/*.txt |
    sed -e '
        s/^ *\([0-9]\+\)/          \1/;
        s/^ *\([0-9 ]\{11\}\) /\1 /;
        /^ *[0-9]\+ total$/d;
        1i\      lines file'

This will do the job:

      lines file
          5 ./directory/1.txt
          3 ./directory/2.txt
          2 ./directory/3.txt

4. But if your `wc` version really could put the total on the 1st line:

This one is for fun, because I don't believe there is a wc version that puts the total on the 1st line, but...

This version drops the total line wherever it appears and adds a header line at the top of the output.

wc -l $directory-path/*.txt |
    sed -e '
        s/^ *\([0-9]\+\)/          \1/;
        s/^ *\([0-9 ]\{11\}\) /\1 /;
        1{
            /^ *[0-9]\+ total$/ba;
            bb;
           :a;
            s/^.*$/      lines file/
        };
        bc;
       :b;
        1i\      lines file' -e '
       :c;
        /^ *[0-9]\+ total$/d
    '

This is more complicated because we can't simply drop the 1st line: if it is the total line, it is replaced by the header instead.

Jaenicke answered 30/12, 2016 at 0:0 Comment(5)

I'm reasonably sure he's seeing the total on the first line because he's sorting the output. He's mentioned this in comments, but needs to say so in the question. And there's no indication that he wants or needs the "lines filename" header that your solutions produce. – Archerfish 30/12, 2016 at 0:22

Seems too complicated for such a small operation. – Bamford 30/12, 2016 at 1:21

@KeithThompson Asker said: For example, instead of "10", the word "lines" and instead of "total" the word "file" ! – Jaenicke 30/12, 2016 at 7:29

@Bamford It seems complicated because you won't try to understand. It's not as simple, but it's very quick and works as a standalone solution – Jaenicke 30/12, 2016 at 7:30

@Bamford Ok, there is a simpler sed version: 2 characters! – Jaenicke 30/12, 2016 at 7:39

This is actually fairly tricky.

I'm basing this on the GNU coreutils version of the wc command. Note that the total line is normally printed last, not first (see my comment on the question).

wc -l prints one line for each input file, consisting of the number of lines in the file followed by the name of the file. (The file name is omitted if there are no file name arguments; in that case it counts lines in stdin.)

If and only if there's more than one file name argument, it prints a final line containing the total number of lines and the word total. The documentation indicates no way to inhibit that summary line.

Other than the fact that it's preceded by other output, that line is indistinguishable from output for a file whose name happens to be total.

So to reliably filter out the total line, you'd have to read all the output of wc -l, and remove the final line only if the output has more than one line. (Even that can fail if you have files with newlines in their names, but you can probably ignore that possibility.)

A more reliable method is to invoke wc -l on each file individually, avoiding the total line:

for file in $directory-path/*.txt ; do wc -l "$file" ; done

And if you want to sort the output (something you mentioned in a comment but not in your question):

for file in $directory-path/*.txt ; do wc -l "$file" ; done | sort -rn
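A slightly more defensive sketch of the same loop. It assumes the directory is passed as the first argument to a hypothetical function named count_each and stored in a variable named dir (both names are illustrative; as noted in the comments on the question, $directory-path is not a valid variable name):

```shell
#!/bin/sh
# count_each is a hypothetical helper; "dir" is an assumed variable name.
# wc -l runs once per file, so no "total" line is ever produced.
count_each() {
    dir=$1
    for file in "$dir"/*.txt; do
        # If nothing matches, the glob stays literal; skip it.
        [ -e "$file" ] || continue
        wc -l "$file"
    done | sort -rn
}
```

Quoting "$dir" and "$file" keeps names with spaces intact, and the -e check avoids an error message from wc when the directory contains no .txt files.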

If you happen to know that there are no files named total, a quick-and-dirty method is:

wc -l $directory-path/*.txt | grep -v ' total$'

If you want to run wc -l on all the files and then filter out the total line, here's a bash script that should do the job. Adjust the *.txt as needed.

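#!/bin/bash

# Save the wc output, then count how many lines it produced.
wc -l *.txt > .wc.out
lines=$(wc -l < .wc.out)
if [[ $lines -eq 1 ]] ; then
    # One file: wc printed no total line, so show everything.
    cat .wc.out
else
    # Several files: drop the final (total) line.
    (( lines-- ))
    head -n "$lines" .wc.out
fi
rm .wc.out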

Another option is this Perl one-liner:

wc -l *.txt | perl -e '@lines = <>; pop @lines if scalar @lines > 1; print @lines'

@lines = <> slurps all the input into an array of strings. pop @lines discards the last line if there is more than one, i.e., if the last line is the total line.
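The same rule (drop the final line only when there is more than one line of output) can also be sketched as a streaming awk filter that needs no temporary file. The function name drop_total is hypothetical:

```shell
#!/bin/sh
# drop_total is a hypothetical helper name.
# It holds back one line at a time, so the final line is printed only
# when it is also the first line (single-file output, which has no total).
drop_total() {
    awk 'NR > 1 { print prev }
         { prev = $0 }
         END { if (NR == 1) print prev }'
}
```

It would sit between wc and sort, for example wc -l *.txt | drop_total | sort -rn, because the total is only guaranteed to be last before sorting.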

Archerfish answered 29/12, 2016 at 20:21 Comment(5)

Thanks for the detailed comment. But I have to use wc -l at the end, because I also have to sort them. That's not possible when I do wc -l on each file. The quick-and-dirty method is also not so good. Maybe I have a file named 'total'. – Mars 29/12, 2016 at 21:39

@ipo: Sure you can sort the output: for file in $directory-path/*.txt ; do wc -l "$file" ; done | sort -rn. (I'm assuming you're using a Bourne-derived shell like bash.) – Archerfish 29/12, 2016 at 22:14

@gniourf_gniourf: Done. (I thought I had; not sure how I missed that.) – Archerfish 29/12, 2016 at 22:20

You miss: /bin/ls -1 *.txt | xargs -n1 wc -l and/or find . -maxdepth 1 -name '*.txt' -exec wc -l {} \; ;-) – Jaenicke 30/12, 2016 at 8:29

@F.Hauri: I wouldn't say I "missed" those. I didn't intend to show all possible solutions. – Archerfish 30/12, 2016 at 17:10

The wc program always displays the total when there are two or more files (fragment of wc.c):

if (argc > 2)
    report ("total", total_ccount, total_wcount, total_lcount);
return 0;

So the easiest approach is to call wc with only one file at a time and let find present the files to wc one after the other:

find $dir -name '*.txt' -exec wc -l {} \;

Or as specified by liborm.

dir="."
find $dir -name '*.txt' -exec wc -l {} \; | sort -rn | sed 's/\.txt$//'

Columbary answered 29/12, 2016 at 20:39 Comment(7)

That's nearly the solution! But I need to pipe this one as well to | sort -rn | sed 's/\.txt$//'. Where should I place this pipe? I tried find $dicitonary-path/*.txt-exec wc -l {} \ | sort -rn | sed 's/\.txt$//'; ...but this is wrong. – Mars 29/12, 2016 at 22:1

I think you're missing a -name argument in your find command. – Archerfish 29/12, 2016 at 22:15

@ipo like that, but without the typos.. find $PATH -name '*.txt' -exec wc -l {} \; | sort -rn | sed 's/\.txt$//' – Danged 29/12, 2016 at 22:32

@Danged: Thanks, I have put your command inside my response. If it's a problem, I can remove it. – Columbary 29/12, 2016 at 23:43

@Keith Thompson: You're right, thanks for your help. – Columbary 29/12, 2016 at 23:50

@ipo: Why would you want to strip the .txt portion of the file names (sed 's/\.txt$//')? You really need to update your question and state the problem more precisely. Read this: minimal reproducible example – Archerfish 29/12, 2016 at 23:53

It's 2 or more files, not more than 2 files. argc is the number of arguments including argv[0], which is the program name ("wc"). – Archerfish 30/12, 2016 at 0:23

This is a job tailor-made for head:

wc -l | head --lines=-1

This way, you can still run in one process.

Laundrywoman answered 2/5, 2022 at 12:49 Comment(1)

There are a lot of complicated solutions from people having fun with the problem, but head -n -1 before sorting seems best. Surprising that wc does not have a quiet or script use mode. – Squarerigger 15/7, 2022 at 14:28

Can you use another wc?

The POSIX wc man page (man -s1p wc) says:
If more than one input file operand is specified, an additional line shall be written, of the same format as the other lines, except that the word total (in the POSIX locale) shall be written instead of a pathname and the total of each column shall be written as appropriate. Such an additional line, if any, is written at the end of the output.

You said the total line was the first line, the manual states it's the last, and other wc's don't show it at all. Removing the first or last line is dangerous, so I would grep -v the line with the total (in the POSIX locale...), or just grep the slash that's part of all other lines:

wc -l $directory-path/*.txt | grep "/"
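This relies on every file operand containing a slash, which holds whenever the glob carries a directory prefix. A small sketch, assuming a hypothetical function name counts_by_slash with the directory passed as its first argument:

```shell
#!/bin/sh
# counts_by_slash is a hypothetical helper name.
# Each operand is written as "$1"/name.txt, so every per-file line
# contains a "/", while the summary line ("<n> total") does not.
counts_by_slash() {
    wc -l "$1"/*.txt | grep '/'
}
```

For files in the current directory, passing . as the argument gives names like ./1.txt, so the slash is still present. A file named total.txt is kept, because its line contains a slash too.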

Och answered 29/12, 2016 at 20:16 Comment(0)

Not the most optimized way since you can use combinations of cat, echo, coreutils, awk, sed, tac, etc., but this will get you what you want:

wc -l ./*.txt | awk 'BEGIN{print "Line\tFile"}1' | sed '$d'

wc -l ./*.txt will extract the line count. awk 'BEGIN{print "Line\tFile"}1' will add the header titles. The 1 is an always-true pattern with no action, so awk prints every input line unchanged. sed '$d' will print all lines except the last one.

Example Result

Line    File
      6 ./test1.txt
      1 ./test2.txt

Venial answered 29/12, 2016 at 21:2 Comment(2)

All i get is something like this 'Line File' above '10 total'. So like your example, but with the total-information again. – Mars 29/12, 2016 at 21:52

@ipo: what kind of system are you running? I'm using zsh on an OSX system. My total line count appears at the end. Try using this: wc -l ./*.txt | awk 'BEGIN{print "Line\tFile"}1' | sed '2d'. The only difference is that the sed should delete the 2nd line, not the last line now. – Venial 29/12, 2016 at 21:56

The simplicity of using just `grep -c`

I rarely use wc -l in my scripts because of these issues. I use grep -c instead. Though it is not as efficient as wc -l, we don't need to worry about other issues like the summary line, white space, or forking extra processes.

For example:

/var/log# grep -c '^' *
alternatives.log:0
alternatives.log.1:3
apache2:0
apport.log:160
apport.log.1:196
apt:0
auth.log:8741
auth.log.1:21534
boot.log:94
btmp:0
btmp.1:0
<snip>

Very straightforward for a single file:

line_count=$(grep -c '^' my_file.txt)
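One difference to keep in mind: wc -l counts newline characters, while grep -c '^' counts lines, including a final line that has no trailing newline. A minimal sketch that makes the two comparable (count_wc and count_grep are hypothetical helper names):

```shell
#!/bin/sh
# Hypothetical helper names, for illustration only.
# wc -l counts newline characters; grep -c '^' counts lines, including a
# final line that lacks a trailing newline.
count_wc()   { wc -l < "$1" | tr -d ' '; }   # tr strips padding some wc versions add
count_grep() { grep -c '^' "$1"; }
```

For a file created with printf 'one\ntwo' (no final newline), count_wc reports 1 and count_grep reports 2. For files that end with a newline the two agree.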

Performance comparison: `grep -c` vs `wc -l`

/tmp# ls -l *txt
-rw-r--r-- 1 root root 721009809 Dec 29 22:09 x.txt
-rw-r----- 1 root root 809338646 Dec 29 22:10 xyz.txt

/tmp# time grep -c '^' *txt

x.txt:7558434
xyz.txt:8484396

real    0m12.742s
user    0m1.960s
sys     0m3.480s

/tmp/# time wc -l *txt
   7558434 x.txt
   8484396 xyz.txt
  16042830 total

real    0m9.790s
user    0m0.776s
sys     0m2.576s

Bamford answered 29/12, 2016 at 22:0 Comment(3)

But grep -c . counts non-empty lines. You'll probably want grep -c '' as an approximation of wc -l (the two differ by one if the last “line” doesn't end with a newline). – Marci 29/12, 2016 at 22:16

Wonderful observation, @gniourf_gniourf. I changed the command to grep -c '^'. – Bamford 29/12, 2016 at 22:18

grep -c '^' also differs by one from wc -l if the last line doesn't end with a newline. In fact grep (at least the GNU version) always silently appends a newline if the last line doesn't have one. – Archerfish 30/12, 2016 at 0:28

You can solve it (and many other problems that appear to need a for loop) quite succinctly using GNU Parallel like this:

parallel wc -l ::: tmp/*txt

Sample Output

   3 tmp/lines.txt
   5 tmp/unfiltered.txt
  42 tmp/file.txt
   6 tmp/used.txt

Squabble answered 29/12, 2016 at 22:16 Comment(2)

parallel -j1 if your files are really big, otherwise you'll clog your disk with parallel requests for data.. – Danged 29/12, 2016 at 22:34

Possibly, though many folk run very fast SSDs nowadays and there was no indication that OP is using excessively large files and it could actually be an advantage to use GNU Parallel there anyway. – Squabble 29/12, 2016 at 22:39

Similar to Mark Setchell's answer, you can also use xargs with an explicit replacement string:

ls | xargs -I% wc -l %

With -I, xargs does not pass all the inputs to a single wc invocation; it runs wc once per input line, so no total is printed.

Halford answered 3/5, 2021 at 14:30 Comment(0)

Shortest answer:

ls | xargs -l wc

Impound answered 31/3, 2022 at 12:14 Comment(0)

What about using sed with a pattern-based delete, as below? It only removes the total line if it is present (but it also removes any files with total in their names).

wc -l $directory-path/*.txt | sort -rn | sed '/total/d'
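A tighter pattern reduces the collateral damage. This is only a sketch, assuming the POSIX locale (where the summary word is "total") and using a hypothetical function name strip_total:

```shell
#!/bin/sh
# strip_total is a hypothetical helper name.
# Deletes only lines that consist of optional padding, a number, one space
# and the word "total"; names that merely contain "total" are kept.
strip_total() {
    sed '/^ *[0-9][0-9]* total$/d'
}
```

It can replace the last stage of the pipeline above. A file that is literally named total and is passed without a directory prefix would still be removed, since its line looks exactly like the summary line.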

Nevsa answered 8/6, 2022 at 7:20 Comment(0)

While most of the answers center around removing the unneeded line, or using a version of wc that allows suppressing it, there's something to be said in favor of never producing it in the first place.

So you want to count lines in $directory-path/*.txt files, however feeding several files to wc will produce the total — which you don't want.

I would change your pipeline to find the files and feed them to wc one by one, in this manner:

find $directory-path -name "*.txt" | xargs -L 1 wc -l | sort -rn

In this case, find is tasked with locating files, while xargs -L 1 is tasked with feeding them to wc one by one.
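If file names may contain spaces, a NUL-separated variant is safer. This sketch assumes your find and xargs support -print0 and -0 (the GNU and BSD versions do) and uses a hypothetical function name count_found:

```shell
#!/bin/sh
# count_found is a hypothetical helper name.
# -print0 / -0 separate names with NUL bytes, so names containing spaces
# survive; -n 1 makes xargs start one wc per file, so no total is printed.
count_found() {
    find "$1" -type f -name '*.txt' -print0 | xargs -0 -n 1 wc -l | sort -rn
}
```

Note that find descends into subdirectories here, just as in the pipeline above.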

Moises answered 4/8, 2023 at 10:4 Comment(0)
