Use find, wc, and sed to count lines

D

8

30

I am trying to use sed to count the total number of lines across all files with a particular extension.

find -name '*.m' -exec wc -l {} \; | sed ...

That is what I have so far. How would I include sed in this line to get the total?

Dragonhead answered 11/9, 2009 at 17:24 Comment(0)

S

57

You can also get the nicely formatted output of wc with:

wc `find -name '*.m'`

Savick answered 11/9, 2009 at 17:37 Comment(6)

Add -type f to avoid the case of a directory name that matches: wc $(find -type f -name '*.m') – Rinaldo 11/9, 2009 at 18:53

wc $(find -name '*.m') is prettier. – Lynsey 7/10, 2016 at 16:50

This will break for any file names which contain whitespace or other shell metacharacters. – Casper 22/8, 2021 at 13:0

@Casper How would you handle whitespace in this instance? I currently have this issue. – Aubade 11/1, 2023 at 10:42

@Workingdollar mywiki.wooledge.org/BashFAQ/020 and/or several of the other answers on this page. – Casper 11/1, 2023 at 10:43

@Casper Thank you for the documentation. I found that this worked best for me: find . -name '*.m' -exec wc {} \; – Aubade 11/1, 2023 at 10:53
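
One way to handle the whitespace problem raised in the comments above, sketched on throwaway data: let find hand the names to wc itself with -exec ... {} +, so the shell never gets a chance to split them. The temporary directory and the file names are made up for the demonstration.

```shell
# Build a scratch tree containing a file name with a space (hypothetical demo data).
demo_dir=$(mktemp -d)
printf 'a\nb\n' > "$demo_dir/plain.m"
printf 'c\nd\ne\n' > "$demo_dir/with space.m"

# find passes each path to wc as its own argument, so no word splitting happens.
report=$(find "$demo_dir" -type f -name '*.m' -exec wc -l {} +)
printf '%s\n' "$report"

rm -rf "$demo_dir"
```

With only a handful of files this prints one count per file plus a single total line. With a very large number of files, find may run wc more than once, and each run prints its own total.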

G

19

Most of the answers here won't work well for a large number of files. Some will break if the list of file names is too long for a single command line call, others are inefficient because -exec starts a new process for every file. I believe a robust and efficient solution would be:

find . -type f -name "*.m" -print0 | xargs -0 cat | wc -l

Using cat in this way is fine, as its output is piped straight into wc so only a small amount of the files' content is kept in memory at once. If there are too many files for a single invocation of cat, cat will be called multiple times, but all the output will still be piped into a single wc process.

Garpike answered 7/1, 2012 at 7:17 Comment(3)

Or use the standard/portable form instead: find . -type f -name '*.m' -exec cat {} + | wc -l. – Claar 5/9, 2016 at 10:6

How can I modify this to print the total lines for each file along with the file name? – Kigali 20/9, 2018 at 9:17

@Kigali Then trivially run -exec wc -l {} + instead of -print0 | xargs .... Using + with -exec might end up running more than one instance of wc -l and then you will need to sum the totals from each run for the overall total. Or if you don't care about totals, just remove those lines with grep -v; or, use -exec wc -l {} \; to run a separate instance of wc on each file, at a somewhat higher processing cost. – Casper 11/1, 2023 at 10:47
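
A rough sketch of the "sum it yourself" idea from the comment above. Rather than adding up the total lines (a wc run that receives a single file prints no total at all), it skips them and adds up the per-file counts. The scratch directory and file names are invented for the demo, and the awk filter assumes no file name contains a newline.

```shell
# Scratch data (hypothetical): two files, 2 + 3 lines.
demo_dir=$(mktemp -d)
printf '1\n2\n' > "$demo_dir/a.m"
printf '1\n2\n3\n' > "$demo_dir/b c.m"

# Per-file lines look like "<count> <path>"; wc's summary line is exactly "<count> total".
# Skip the summary lines and add up the per-file counts, however many times wc was run.
grand_total=$(find "$demo_dir" -type f -name '*.m' -exec wc -l {} + |
  awk 'NF == 2 && $2 == "total" { next } { sum += $1 } END { print sum + 0 }')
echo "$grand_total"

rm -rf "$demo_dir"
```

Because find always prefixes the starting directory to each path, a real file can never show up as the bare word total in the second field.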

O

6

You can cat all files through a single wc instance to get the total number of lines:

find . -name '*.m' -exec cat {} \; | wc -l

Octane answered 11/9, 2009 at 17:32 Comment(0)

S

5

On modern GNU platforms, find has a -print0 option and wc has a --files0-from option. They can be combined into a command that counts the lines in each file and prints a total at the end. Example:

find . -name '*.c' -type f -print0 | wc -l --files0-from=-

Sadducee answered 5/6, 2011 at 7:38 Comment(0)

L

4

You could also use sed in place of wc to count lines:

 find . -name '*.m' -exec sed -n '$=' {} \;

where '$=' is a "special variable" that keep the count of lines

EDIT

you could also try something like sloccount

Ladylove answered 11/9, 2009 at 17:28 Comment(5)

find . -name '*.m' -exec sed -n 'where $=' {} \; Is this it? – Dragonhead 11/9, 2009 at 17:36

That is not the total though, added together. – Dragonhead 11/9, 2009 at 17:37

OK, I ended up with this. $ find . -name '*.m' -exec sed -n '$=' {} \; | sum - 22696 1 – Dragonhead 11/9, 2009 at 17:38

where '$=' is a "special variable" that keep the count of lines. You jest sire! $= represents an address and a command. The $ addresses the last line, and the = command prints the current line number; combined with the -n switch, which suppresses printing of the pattern space, the result is the number of lines fed to it. – Warenne 13/12, 2011 at 13:45
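
A quick way to see the address-plus-command reading described in the comment above, using three lines of made-up input:

```shell
# $ addresses the last line, = prints the current line number, -n silences everything else.
line_count=$(printf 'one\ntwo\nthree\n' | sed -n '$=')
echo "$line_count"

# Swapping the address shows that = is an ordinary command: this prints the number of line 2.
second=$(printf 'one\ntwo\nthree\n' | sed -n '2=')
echo "$second"
```

The first command prints the number of the last line, which is the line count; the second prints only the line number matched by the address 2.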

@Ladylove Thanks. How to print the filename before the count? – Kigali 20/9, 2018 at 9:13

Z

3

Hm, the solution with cat may be problematic if you have many files, especially big ones.

The second solution doesn't give a total, just the lines per file, as I tested.

I'd prefer something like this:

find . -name '*.m' | xargs wc -l | tail -1

This will do the job fast, no matter how many files you have or how big they are.

Zeller answered 11/9, 2009 at 17:48 Comment(1)

If there are too many files for a single command line, xargs will chunk them and this will only give the total for the final chunk. – Garpike 7/1, 2012 at 7:16
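
The chunking problem from the comment above can be reproduced on a small scale by forcing xargs to use tiny batches with -n 2. This is only an illustration: the scratch files are invented, and real chunking happens at a much larger, system-dependent size.

```shell
# Scratch data (hypothetical): three files with 1 + 2 + 3 = 6 lines.
demo_dir=$(mktemp -d)
printf '1\n' > "$demo_dir/a.m"
printf '1\n2\n' > "$demo_dir/b.m"
printf '1\n2\n3\n' > "$demo_dir/c.m"

# -n 2 makes xargs run wc twice: once on two files, once on the remaining one.
# The last wc run sees a single file, so its last line is not a total at all.
chunked_last=$(find "$demo_dir" -type f -name '*.m' | xargs -n 2 wc -l | tail -1)
echo "tail -1 after chunking: $chunked_last"

# Feeding all file contents to one wc gives the same answer regardless of batching.
all_lines=$(find "$demo_dir" -type f -name '*.m' -exec cat {} + | wc -l | tr -d ' ')
echo "single wc over all content: $all_lines"

rm -rf "$demo_dir"
```

Which file ends up in the last batch depends on the order in which find lists the directory, so the first message varies between runs; the second count does not.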

P

1

sed is not the proper tool for counting. Use awk instead:

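find . -name '*.m' -exec awk 'END {print NR}' {} +

Using + instead of \; makes find pass as many file names as fit to each awk invocation (as xargs does), instead of running awk once per file. NR keeps counting across all the files in one invocation, so each invocation prints a single combined count.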

Pastor answered 12/9, 2009 at 14:57 Comment(1)

Funny, I actually meant to say awk – Dragonhead 14/9, 2009 at 13:51

C

1

For big directories we should use:

find . -type f -name '*.m' -exec sed -n '$=' '{}' + 2>/dev/null | awk '{ total+=$1 }END{print total}' 

# alternative using awk twice
find . -type f -name '*.m' -exec awk 'END {print NR}' '{}' + 2>/dev/null | awk '{ total+=$1 }END{print total}'

Criminality answered 14/9, 2009 at 19:1 Comment(0)
