Count total number of matches in directory with ag

4

22

I'm attempting to find the number of matches for a given string across a large project. Currently, to do this with ag I am using the following command:

$ echo 0$(ag -c searchterm | sed -e "s/^.*:/+/") | bc

which is obviously a bit lengthy and not very intuitive. Is there any better way to get the total number of matches in a directory from ag? I've dug through the documentation and couldn't find anything helpful there.

Edit: Thanks to a recent commit to ag, the filenames can be stripped with ag instead of sed, so this also works:

$ echo `ag test -c --nofilename | sed "s/$/+/"`0 | bc
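One way to avoid the sed-and-bc step is to let awk do the addition. The following is only a minimal sketch, assuming `ag -c` prints one `path:count` line per file as in the commands above; `sum_counts` is a hypothetical helper name, not something ag provides.

```shell
# Hypothetical helper: sum the per-file counts printed by `ag -c`.
# Assumes each input line ends in ":<count>" (or is just "<count>" when
# --nofilename is used); only the last colon-separated field is added up.
sum_counts() {
  awk -F: '{ total += $NF } END { print total + 0 }'
}

# Usage (requires ag on PATH):
#   ag -c searchterm | sum_counts
```

Taking the last field rather than the second keeps the sum correct even if a path itself contains a colon.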

Note: I realize I could do this with ack -hcl searchterm (Well, almost. In my specific case I'd need an --ignore-dir building in there as well), but as this is already a large project (and will be growing considerably), the speed boost offered by ag makes it preferable (ack takes about 3 seconds for my searches vs ag's nearly instantaneous result), so I would like to stick with it.

Paolapaolina answered 13/8, 2015 at 18:28 Comment(2)
Did you ever find an answer to this question? - Balzer
@Balzer Not exactly. The best option seems to be the --stats option followed by parsing out the correct line (e.g., ag --stats searchterm | tail -n 5 | head -n 1). I also submitted a pull request for a --stats-only option which prevents anything else from being printed; in that case ag --stats searchterm | head -n 1 would get the number of matches. In both cases you'd still need to filter out " matches" to get just the number, though. - Paolapaolina
22

I use ag itself to match the stats. E.g.:

$ ag --stats --java -c 'searchstring' | ag '.*matches'
22 matches
6 files contained matches

Filter with lookahead to print just the number of matches:

$ ag --stats --java -c 'searchstring' | ag -o '^[0-9]+(?=\smatches)'
22
Rabblerouser answered 9/2, 2016 at 16:54 Comment(3)
For those finding this now, the --stats-only option was added in github.com/ggreer/the_silver_searcher/pull/733 - Mcgraw
While better, --stats-only still leaves one needing to filter out the cruft (e.g. "xyz bytes searched") to get just the raw number of matches; for pipe chains and scripts, the raw number is often what one needs. - Rabblerouser
True. However, the original answer fails if the search string contains the word "matches". ag --stats-only 'searchstring' | ag '.*matches$' won't have that issue (note the trailing $ too). - Mcgraw
6

ag -o --nofilename --nobreak 'searchstring' | wc -l

  • -o prints each match individually
  • --nofilename removes filenames from output
  • --nobreak removes newlines between matches in different files
Mabe answered 28/5, 2019 at 22:41 Comment(0)
5

Still no great solution, but here's what I've managed to come up with thus far for anyone else who finds this:

If you're not searching a huge number of files, just use ack -hcl searchterm, otherwise...

I have been able to improve the command in my question by leveraging the --stats option, which appends something like the following to the search results:

714 matches
130 files contained matches
300 files searched
123968435 bytes searched
0.126203 seconds 

For manual use, that's good enough (though it still floods the screen with all the matches), but for scripts I still need just the number. So, to that end, I've gone from the command in my question down to this:

$ ag --stats searchterm | tail -n5 | head -n1 | cut -d" " -f1

or the more succinct but less memorable

$ ag --stats searchterm | tac | awk 'NR==5 {print $1}'

(replace tac with tail -r if you don't have tac)
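Counting lines from the end depends on the exact layout of the stats block. As a rough alternative, the sketch below picks out the line that consists solely of a number followed by the word "matches". It assumes the stats format shown above, and `stats_matches` is a hypothetical helper name rather than anything built into ag.

```shell
# Hypothetical helper: read `ag --stats` output on stdin and print the
# total match count. Assumes the stats block contains a line of the form
# "<number> matches" (exactly two fields), as in the sample output above.
stats_matches() {
  awk 'NF == 2 && $2 == "matches" && $1 ~ /^[0-9]+$/ { count = $1 }
       END { if (count != "") print count }'
}

# Usage (requires ag on PATH):
#   ag --stats searchterm | stats_matches
```

Because the stats are appended after the search results, the helper keeps the last line that fits the pattern, and it prints nothing if no such line is found.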

To save a bit more typing, I aliased the latter half of the command so I can just pipe ag --stats to my alias and get what I want. So, with alias agmatches='tac | awk "NR==5 {print \$1}"' I can get just the matches by running ag --stats searchterm | agmatches.

It would still be much better if there was something built into ag to facilitate this. I submitted a pull request for a --stats-only output option, which is available if you build directly from the repo but isn't yet in a stable release. Since it prevents anything other than the stats from being printed, it should speed up the process a bit for large numbers of results.

Paolapaolina answered 20/8, 2015 at 18:23 Comment(0)
3

I like gregory's answer above, but to add some more context:

ag --stats --java -c 'searchstring' | ag '.*matches'

  • The --java flag indicates that ag will only search for files with .java (and .properties) extensions. So if you were searching within a python project for .py files, you would use the --python flag. Run the ag --list-file-types command for all the file types available for searching.
  • The -c or --count flag provides the number of matches.
Infarction answered 25/5, 2021 at 13:56 Comment(0)
