ack misses results (vs. grep)

I'm sure I'm misunderstanding something about ack's file/directory ignore defaults, but perhaps somebody could shed some light on this for me:

mbuck$ grep logout -R app/views/
Binary file app/views/shared/._header.html.erb.bak.swp matches
Binary file app/views/shared/._header.html.erb.swp matches
app/views/shared/_header.html.erb.bak: <%= link_to logout_text, logout_path, { :title => logout_text, :class => 'login-menuitem' } %>
mbuck$ ack logout app/views/
mbuck$

Whereas...

mbuck$ ack -u logout app/views/
Binary file app/views/shared/._header.html.erb.bak.swp matches
Binary file app/views/shared/._header.html.erb.swp matches
app/views/shared/_header.html.erb.bak
98:<%= link_to logout_text, logout_path, { :title => logout_text, :class => 'login-menuitem' } %>

Simply calling ack without options can't find the result within a .bak file, but calling with the --unrestricted option can find the result. As far as I can tell, though, ack does not ignore .bak files by default.

UPDATE

Thanks to the helpful comments below, here are the new contents of my ~/.ackrc:

--type-add=ruby=.haml,.rake
--type-add=css=.less
Wordbook answered 14/6, 2010 at 16:29 Comment(0)

ack is peculiar in that it doesn't have a blacklist of file types to ignore, but rather a whitelist of file types that it will search in.

To quote from the man page:

With no file selections, ack-grep only searches files of types that it recognizes. If you have a file called foo.wango, and ack-grep doesn't know what a .wango file is, ack-grep won't search it.

(Note that I'm using Ubuntu where the binary is called ack-grep due to a naming conflict)

ack --help-types will show a list of types your ack installation supports.
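
To make the whitelist idea concrete, here is a minimal bash sketch of that kind of file selection. It is an illustration only, not ack's actual implementation: the function name is_searchable, the extension list, and the interpreter list are all made up for this example. A file passes if its extension is on the list, or if its first line is a shebang naming a known interpreter; everything else (foo.wango, or a .bak copy of a template) is skipped.

```shell
# Illustrative whitelist check, NOT ack's real logic.
# The extensions and interpreters below are hypothetical examples.
is_searchable() {
    local file=$1 first_line=''

    # Unreadable or missing files are never searchable.
    [ -r "$file" ] || return 1

    # 1. Known extension: accept immediately.
    case "$file" in
        *.rb|*.erb|*.py|*.pl|*.css|*.js) return 0 ;;
    esac

    # 2. Unknown extension: fall back to the shebang line, if any.
    IFS= read -r first_line < "$file" || true
    case "$first_line" in
        '#!'*perl*|'#!'*ruby*|'#!'*python*) return 0 ;;
    esac

    # 3. Unrecognised type: skipped, like foo.wango.
    return 1
}
```

With this sketch, is_searchable app/views/shared/_header.html.erb succeeds, while the same call on _header.html.erb.bak fails because .bak is not on the list.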

Chemism answered 14/6, 2010 at 16:33 Comment(3)
Great, thanks for the help! For anybody that's interested, the following page will give you a bit more info about adding unrecognized file types (like .haml) to ack: wiki.github.com/protocool/ack-tmbundle/recognizing-files - Wordbook
The filetypes ack recognizes aren't just extensions. It will look at shebang lines as well. If you have a program "mywhatever" that starts "#!/usr/bin/perl", ack will know it's a Perl program. - Claimant
Note that ack 2.0 changes this behavior. - Claimant

If you are ever confused about what files ack will be searching, simply add the -f option. It will list all the files that it finds to be searchable.
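
As a rough sketch of how -f can answer "would ack search this one file?", the helper below lists the searchable files in the file's directory and looks for an exact match. The function name ack_would_search and the ACK_BIN variable are hypothetical (ACK_BIN only exists so the command can be swapped for ack-grep on Ubuntu), and the match assumes ack prints each path with the same directory prefix it was given. No search pattern is passed, only a directory.

```shell
# Hypothetical helper: succeed if ack would search the given file.
# ACK_BIN is an assumption of this sketch, e.g. ACK_BIN=ack-grep.
ack_would_search() {
    local target=$1 dir
    dir=$(dirname -- "$target")
    # -f lists the files ack considers searchable, without searching.
    # Assumes listed paths start with the directory argument as given.
    "${ACK_BIN:-ack}" -f "$dir" | grep -qxF -- "$target"
}
```

For the question's case, ack_would_search app/views/shared/_header.html.erb.bak would be expected to fail with ack 1.x, which matches the missing result.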

Claimant answered 14/6, 2010 at 19:24 Comment(1)
To clarify, you must type ack -f on its own with no other arguments. - Byzantine

ack --man states:

If you want ack to search every file, even ones that it always ignores like coredumps and backup files, use the "-u" switch.

and

Why does ack ignore unknown files by default? ack is designed by a programmer, for programmers, for searching large trees of code. Most codebases have a lot files in them which aren’t source files (like compiled object files, source control metadata, etc), and grep wastes a lot of time searching through all of those as well and returning matches from those files.

That’s why ack’s behavior of not searching things it doesn’t recognize is one of its greatest strengths: the speed you get from only searching the things that you want to be looking at.

EDIT: Also if you look at the source code, bak files are ignored.

Purifoy answered 14/6, 2010 at 16:36 Comment(3)
Interesting, thanks! Didn't realize they had hard-coded in the .bak ignore. - Wordbook
ack is optimized specifically for the common case of "find code in a tree of source code." In that common case, you want to ignore .bak files. It is NOT intended to be a general-purpose search tool, although you can make it that if you jump through hoops. Better to simply use grep if you need a general tool. - Claimant
-u is not available in ack version 2. - Cymry

Instead of wrestling with ack, you could just use plain old grep, from 1973. Because it uses explicitly blacklisted files, instead of whitelisted filetypes, it never omits correct results, ever. Given a couple of lines of config (which I created in my home directory 'dotfiles' repo back in the 1990s), grep actually matches or surpasses many of ack's claimed advantages - in particular, speed: When searching the same set of files, grep is faster than ack.

The grep config that makes me happy looks like this, in my .bashrc:

# Custom 'grep' behaviour
# Search recursively
# Ignore binary files
# Output in pretty colors
# Exclude a bunch of files and directories by name
# (this both prevents false positives, and speeds it up)
function grp {
    grep -rI --color --exclude-dir=node_modules --exclude-dir=\.bzr --exclude-dir=\.git --exclude-dir=\.hg --exclude-dir=\.svn --exclude-dir=build --exclude-dir=dist --exclude-dir=.tox --exclude=tags "$@"
}

function grpy {
    # Quote the glob so the shell passes it to grep unexpanded
    grp --include='*.py' "$@"
}

The exact list of files and directories to ignore will probably differ for you: I'm mostly a Python dev and these settings work for me.

It's also easy to add sub-customisations, as I show for my 'grpy', that I use to grep Python source.

Defining bash functions like this is preferable to setting GREP_OPTIONS, which will cause ALL executions of grep from your login shell to behave differently, including those invoked by programs you have run. Those programs will probably barf on the unexpectedly different behaviour of grep.

My new functions, 'grp' and 'grpy', deliberately don't shadow 'grep', so that I can still use the original behaviour any time I need that.
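
As one more example of a sub-customisation, here is a sketch aimed at the original question: searching Rails view templates together with their .bak copies. The names grp_min and grviews are hypothetical, and grp_min is a cut-down stand-in for 'grp' with only two exclusions so that the block is self-contained. It assumes GNU grep, which provides --include and --exclude-dir.

```shell
# Hypothetical, trimmed-down stand-in for the 'grp' wrapper above:
# recursive, skip binary files, colour on a terminal, few exclusions.
grp_min() {
    grep -rI --color=auto --exclude-dir=.git --exclude-dir=node_modules "$@"
}

# Hypothetical sub-customisation: only ERB templates and their backups.
# The globs are quoted so the shell hands them to grep unexpanded.
grviews() {
    grp_min --include='*.erb' --include='*.erb.bak' "$@"
}
```

Called as grviews logout app/views/, this should report the match in _header.html.erb.bak that plain ack skipped, while the -I flag keeps the binary .swp files out of the output.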

Perimeter answered 4/10, 2011 at 11:8 Comment(11)
Arf. It has just occurred to me that if you rename my two line script above as 'ack', it could form the next commit to the ack project's source. - Perimeter
Your two-line script doesn't handle shebang lines for detecting filetypes, nor does it take advantage of Perl's regular expression engine and the --output flag, nor does it stop at one hit with -1, etc, etc. You might not use these features, but it's not fair to handwave "this grep script is the same as ack", because they're not. - Claimant
Hey Andy. I confess I was exaggerating a trifle for comedy effect, and I apologise if that's inflammatory. But my approach was inspired directly by Ack's own "better than grep" self-promotion, which shamelessly misrepresents and omits salient detail in order to make grep look bad. Two can play at that game. If 'ack' really is better, then it ought to helpfully promote an honest comparison, instead of using misrepresentation to fragment communities by causing people to abandon perfectly good alternatives like grep. - Perimeter
Not as inflammatory as calling ack "a massive waste of time" (daniel.hahler.de/…) Nowhere do I "misrepresent" or "make grep look bad". If I have, point me to it so I can fix it. I want people to use the best tool possible. Plenty of times, including here on SO, I've told people "don't use ack in this case, use grep." I'm all for comparison: betterthangrep.com/more-tools . If you have input re: that page, I welcome it. I don't see this as a game. I just want to wave the flag that there are options besides grep. - Claimant
Alright I give in, I feel bad for the mean things I said. But the reason I was so grumpy was that, last time I read it, the 'betterthangrep' home page used to list something like "10 reasons to use ack instead of grep", which was very misleading because grep also does many of the things on the list. I see that list is now titled "10 reasons to use ack" which is somewhat mollifying, but I still know people who misinterpreted it to mean "things grep doesn't do". Regarding the "more-tools" page, the feedback I'd give is that it could include grep as an alternative, perhaps with a couple of... - Perimeter
...lines suggesting how to configure a wrapper function in .bashrc to set some defaults, such as "-rI --color --exclude-dir=\.git --exclude=tags". And when you make claims like "ack is fast", it is perhaps worth mentioning somewhere that, properly configured to skip the same files, grep is (last time I measured it) actually faster. Thank you for being reasonable even though I was so mean. - Perimeter
Would you have the time/inclination to write up something for betterthangrep.com that I could turn into a page, or at least a section on the "more-tools" page? I like the idea of a list "if you want to stick with grep, here are tweaks you can use." The website repo is at github.com/petdance/betterthangrep and you could fork it, or put it into an issue. Or heck, just mail me at andy-at-petdance.com and I'll take that. I could use what you've got in your comments above, but I figure there's probably more you'd add. - Claimant
Oh alright then, how could I refuse such a gracious invitation? - Perimeter
I think ack is a good idea, but I think it would be better if it used grep to do searching, since that is what grep does best, and arguably faster/more efficiently/more accurately than anything out there. Perhaps you shouldn't focus on the search aspect so much, but figuring out ways to meta-analyze data you get from grep? I'm gathering that ack has perl's extended regular expressions, which seems to be a draw for many? I can't think of anything else that ack can do that grep can't do, admittedly grep will typically take a lot more configuration and longer lines. - Ulphiah
The link to superuser is broken. - Commentator
@AndyLester: I too found ack's hype to be seriously annoying (and misleading, and confusing). It should be more clear up front that ack's main advantage over grep is that the list of files to be searched need not be specified explicitly; I found no time difference between grep login **/*.py and ack --py login (for example) -- in fact, grep was consistently faster. That was very confusing in light of the hype that was on the main page. - Regnant
