This is useful in combination with BFG Repo-Cleaner (an alternative to git-filter-branch, not to be confused with it) and git-filter-repo. It only collects the file paths so that you can feed them into either of those two tools.
A. Relative, unique, sorted paths:
# Get all unique file paths of files matching 'password'
# Source: https://mcmap.net/q/11999/-how-to-grep-git-commits-for-a-certain-word
git rev-list --all | (
    while read -r revision; do
        git grep -F --files-with-matches 'password' "$revision" | sed "s/[^:]*://"
    done
) | sort | uniq
B. Unique, sorted filenames (not paths):
# Get all unique filenames of files matching 'password'
# Source: https://mcmap.net/q/11999/-how-to-grep-git-commits-for-a-certain-word
# -n 1 hands basename one path per call; given two operands, basename treats the second as a suffix to strip
git rev-list --all | (
    while read -r revision; do
        git grep -F --files-with-matches 'password' "$revision" | sed "s/[^:]*://"
    done
) | xargs -n 1 basename | sort | uniq
This second command is useful for BFG, because it only accepts file names, not repo-relative or absolute paths.
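As a rough sketch of the hand-off, the block below builds a throwaway repository, runs method A against it, and saves the result to a file. The demo repository, its contents, and the file name paths-to-purge.txt are assumptions made purely for illustration. The two commented-out commands at the end show how such a list is commonly passed on. Both rewrite history, so check them against each tool's documentation before running anything like them.

```shell
# Throwaway repository so the demo never touches a real project.
repo=$(mktemp -d)
cd "$repo"
git init -q .
mkdir -p config
printf 'password=hunter2\n' > config/secrets.env   # hypothetical leaked secret
printf 'nothing to see here\n' > README.md
git add .
git -c user.name=demo -c user.email=demo@example.com commit -q -m 'add files'

# Method A: unique, sorted, repo-relative paths of files that ever matched.
git rev-list --all | (
    while read -r revision; do
        git grep -F --files-with-matches 'password' "$revision" | sed "s/[^:]*://"
    done
) | sort | uniq > paths-to-purge.txt

cat paths-to-purge.txt

# Possible next steps (left commented out because they rewrite history):
# git filter-repo --invert-paths --paths-from-file paths-to-purge.txt
# java -jar bfg.jar --delete-files secrets.env
```

The output of method B would be the natural input for the BFG line, since BFG matches on file names rather than paths.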
There you go. Enjoy using these Bash snippets for as much agony as they caused me. I hate Bash, so why do I keep using it?
Dissection
Get file names/paths only
Any of the following options mean the same thing (git-grep documentation):
-l
--files-with-matches
--name-only
Instead of showing every matched line, show only the names of files that contain (or do not contain) matches.
Is your pattern: A. Regex vs. B. Fixed String?
As for -F, well, it just means the pattern is interpreted as a fixed string instead of a regex. A source.
Another useful note that belongs here: you can throw in -i or --ignore-case to make the match case-insensitive.
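To make the difference concrete, here is a small sketch that compares the default regex interpretation, -F, and -F combined with -i. The temporary repository and the three file names are made up for the demo, and it assumes grep.patternType has not been changed from Git's default.

```shell
# Throwaway repository with three one-line files.
repo=$(mktemp -d)
cd "$repo"
git init -q .
printf 'a.c\n' > literal.txt     # contains the literal text "a.c"
printf 'abc\n' > lookalike.txt   # matches only if "." acts as a wildcard
printf 'A.C\n' > upper.txt       # matches only when case is ignored
git add .
git -c user.name=demo -c user.email=demo@example.com commit -q -m 'demo files'

# Default: the pattern is a regex, so "." matches any single character.
regex_hits=$(git grep -l 'a.c' HEAD | sed 's/[^:]*://' | sort)

# -F: the pattern is a fixed string, so "." only matches a literal dot.
fixed_hits=$(git grep -l -F 'a.c' HEAD | sed 's/[^:]*://' | sort)

# -F -i: fixed string, but upper and lower case are treated alike.
nocase_hits=$(git grep -l -F -i 'a.c' HEAD | sed 's/[^:]*://' | sort)

printf 'regex:\n%s\n' "$regex_hits"      # literal.txt and lookalike.txt
printf 'fixed:\n%s\n' "$fixed_hits"      # literal.txt
printf 'nocase:\n%s\n' "$nocase_hits"    # literal.txt and upper.txt
```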
Get rid of that stupid leading commit hash
sed "s/[^:]*://"
Source.
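A quick way to see what that sed expression does is to run it on sample lines shaped like git grep output (revision, colon, path). The hash and paths below are invented for the demo.

```shell
# [^:]*: matches everything up to and including the FIRST colon,
# and the substitution replaces it with nothing.
stripped=$(printf '%s\n' '1a2b3c4d:src/config.py' | sed "s/[^:]*://")
printf '%s\n' "$stripped"    # src/config.py

# Without the g flag only the first match is replaced, so a colon
# that is part of the path itself survives.
with_colon=$(printf '%s\n' '1a2b3c4d:docs/notes:draft.txt' | sed "s/[^:]*://")
printf '%s\n' "$with_colon"  # docs/notes:draft.txt
```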
Get them unique paths!
| sort | uniq
Who wants duplicate paths? Not you, not me! Oh hey look, they are sorted too! Enjoy.
Source: me. I have used this for as long as I can remember. (man sort and man uniq)
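The reason sort comes before uniq is that uniq only collapses duplicates that sit on adjacent lines. A minimal sketch with made-up paths:

```shell
paths='b.txt
a.txt
b.txt'

# uniq alone: the two b.txt lines are not adjacent, so nothing is removed.
uniq_only=$(printf '%s\n' "$paths" | uniq)

# sort first, then uniq: duplicates become adjacent and collapse.
sorted_unique=$(printf '%s\n' "$paths" | sort | uniq)

# sort -u does both steps in one command.
sort_u=$(printf '%s\n' "$paths" | sort -u)

printf '%s\n' "$sorted_unique"   # a.txt then b.txt
```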
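What about file names without paths?
xargs -n 1 basename
You would think | basename would work, but no. It does not read its input from standard input; it takes it as command-line arguments. Here's an explanation for that. Go figure! The -n 1 makes xargs pass one path per call, because basename given two operands treats the second as a suffix to strip, and some implementations reject more than two. basename basically returns the file name without its leading path. man basename.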
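Here is a small sketch of that step in isolation, using invented paths. It passes -n 1 to xargs so that every basename call receives exactly one path.

```shell
# Three paths, two of which share the same file name.
names=$(printf '%s\n' 'src/app/config.py' 'docs/config.py' 'notes/todo.txt' \
    | xargs -n 1 basename \
    | sort | uniq)

printf '%s\n' "$names"   # config.py then todo.txt
```

Note that plain xargs splits on whitespace, so paths containing spaces would need extra care, for example a NUL-delimited pipeline.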
For method A, I want absolute paths, not relative ones.
Sure, just slap a realpath at the end. Like so:
) | sort | uniq | xargs realpath
Of course you have to use xargs because realpath does not read its input from standard input. It takes command-line arguments, just like dirname.
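A minimal sketch of the realpath step on its own, assuming a realpath command is installed (it ships with GNU coreutils). The temporary directory and file are made up for the demo.

```shell
# Work in a scratch directory that contains one relative path.
workdir=$(mktemp -d)
cd "$workdir"
mkdir -p src
touch src/config.py

# realpath resolves the relative path against the current directory.
absolute=$(printf '%s\n' 'src/config.py' | xargs realpath)
printf '%s\n' "$absolute"
```

Keep in mind that paths found in old commits may no longer exist in the working tree, and how realpath handles missing files depends on the implementation.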
Inspirations