How to grep Git commits for a certain word?
9

839

In a Git code repository I want to list all commits that contain a certain word. I tried this

git log -p | grep --context=4 "word"

but it does not necessarily give me back the filename (unless it's fewer than five lines away from the word I searched for). I also tried

git grep "word"

but it searches only the files currently checked out, not the history.

How do I search the entire history, so I can follow changes on a particular word? I intend to search my codebase for occurrences of word to track down changes (search in files history).

Disaffirm answered 26/8, 2009 at 20:34 Comment(1)
Possible duplicate of "How to grep (search) committed code in the git history?" - Spendable
1232

If you want to find all commits where the commit message contains a given word, use

$ git log --grep=word
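
As a minimal sketch of what --grep does and does not look at, the script below builds a throwaway repository (the file name, messages, and identity are made up for the demo) in which the word appears in one commit's message and in another commit's content. Only the first is reported, and -i (mentioned in the comments below) makes the match case-insensitive.

```shell
#!/bin/sh
set -eu

# Throwaway repository; all names here are hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "demo@example.com"
git config user.name "Demo"

# Commit 1: the word is in the commit message, not in the file.
echo "alpha" > notes.txt
git add notes.txt
git commit -q -m "Add notes about the Parser"

# Commit 2: the word is in the file content, not in the message.
echo "parser lives here" >> notes.txt
git commit -q -am "Update notes"

# --grep searches commit messages only; -i ignores case.
msg_hits=$(git log --format=%s -i --grep=parser)
echo "$msg_hits"
```

The second commit is not listed even though its diff contains the word; finding that one is what -S and -G are for.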

If you want to find all commits where "word" was added or removed in the file contents (to be more exact: where the number of occurrences of "word" changed), i.e., search the commit contents, use a so-called 'pickaxe' search with

$ git log -Sword

In modern Git there is also

$ git log -Gword

to look for differences whose added or removed line matches "word" (also commit contents).

A few things to note:

  • -G accepts a regex by default, while -S accepts a plain string; -S can be made to accept a regex by adding --pickaxe-regex.
  • -S finds commits where the number of occurrences of "word" changed, while -G finds commits where "word" appears in the diff.
  • This means that -S<regex> --pickaxe-regex and -G<regex> do not do exactly the same thing.

The git diff documentation has a nice explanation of the difference:

To illustrate the difference between -S<regex> --pickaxe-regex and -G<regex>, consider a commit with the following diff in the same file:

+    return frotz(nitfol, two->ptr, 1, 0);
...
-    hit = frotz(nitfol, mf2.ptr, 1, 0);

While git log -G"frotz\(nitfol" will show this commit, git log -S"frotz\(nitfol" --pickaxe-regex will not (because the number of occurrences of that string did not change).
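
The documentation's example can be reproduced in a scratch repository. This is only a sketch: the file name demo.c, the commit messages, and the identity are invented for the demo, and each commit holds just the single line from the diff above.

```shell
#!/bin/sh
set -eu

# Throwaway repository; all names here are hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "demo@example.com"
git config user.name "Demo"

# Commit 1: introduces one call to frotz(nitfol, ...).
printf '    hit = frotz(nitfol, mf2.ptr, 1, 0);\n' > demo.c
git add demo.c
git commit -q -m "first version"

# Commit 2: replaces that line with another call to frotz(nitfol, ...),
# so the number of occurrences stays at one.
printf '    return frotz(nitfol, two->ptr, 1, 0);\n' > demo.c
git commit -q -am "rewrite call"

# -G: commits whose diff has an added or removed line matching the regex.
g_hits=$(git log --format=%s -G'frotz\(nitfol')
# -S --pickaxe-regex: commits where the number of matches changed.
s_hits=$(git log --format=%s -S'frotz\(nitfol' --pickaxe-regex)

echo "Found by -G:"
echo "$g_hits"
echo "Found by -S --pickaxe-regex:"
echo "$s_hits"
```

Here -G reports both commits, while -S reports only "first version", the commit in which the count went from zero to one.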

This will show the commits containing the search terms, but if you want to see the actual changes in those commits instead you can use --patch:

$ git log -G"searchTerm" --patch

This can then be piped to grep to isolate the output just to display commit diff lines with that search term. A common use-case is to display diff lines with that search term in commits since and including a given commit - 3b5ab0f2a1 in this example - like so:

$ git log 3b5ab0f2a1^.. -G"searchTerm" --patch | grep searchTerm
Thalassic answered 27/8, 2009 at 10:41 Comment(22)
-S<string> and -G<string> in man git log are so unclear. I had to read and experiment three times to catch the difference. - Hellenic
@TankorSmash -S<string>: Look for differences that introduce or remove an instance of <string>. -G<string>: Look for differences whose added or removed line matches the given <regex>. - Hellenic
@Hellenic Oh I see, a single string instance, versus an entire line! Thanks - Denial
@m-ric, @TankorSmash: The difference is that -S<string> is faster because it only checks whether the number of occurrences of <string> changed, while -G<string> searches the added and removed lines in every commit diff. - Enticement
@JakubNarębski I don't understand Steve's edit. I kind of like the git log --grep=word you mentioned for the commit message, as opposed to commit content. - Elate
@VonC: Thanks for mentioning it. Reverted, as Steven Penny's edit removed useful information. - Enticement
If you need to search for words with a space in between, use git log --grep="my words". - Slivovitz
@MEM, --grep is different from -S and -G. You can quote the string to each of these arguments. - Arteaga
Note that if you want to find the word in any commit, you need to also run these with "git log -g", because if you've accidentally orphaned a commit, none of the forms here will find it. - Calgary
Can you then make -G accept a fixed string so that you don't have to quote special characters? - Inmesh
@jamo, if it's a special character, it probably has to be quoted. It's very similar to how you pass regular expressions to find or grep. - Heteronym
You might also want to add --patch so that you can actually see the code changes: git log -Sword --patch - Tybi
@JakubNarębski I took the liberty to update the examples, but feel free to revert the edit if you disagree. - Elate
git log --grep=<term> can also be case-insensitive by adding -i or --regexp-ignore-case - Pyelonephritis
I can't work out how to search for a term with spaces. None of these work: git log -S"two words", git log -S "two words", git log -S "two\ words", git log -Stwo\ words - Sabu
Interesting that this answer to a similar question also talks about using git grep, indicating that there seem to be 2 different commands allowing you to do this. I guess using git log --grep... is the preferred way. - Citrus
This answer concentrates only on git log scope. What if one needs to query the whole repository / all branches? - Latticed
@Latticed - git log can search all branches with --branches... - Enticement
Actually what I rather meant was git log --remotes but you directed me fruitfully. Thanks! - Latticed
Git log -G is terrific, but it seems to only return the five most recent commits. What if I want more? - Burlie
@WilliamJockusch Are you sure that you have more than 5 commits (starting from HEAD by default) where <pattern> matches within changes? Because I have just checked that it can return more than 5 commits. Note that by default git log uses a pager, and shows the first page of results then stops for user input. - Enticement
Turns out I just needed the spacebar. - Burlie
268

git log's pickaxe will find commits with changes including "word" with git log -Sword

Laky answered 26/8, 2009 at 20:46 Comment(5)
This is not entirely precise. -S<string> Look for differences that introduce or remove an instance of <string>. Note that this is different than the string simply appearing in diff output. - Kehr
While this is generally the right answer, I downvoted only to encourage others to read this answer (https://mcmap.net/q/11999/-how-to-grep-git-commits-for-a-certain-word) which has 3 different ways and explains their subtleties. - Wally
gosh! I don't think that's a good reason to downvote a right answer... you weren't confident including the link in a comment would be sufficient encouragement? - Fidelity
@jakeonrails, That answer should have been an edit to this (older) one, so we don't have these annoying duplicates. But people only want the reputation, instead of a clean answers page. - Recusant
Examples of blaming the people instead of the system. Stack Overflow should have more varied and nuanced ways to: divert attention, reward improvement, qualify and quantify, exalt the essence, clarify and drill down. And to digress without detracting, wink wink wince. - Harelda
49

After a lot of experimentation, I can recommend the following, which shows commits in which the number of matches of a given regexp changes (that is, commits that add or remove an occurrence), and displays the text changes in each, with colours showing words added and removed.

git log --pickaxe-regex -p --color-words -S "<regexp to search for>"

Takes a while to run though... ;-)

Directive answered 4/9, 2016 at 16:12 Comment(4)
This is one of the best so far, thanks. Hint: to just list all results without paging, either prepend the command with GIT_PAGER=cat or append | cat to it. - Entrails
Specifying a path or file would be much faster: git log --pickaxe-regex -p --color-words -S "<regexp to search for>" <file or filepath> - Balderas
Can this be modified to display only the lines matching the pattern, instead of the entire diff? (I found the answer here: https://mcmap.net/q/12309/-how-to-show-the-change-itself-using-git-log-pickaxe) - Stomodaeum
You can add a limit to the output to prevent it from spinning out of control: git log -n 1000 --pickaxe-regex -p --color-words -S "<regexp to search for>" - Lynnell
12

One more way/syntax to do it is: git log -S "word"
Quoted like this, the search string can also contain spaces and special characters, for example git log -S "with whitespaces and stuff @/#ü !"

Fertilization answered 11/2, 2016 at 23:3 Comment(0)
12

You can try the following command:

git log --patch --color=always | less -R +/searching_string

or using grep in the following way:

git rev-list --all | GIT_PAGER=cat xargs git grep 'search_string'

Run this command in the parent directory where you would like to search.
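
To see what the rev-list variant adds over a plain git grep, here is a small sketch in a scratch repository (settings.ini, the commit messages, and the identity are invented for the demo). The string is removed in the second commit, so the working tree no longer contains it, but grepping every revision still finds it.

```shell
#!/bin/sh
set -eu

# Throwaway repository; all names here are hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "demo@example.com"
git config user.name "Demo"

echo "token = search_string" > settings.ini
git add settings.ini
git commit -q -m "add settings"

echo "token = redacted" > settings.ini
git commit -q -am "scrub settings"

# Plain git grep looks at the current files only: no match any more.
# (git grep exits non-zero on no match, hence the "|| true".)
current=$(git grep -l 'search_string' || true)

# Passing every revision to git grep searches the history as well.
# -l prints one "<revision>:<path>" line per matching file.
historic=$(git rev-list --all | xargs git grep -l 'search_string')

echo "current tree: ${current:-<no match>}"
echo "history:      $historic"
```

Each line of the second result starts with the hash of a commit whose tree contains the string, followed by the path.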

Brood answered 26/9, 2016 at 11:5 Comment(2)
I like this method because the commits I'm looking at have hundreds of lines of unrelated changes, and I am only interested in the actual patches that involve the word I'm searching for. To get color use git log --patch --color=always | less +/searching_string. - Corpsman
To find something in the garbage commits use: git fsck | grep -Po '(?<=commit ).*' | GIT_PAGER=cat xargs git grep 'search_string' - Atomicity
4

To use a Boolean connector on a regular expression:

git log --grep '[0-9]*\|[a-z]*'

This selects commits whose message contains a match for either [0-9]* or [a-z]*.
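
A hedged variation on the same idea: because * also allows zero repetitions, a pattern like the one above can match the empty string and so may select every commit. The sketch below uses -E (extended regular expressions) with + instead, in a scratch repository whose commit messages are made up for the demo. It also shows that repeating --grep gives an "or" without any regex alternation.

```shell
#!/bin/sh
set -eu

# Throwaway repository; all names here are hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "demo@example.com"
git config user.name "Demo"

# Three empty commits are enough, since only the messages matter here.
git commit -q --allow-empty -m "Fix issue 42"
git commit -q --allow-empty -m "Tidy docs"
git commit -q --allow-empty -m "Refactor"

# -E: extended regex, so a bare | means "or" and + means "one or more".
alt_hits=$(git log --format=%s -E --grep='[0-9]+|docs')

# Several --grep options are combined with "or" by default.
multi_hits=$(git log --format=%s --grep='[0-9]' --grep='docs')

echo "$alt_hits"
```

Both forms list "Tidy docs" and "Fix issue 42" and skip "Refactor".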

Glace answered 1/4, 2015 at 12:14 Comment(0)
4

This is useful in combination with BFG (Git filter branch - not to be confused with git-filter-branch) and git-filter-repo. It just gets the file paths so that you can feed them into one of the two tools I just mentioned.

A. Relative, unique, sorted, paths:

# Get all unique filepaths of files matching 'password'
# Source: https://mcmap.net/q/11999/-how-to-grep-git-commits-for-a-certain-word
git rev-list --all | (
    while read -r revision; do
        git grep -F --files-with-matches 'password' "$revision" | sed "s/[^:]*://"
    done
) | sort | uniq

B. Unique, sorted, filenames (not paths):

# Get all unique filenames matching 'password'
# Source: https://mcmap.net/q/11999/-how-to-grep-git-commits-for-a-certain-word
git rev-list --all | (
    while read -r revision; do
        git grep -F --files-with-matches 'password' "$revision" | sed "s/[^:]*://"
    done
) | xargs -n 1 basename | sort | uniq

This second command is useful for BFG, because it only accepts file names and not repo-relative/system-absolute paths.

There you go. Enjoy using these Bash snippets for as much agony as they caused to me. I hate Bash, so why do I keep using it?

Dissection

Get file names/paths only

Any of the following options mean the same (git-grep documentation):

  • -l
  • --files-with-matches
  • --name-only

Instead of showing every matched line, show only the names of files that contain matches.

Is your pattern: A. Regex v.s. B. Fixed String?

As for -F, well, it just means use a fixed string instead of a regex for pattern interpretation. A source.

Another useful note that belongs here: You can throw in -i or --ignore-case to be case insensitive.

Get rid of that stupid leading commit hash

sed "s/[^:]*://"

Source.
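
A quick way to convince yourself what that sed expression does, using made-up input lines in the "<revision>:<path>" shape that git grep prints when it is given a revision:

```shell
#!/bin/sh
set -eu

# Hypothetical sample lines; the hashes and paths are invented.
line='1a2b3c4d:src/config/secrets.yml'
odd='1a2b3c4d:dir/a:b.txt'

# Without the g flag, sed replaces only the first match, so just the
# leading "<revision>:" is removed and later colons are left alone.
path=$(printf '%s\n' "$line" | sed "s/[^:]*://")
odd_path=$(printf '%s\n' "$odd" | sed "s/[^:]*://")

echo "$path"
echo "$odd_path"
```

The first line comes out as src/config/secrets.yml and the second as dir/a:b.txt.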

Get them unique paths!

| sort | uniq

Who wants duplicate paths? Not you, not me! Oh hey look, they are sorted too! Enjoy.

Source: me. I have used this for as long as I can remember. (man sort and man uniq)

What about file names without paths?

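xargs -n 1 basename

You would think | basename would work, but no. It does not read standard input; it takes its input as command-line arguments. Here's an explanation for that. Go figure! The -n 1 makes xargs run basename once per path, because a single basename call that is handed several paths may treat the second one as a suffix to strip, or reject the extras. basename basically returns the filename without its leading path. man basename.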
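
A small sketch of that step on its own, with invented paths standing in for the output of the loop above:

```shell
#!/bin/sh
set -eu

# Hypothetical repo-relative paths, one per line.
paths='src/app/config.yml
docs/config.yml
src/app/main.py'

# xargs turns the lines on standard input into arguments;
# -n 1 runs basename once per path. sort -u then drops duplicates.
names=$(printf '%s\n' "$paths" | xargs -n 1 basename | sort -u)

echo "$names"
```

The two config.yml entries collapse into one, leaving config.yml and main.py. Paths that contain spaces would need extra care with xargs, which this sketch does not attempt.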

For method A., I want absolute paths not relative.

Sure, just slap a realpath at the end. Like so:

) | sort | uniq | xargs realpath

Of course you have to use xargs because realpath does not use standard input for input. It uses command-line arguments. Just like dirname.

Inspirations

Immaterialism answered 25/10, 2021 at 21:16 Comment(1)
Thanks for the edits @Peter Mortensen! My answer is looking even crispier now, with these typos and naked URLs fixed. Your edit descriptions are on-point too as they help me avoid repeating those corrected issues. - Immaterialism
1

vim-fugitive is versatile for that kind of examining in Vim.

Use :Ggrep to do that. For more information you can install vim-fugitive and look up the tutorial with :help Grep. This episode, exploring-the-history-of-a-git-repository, will also guide you through all of that.

Allbee answered 9/7, 2015 at 6:49 Comment(0)
0

If you want to search for sensitive data in order to remove it from your Git history (which is the reason why I landed here), there are tools for that. GitHub has a dedicated help page for that issue.

Here is the gist of the article:

The BFG Repo-Cleaner is a faster, simpler alternative to git filter-branch for removing unwanted data. For example, to remove your file with sensitive data (and leave your latest commit untouched), run:

bfg --delete-files YOUR-FILE-WITH-SENSITIVE-DATA

To replace all text listed in passwords.txt wherever it can be found in your repository's history, run:

bfg --replace-text passwords.txt

See the BFG Repo-Cleaner's documentation for full usage and download instructions.

Herbal answered 27/5, 2018 at 20:33 Comment(1)
You might wanna add this answer to stackoverflow.com/questions/872565/… instead of here - Sturgill