find the first or last commit a patch applies to

Assume a patch was created from a specific commit in the past and no longer applies to HEAD.

How can I find either the first or, better, the last commit in the history of HEAD where this patch applies with "git apply"? Maybe something with git bisect? But which command will tell me whether a patch applies?

Ideally I want to move back to that commit, apply the patch, then rebase on or merge with the original HEAD, and diff again to create a new patch, unless there was a conflict. After this I would like to go back to the original HEAD, so I can continue with more patches.

Background: There are a number of patches that need to be rerolled (and yes, there are ecosystems where patches are still a thing).

Era answered 28/5, 2014 at 5:19 Comment(16)
Were the patches created with git diff or git format-patch? - Optative
If git bisect is going to be too difficult, maybe I could use the patch file's creation timestamp to find the commit to start with. But ideally I want to know how it works with bisect :) - Era
Found this: #9367039 Now we only need to wrap that further so it can be automated to reroll 100 patches. - Era
Well darn, I was just about to finish working on a Bash script too :P - Optative
Go ahead! The link is nice, but it could have some more handholding for people who use git bisect for the first time :) - Era
So wait, there's a problem: git bisect requires that you pass it one "good revision", which in this case is a revision where the patch applies without conflicts. I suspect git bisect won't always be a good tool for this because of that. - Optative
I'm exhausted, I'm calling it a night. The answer you found was way better than any Bash script I would have made anyways. - Optative
Ah crap. And now that I think about it, it does make sense. - Era
So maybe a loop that searches backwards is better. This can be expensive. But maybe, out of a 100-commit history, there are only 5 commits where the patch applies, e.g. commits 50-54. The only way to find these would be a backwards search from commit 100. - Era
Anyway, thanks for your help. Maybe I am going to try something that uses the commit file timestamp. If I get something useful I will post it. - Era
And btw, here is where I am coming from: drupal.org/node/2247991#comment-8819487 I have not seen xjm's script yet, but wanted to see how I would do it. - Era
You know you can also use git log -S <search-string> or git log -G <regex> to find the first addition/deletion of a line matching <search-string> or <regex> too, right? It might help you find possible candidates for "the first commit" where a patch might have been generated from. - Optative
Yes, but I'd say this is more a heuristic than something reliable enough for automation. - Era
Are you one of the core maintainers? If you are, have you considered using git format-patch to generate patches? The patches themselves will record which commit they come from. - Optative
I am an occasional contributor, but not a maintainer. And I like format-patch, but some reviewers don't. - Era
Okay, well I'm off to bed. Maybe someone else will come along and have a better idea. Good luck! - Optative

This answer assumes that the patches were created with git diff, and not git format-patch, and that your default pager for your git log is less.

Here is an example of a patch created from git diff <sha1> <sha2>,

diff --git a/osx/.bash_profile b/osx/.bash_profile
index c7b41df..fb80367 100644
--- a/osx/.bash_profile
+++ b/osx/.bash_profile
@@ -3,6 +3,10 @@
 # Setup PATH for Homebrew packages
 export PATH=/usr/local/bin:$PATH

+# Setup Scala variables
+export SCALA_HOME=/usr/local/Frameworks/scala # Symlinked directory
+export PATH=$PATH:$SCALA_HOME/bin
+
 # Initialize rbenv,
 # https://github.com/sstephenson/rbenv#homebrew-on-mac-os-x
 eval "$(rbenv init -)"

Take this line:

+export SCALA_HOME=/usr/local/Frameworks/scala # Symlinked directory

and search for it in git log --patch or git log -p. Type / when in less, then enter the regex you want to search for:

/\+export SCALA_HOME=/usr/local/Frameworks/scala # Symlinked directory

The + is escaped with \ here, because it's a special character in regexes. Hit enter to find the first match, and n to bring up the next match, or N to go to the previous match.

This will help you find commits that might be possible candidates for where the patch came from. You can also use spacebar in less to page down, and b to page up.
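
For a non-interactive variant of the same search, git log -S (the pickaxe option mentioned in the comments above) lists commits that change how often a string occurs. A minimal sketch, assuming you search for a context or removed line from the patch, since lines the patch adds will not be in the history yet; commits_changing_string is a hypothetical helper name:

```shell
# Hypothetical helper: list commits that add or remove occurrences of a string.
# Usage: commits_changing_string <string> [<path>...]
commits_changing_string() {
    needle=$1
    shift
    # -S matches commits where the number of occurrences of the string changes
    git log -S"$needle" --format='%h %s' -- "$@"
}

# Example with a context line from the patch above:
# commits_changing_string 'export PATH=/usr/local/bin:$PATH' osx/.bash_profile
```

The result is still only a list of candidates; it does not prove the patch applies to any of them.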

Optative answered 28/5, 2014 at 6:7 Comment(6)
Hm, this can be useful, but it is far from automatically finding the correct commit so that I could bulk reroll patches. - Era
Also, I wonder why a line from the patch should be anywhere in the log. E.g. what if the patch adds "vanilla ice", but the history of HEAD has no such thing? - Era
@Era working on a better solution with git bisect, give me about 10-15 minutes. Also, have you seen what git log -p does? It adds diff information to the logs. - Optative
git apply --check looks promising - Era
So, git apply --check prints stuff ("error ...") if it fails, but prints nothing if the patch applies cleanly. Now this needs to be combined with git bisect. - Era
Yes, git log -p is fun, and I did not know it before. Thanks! - Era
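
Putting the last few comments together: if a revision where the patch still applies is already known (for example, one picked by the patch file's timestamp), git bisect run can drive git apply --check, since it treats exit status 0 as good and 1 as bad. A rough sketch, assuming the patch applies to every commit from the known-good one up to some point and to none after it, and that the working tree is clean; first_breaking_commit is a made-up name and the patch path should be absolute:

```shell
# Hypothetical helper: print the first commit at which the patch stops applying.
# Usage: first_breaking_commit /abs/path/to.patch <known-good-rev> [<bad-rev>]
first_breaking_commit() {
    patch_file=$1
    good_rev=$2
    bad_rev=${3:-HEAD}
    git bisect start "$bad_rev" "$good_rev" >/dev/null || return 2
    # bisect checks out each candidate and runs the command against it
    if git bisect run git apply --check "$patch_file" >/dev/null 2>&1; then
        git rev-parse refs/bisect/bad
        status=0
    else
        status=1
    fi
    # Return to the commit that was checked out before bisecting
    git bisect reset >/dev/null 2>&1
    return "$status"
}
```

The last commit the patch applies to is then the parent of the printed commit, at least in a linear history.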

The way git wants you to do this

git apply --3way should locate the base versions of each file using blob hashes and merge forward all in one step, assuming they exist somewhere in your repo history and you can deal with the merge conflicts. That's possibly an easier solution for many people.
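
As a sketch of how that could be used to reroll a patch against the current HEAD: apply with --3way, save the result as a new diff, then put the tree back. This assumes a clean working tree (the function refuses to run otherwise, because it ends with git reset --hard); reroll_with_3way is a hypothetical name:

```shell
# Hypothetical helper: reroll an old patch against HEAD using a 3-way merge.
# Usage: reroll_with_3way <old.patch> <new.patch>
reroll_with_3way() {
    patch_file=$1
    new_patch=$2
    # Refuse to run with local changes, since we reset --hard at the end
    git diff --quiet HEAD || { echo "working tree is not clean" >&2; return 2; }
    if git apply --3way "$patch_file"; then
        # The merged result is in the index and working tree; save it as a diff
        git diff HEAD > "$new_patch"
        git reset -q --hard HEAD
        return 0
    fi
    # Conflicts (or missing base blobs): leave the state for manual inspection
    echo "could not apply $patch_file cleanly; see git status" >&2
    return 1
}
```

When conflicts are reported, resolve them by hand, run git diff HEAD to produce the new patch, and reset afterwards.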

The way to do what you asked for

If you still really want to know which historical commit contains the base files a diff came from, the script below extends one of the known solutions for locating commits that contain a single blob hash, so that it looks for commits containing the whole group of blob hashes pulled from a patch file.

#!/bin/sh
# git-find-patch-base takes a patch produced by "git diff" and tries to locate commit(s)
# containing all source blobs

# The first parameter is the name of the patch file to examine
patch_file="$1"
# Any remaining parameters are passed as a group to the git log command using $@ below
shift

# Make a temporary file and capture a list of all the starting
# file blob hashes that the patch uses (the part of each "index" line
# before the ".."). Note: Adding a file shows an all-zero starting
# hash, so we filter that one out...
tmp_blob_file=$(mktemp)
echo "Examining patch file \"$patch_file\"..." 1>&2
sed -n 's/^index \([0-9a-f]\{4,\}\)\.\..*/\1/p' "$patch_file" | sort -u | grep -v '^0*$' > "$tmp_blob_file"

# Count how many unique blob hashes we identified
blobcount=$(cat "$tmp_blob_file" | wc -l)
echo "Found $blobcount unique blob hashes in patch..." 1>&2
# With no hashes to look for, every commit would trivially match, so stop here
if test $blobcount -eq 0 ; then
    rm "$tmp_blob_file"
    exit 1
fi

# Use git log to get a list of commits to check against. Then, for
# each of those commits, count how many of the blob hashes that we
# wanted appear in it, and output the commit hash if it's at least the
# ideal blob count. Note: this is an imperfect searching method, since
# there is a chance for hash collision, exacerbated since the grep is not
# forcing the short hashes to only match the beginning of the long
# hashes.
echo "Searching log/tree history of git..." 1>&2
git log "$@" --pretty=format:'%T %h %s' \
| while read tree commit subject ; do
    if test $(git ls-tree -r "$tree" | grep -f "$tmp_blob_file" | wc -l) -ge "$blobcount" ; then
        echo "$commit" "$subject"
        break
    fi
done

# Clean up the temporary file we made...
rm "$tmp_blob_file"

The first parameter is the name of the patch file to analyze, and any remaining parameters are passed to git log to help expand or restrict the list of commits to check against. If you want the first commit relative to a specific branch, you can run git-find-patch-base foo.patch branchname. If you're completely lost as to where something is from, you can run git-find-patch-base foo.patch --all and go get some coffee while it does its thing. There are a lot of useful limiters on git log, like --grep or --author, that can speed up this process.

The script as shown stops on the first match with the break out of the while loop. You can remove that and it will exhaustively search all the way back spitting out all candidate commits.

Maquis answered 21/6, 2018 at 20:10 Comment(1)
It just occurred to me that the imperfections of the search method could be eliminated by doing git apply --check against the commit, but that would also require a git checkout in the script to do so... - Maquis
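
One way around the checkout, as a sketch: load each candidate commit's tree into a scratch index with git read-tree and run git apply --cached --check against that index, which leaves the working tree alone. find_last_applicable is a hypothetical name, and the function assumes it is run from the top level of the repository:

```shell
# Hypothetical helper: walk back from a revision and print the most recent
# commit whose tree the patch applies to. The working tree is never touched.
# Usage: find_last_applicable <patch-file> [<start-rev>]
find_last_applicable() {
    patch_file=$1
    start_rev=${2:-HEAD}
    tmp_dir=$(mktemp -d) || return 2
    tmp_index="$tmp_dir/index"
    status=1
    for commit in $(git rev-list "$start_rev"); do
        # Load this commit's tree into the scratch index
        GIT_INDEX_FILE="$tmp_index" git read-tree "$commit" || break
        # --cached checks the patch against the index only
        if GIT_INDEX_FILE="$tmp_index" git apply --cached --check "$patch_file" 2>/dev/null; then
            echo "$commit"
            status=0
            break
        fi
    done
    rm -rf "$tmp_dir"
    return "$status"
}
```

This is the backwards search from the question's comments, so it costs one read-tree and one apply check per commit until a match is found.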
