Git: discover which commits ever touched a range of lines

I'm having trouble figuring out how to use git blame to get the set of commits that ever touched a given range of lines. There are similar questions like this one, but the accepted answer doesn't get me much further.

Let's say I have a definition that starts on line 1000 of foo.rb. It's only 5 lines long, but the number of commits that ever changed those lines is enormous. If I do

git blame foo.rb -L 1000,+5

I get references to (at most) five distinct commits that changed these lines, but I'm also interested in the commits "behind them".

Similarly,

git rev-list HEAD -- foo.rb | xargs git log --oneline

is almost what I want, but I can't specify line ranges to git rev-list.

Can I pass a flag to git blame to get the list of commits that ever touched those five lines, or what's the quickest way to build a script that extracts such information? Let's ignore for the moment the possibility that the definition once had more or less than 5 lines.

Wulfe answered 3/1, 2013 at 16:7 Comment(4)
Are you sure this is what you want? Identifying changes with line numbers only works for a given state of the file. If you want lines 15 - 20 for commit 12345 the code on those lines might be on lines 55 - 60 for commit 12345^.Ersatz
Pretty sure. This is why I need to write a script that identifies that as well. Still, assume for simplicity's sake that the definition has never moved in the file since the initial commit in the repo.Wulfe
possible duplicate of Retrieve the commit log for a specific line in a file?Fulgurite
I believe this is a duplicate of stackoverflow.com/questions/8435343/…Fulgurite
L
76

Since Git 1.8.4, git log has -L to view the evolution of a range of lines.

For example, suppose you look at git blame's output:

((aa27064...))[mlm@macbook:~/w/mlm/git]
$ git blame -L150,+11 -- git-web--browse.sh
a180055a git-web--browse.sh (Giuseppe Bilotta 2010-12-03 17:47:36 +0100 150)            die "The browser $browser is not
a180055a git-web--browse.sh (Giuseppe Bilotta 2010-12-03 17:47:36 +0100 151)    fi
5d6491c7 git-browse-help.sh (Christian Couder 2007-12-02 06:07:55 +0100 152) fi
5d6491c7 git-browse-help.sh (Christian Couder 2007-12-02 06:07:55 +0100 153) 
5d6491c7 git-browse-help.sh (Christian Couder 2007-12-02 06:07:55 +0100 154) case "$browser" in
81f42f11 git-web--browse.sh (Giuseppe Bilotta 2010-12-03 17:47:38 +0100 155) firefox|iceweasel|seamonkey|iceape)
5d6491c7 git-browse-help.sh (Christian Couder 2007-12-02 06:07:55 +0100 156)    # Check version because firefox < 2.0 do
5d6491c7 git-browse-help.sh (Christian Couder 2007-12-02 06:07:55 +0100 157)    vers=$(expr "$($browser_path -version)" 
5d6491c7 git-browse-help.sh (Christian Couder 2007-12-02 06:07:55 +0100 158)    NEWTAB='-new-tab'
5d6491c7 git-browse-help.sh (Christian Couder 2007-12-02 06:07:55 +0100 159)    test "$vers" -lt 2 && NEWTAB=''
a0685a4f git-web--browse.sh (Dmitry Potapov   2008-02-09 23:22:22 -0800 160)    "$browser_path" $NEWTAB "$@" &

And you want to know the history of what is now line 155.

Then:

((aa27064...))[mlm@macbook:~/w/mlm/git]
$ git log --topo-order --graph -u -L 155,155:git-web--browse.sh
* commit 81f42f11496b9117273939c98d270af273c8a463
| Author: Giuseppe Bilotta <[email protected]>
| Date:   Fri Dec 3 17:47:38 2010 +0100
| 
|     web--browse: support opera, seamonkey and elinks
|     
|     The list of supported browsers is also updated in the documentation.
|     
|     Signed-off-by: Giuseppe Bilotta <[email protected]>
|     Signed-off-by: Junio C Hamano <[email protected]>
| 
| diff --git a/git-web--browse.sh b/git-web--browse.sh
| --- a/git-web--browse.sh
| +++ b/git-web--browse.sh
| @@ -143,1 +143,1 @@
| -firefox|iceweasel)
| +firefox|iceweasel|seamonkey|iceape)
|  
* commit a180055a47c6793eaaba6289f623cff32644215b
| Author: Giuseppe Bilotta <[email protected]>
| Date:   Fri Dec 3 17:47:36 2010 +0100
| 
|     web--browse: coding style
|     
|     Retab and deindent choices in case statements.
|     
|     Signed-off-by: Giuseppe Bilotta <[email protected]>
|     Signed-off-by: Junio C Hamano <[email protected]>
| 
| diff --git a/git-web--browse.sh b/git-web--browse.sh
| --- a/git-web--browse.sh
| +++ b/git-web--browse.sh
| @@ -142,1 +142,1 @@
| -    firefox|iceweasel)
| +firefox|iceweasel)
|  
* commit 5884f1fe96b33d9666a78e660042b1e3e5f9f4d9
  Author: Christian Couder <[email protected]>
  Date:   Sat Feb 2 07:32:53 2008 +0100

      Rename 'git-help--browse.sh' to 'git-web--browse.sh'.

      Signed-off-by: Christian Couder <[email protected]>
      Signed-off-by: Junio C Hamano <[email protected]>

  diff --git a/git-web--browse.sh b/git-web--browse.sh
  --- /dev/null
  +++ b/git-web--browse.sh
  @@ -0,0 +127,1 @@
  +    firefox|iceweasel)
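
If only the list of commits is wanted (as in the question), the patch text can be filtered out of the git log -L output. The following is a minimal sketch in a throwaway repository; the file contents, the commit messages, and the COMMIT marker in the format string are made up for illustration and are not part of the original answer:

```shell
# Build a small throwaway repository (hypothetical content, for illustration only).
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid

printf 'one\ntwo\nthree\n' > foo.rb
git add foo.rb && git commit -q -m 'create'
printf 'one\nTWO\nthree\n' > foo.rb && git commit -q -am 'change line 2'
printf 'ONE\nTWO\nthree\n' > foo.rb && git commit -q -am 'change line 1'

# Print one marker line per commit that touched line 2, then drop the diff text.
touched=$(git --no-pager log --format='COMMIT %h %s' -L 2,2:foo.rb | grep '^COMMIT ')
printf '%s\n' "$touched"
```

This should list 'change line 2' and 'create' but not 'change line 1'. Filtering with grep is a deliberately conservative choice: newer Git versions may also honor -s (--no-patch) together with -L, but I would not rely on that for older installations.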

If you use this functionality frequently, you might find a git alias useful. To do that, put in your ~/.gitconfig:

[alias]
    # Follow evolution of certain lines in a file
    # arg1=file, arg2=first line, arg3=last line or blank for just the first line
    follow = "!sh -c 'git log --topo-order -u -L $2,${3:-$2}:\"$1\"'" -

And now you can just do git follow git-web--browse.sh 155.
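
The trailing - in the alias is a placeholder for $0. With sh -c, the first word after the command string becomes $0, and Git appends the alias arguments after it, so they land in $1, $2, and so on. A small sketch that shows this without involving Git (the echo command is only for illustration):

```shell
# With `sh -c`, the first word after the command string becomes $0;
# the remaining words become $1, $2, ...
# ${3:-$2} falls back to $2 when no third argument is given.
out=$(sh -c 'echo "0=$0 1=$1 2=$2 3=${3:-$2}"' - git-web--browse.sh 155)
echo "$out"
# prints: 0=- 1=git-web--browse.sh 2=155 3=155
```

Without the placeholder, the file name would be consumed as $0 and every other argument would shift down by one.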

Logger answered 3/11, 2013 at 20:3 Comment(1)
What does the last - sign in your alias mean?Tamarisk
B
24

I think this is what you want:

git rev-list HEAD -- foo.rb | ( 
    while read rev; do
        git blame -l -L 1000,+5 $rev -- foo.rb | cut -d ' ' -f 1
    done;
) | awk '{ if (!h[$0]) { print $0; h[$0]=1 } }'

That'll output the revision hash of each commit that has an edit to the lines you've chosen.

Here are the steps:

  1. The first part git rev-list HEAD -- foo.rb outputs all revisions in which the chosen file is edited.

  2. Each of those revisions then goes into the second part, which takes each one and puts it into git blame -l -L 1000,+5 $rev -- foo.rb | cut -d ' ' -f 1. This is a two-part command.

    1. git blame -l -L 1000,+5 $rev -- foo.rb outputs the blame for the chosen lines. By feeding it the revision number, we are telling it to start from that commit and go from there, rather than starting at the head.
    2. Since blame outputs a bunch of info we don't need, cut -d ' ' -f 1 gives us the first column (the revision number) of the blame output.
  3. awk '{ if (!h[$0]) { print $0; h[$0]=1 } }' removes duplicate lines, including non-adjacent ones, while keeping the order in which they first appeared. See http://jeetworks.org/node/94 for more info about this command.
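
The deduplication step can be tried on its own. This sketch feeds it a few made-up labels (c1, c2, c3 stand in for commit hashes) and also shows a shorter awk idiom that behaves the same way:

```shell
# Keep the first occurrence of each line, preserving order (no sorting needed).
input='c2
c3
c2
c1
c3'
# The form used in the answer: remember each line in the array h once printed.
unique=$(printf '%s\n' "$input" | awk '{ if (!h[$0]) { print $0; h[$0]=1 } }')
# Equivalent shorthand: the pattern is true only the first time a line is seen.
short=$(printf '%s\n' "$input" | awk '!h[$0]++')
printf '%s\n' "$unique"
```

Unlike sort -u, this keeps the lines in the order the blame loop produced them, which is newest first here.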

You could add a last step here to get prettier output. Pipe everything into xargs -L 1 git log --oneline -1 and get the corresponding commit message for the list of revisions. I had a weird issue using this last step where I had to keep pressing next every few revisions that were output. I'm not sure why that was, which is why I didn't include it in my solution.
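
The repeated prompting described above may be Git starting its pager for every git log call; that is an assumption on my part, not something confirmed in this answer. A sketch of the whole pipeline with the pager disabled, run in a throwaway repository whose file contents and commit messages are made up for illustration:

```shell
# Throwaway repository; file name, contents and messages are hypothetical.
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid

printf 'one\ntwo\nthree\n' > foo.rb
git add foo.rb && git commit -q -m 'create'
printf 'one\nTWO\nthree\n' > foo.rb && git commit -q -am 'change line 2'
printf 'one\nTWO\nTHREE\n' > foo.rb && git commit -q -am 'change line 3'
printf 'ONE\nTWO\nTHREE\n' > foo.rb && git commit -q -am 'change line 1'

# Same loop as above, here for lines 2-3, plus one log line per hash.
# --root keeps the root commit from being printed with a leading "^".
result=$(
    git rev-list HEAD -- foo.rb |
    while read -r rev; do
        git blame --root -l -L 2,+2 "$rev" -- foo.rb | cut -d ' ' -f 1
    done |
    awk '!seen[$0]++' |
    xargs -n 1 git --no-pager log --oneline -1
)
printf '%s\n' "$result"
```

In this toy history the result should contain 'change line 2', 'change line 3' and 'create', and leave out 'change line 1', which only edited a line outside the range. Note that blame fails for a revision in which the file is shorter than the requested range, so real histories may need extra error handling.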

Buckling answered 13/1, 2013 at 4:12 Comment(2)
Congratulations! Very nice and concise, the next steps would be to calculate the updated line range automatically. But this is a great start. Would you be interested in solving the next puzzle? Shall I open another bounty? :-)Wulfe
That's a very interesting question! I'm, unfortunately, very busy at work this week, so I won't have a chance to play with this. It'll be on my mind, though, and I'll come back to it next week if someone else hasn't solved it by then.Buckling
C
12

Not sure what you want to do, but maybe git log -S can do the trick for you:

-S<string>
    Look for differences that introduce or remove an instance of <string>. 
    Note that this is different than the string simply appearing
    in diff output; see the pickaxe entry in gitdiffcore(7) for more
    details.

You can put in string the change (or part of the change) you are trying to follow and this will list the commits that ever touched this change.
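
A small sketch of what -S does and does not report, using a throwaway repository with made-up contents. The pickaxe only lists commits in which the number of occurrences of the string changes, so an edit that keeps the string on the line is skipped:

```shell
# Throwaway repository; names and messages are hypothetical.
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid

printf 'timeout = 1\n' > foo.rb
git add foo.rb && git commit -q -m 'introduce timeout'
printf 'timeout = 2\n' > foo.rb && git commit -q -am 'tweak value'
printf 'delay = 2\n' > foo.rb && git commit -q -am 'rename to delay'

# Commits that add or remove an occurrence of "timeout" in foo.rb.
hits=$(git --no-pager log --format=%s -S'timeout' -- foo.rb)
printf '%s\n' "$hits"
```

Here 'tweak value' is not reported even though it edits the same line, which is why -S follows a piece of text rather than a line range.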

Cassilda answered 7/1, 2013 at 14:37 Comment(2)
Sorry, that's not what I'm after at all, but +1 for trying anyway.Wulfe
+1 because it helps googlers who arrive with an issue similar to this question's title.Smithson
I
1

I liked this puzzle, it's got its subtleties. Source this file, say init foo.rb 1000,1005 and follow the instructions. When you're done, file @changes will have the correct list of commits in topological order and @blames will have the actual blame output from each.

This is dramatically more complex than the accepted solution above. It produces output that will sometimes be more useful, and hard to reproduce, and it was fun to code.

The problem with tracking line-number ranges automatically while stepping backward through history is that when a change hunk crosses a range boundary, you can't automatically determine where in that hunk the new boundary should be. You either have to include a big range for big additions and so accumulate (sometimes lots of) irrelevant changes, or drop into manual mode to be sure it's right (which of course gets you right back here), or accept extreme lossage at times.

If you want your output to be exact, use the answer above with trustworthy regex ranges like '/^type function(/,/^}/', or use this, which isn't actually that bad: a couple of seconds per step back in time.
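
To illustrate why regex ranges hold up better than line numbers, here is a sketch in a throwaway repository (the file contents and commit messages are invented for the example). The definition moves down by one line, yet the same pair of patterns still selects it:

```shell
# Throwaway repository; contents and messages are hypothetical.
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid

printf 'x = 1\ndef foo\n  1\nend\n' > foo.rb
git add foo.rb && git commit -q -m 'add foo'
printf '# header\nx = 1\ndef foo\n  1\nend\n' > foo.rb
git commit -q -am 'add header'

# foo moved from lines 2-4 to lines 3-5, but the patterns still find it.
blamed=$(git blame --root -s -L '/^def foo/,/^end/' -- foo.rb)
printf '%s\n' "$blamed"

# The same range syntax works with git log -L; keep only the marker lines.
log_hits=$(git --no-pager log --format='COMMIT %s' -L '/^def foo/,/^end/:foo.rb' | grep '^COMMIT ')
printf '%s\n' "$log_hits"
```

The blame covers exactly the three lines of the definition, and the log should report only 'add foo', since 'add header' shifted the definition without changing it.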

In exchange for the extra complexity, it does produce the hit list in topological sequence, and it does at least (fairly successfully) try to ameliorate the pain at each step. It never runs a redundant blame, for instance, and update-ranges makes adjusting line numbers easier. And of course there's the reliability of having had to individually eyeball the hunks... :-P

To run this on full auto, say { init foo.rb /^class foo/,/^end/; auto; } 2>&-

 ### functions here create random @-prefix files in the current directory ###
#
# git blame history for a range, finding every change to that range
# throughout the available history.  It's somewhat, ahh, "intended for
# customization", is that enough of a warning?  It works as advertised
# but drops @-prefix temporary files in your current directory and
# defines new commands
#
# Source this file in a subshell, it defines functions for your use.
# If you have @-prefix files you care about, change all @ in this file
# to something you don't have and source it again.
#
#    init path/to/file [<start>,<end>]  # range optional
#    update-ranges           # check range boundaries for the next step
#    cycle [<start>,<end>]   # range unchanged if not supplied
#    prettyblames            # pretty colors,
#       blue="child commit doesn't have this line"
#       green="parent commit doesn't have this line"
#           brown=both
#    shhh # silence the pre-cycle blurb
#
# For regex ranges, you can _usually_ source this file and say `init
# path/to/file /startpattern/,/endpattern/` and then cycle until it says 0
# commits remain in the checklist
#
# for line-number ranges, or regex ranges you think might be unworthy, you
# need to check and possibly update the range before each cycle.  File
# @next is the next blame start-point revision text; and command
# update-ranges will bring up vim with the current range V-selected.  If
# that looks good, `@M` is set up to quit even while selecting, so `@M` and
# cycle.  If it doesn't look good, 'o' and the arrow keys will make getting
# good line numbers easy, or you can find better regex's.  Either way, `@M`
# out and say `cycle <start>,<end>` to update the ranges.

init () { 
    file=$1;
    range="$2"
    rm -f @changes
    git rev-list --topo-order HEAD -- "$file" \
    | tee @checklist \
    | cat -n | sort -k2 > @sequence
    git blame "-ln${range:+L$range}" -- "$file" > @latest || echo >@checklist
    check-cycle
    cp @latest @blames
}

update-latest-checklist() {
    # update $latest with the latest sha that actually touched our range,
    # and delete that and everything later than that from the checklist.
    latest=$(
        sed s,^^,, @latest \
        | sort -uk1,1 \
        | join -1 2 -o1.1,1.2 @sequence - \
        | sort -unk1,1 \
        | sed 1q \
        | cut -d" " -f2
    )
    sed -i 1,/^$latest/d @checklist
}
shhh () { shhh=1; }

check-cycle () {
    update-latest-checklist
    sed -n q1 @checklist || git log $latest~..$latest --format=%H\ %s | tee -a @changes
    next=`sed 1q @checklist`
    git cat-file -p `git rev-parse $next:"$file"` > @next
    test -z "$shh$shhh$shhhh" && {
        echo "A blame from the (next-)most recent alteration (id `git rev-parse --short $latest`) to '$file'"
        echo is in file @latest, save its contents where you like
        echo 
        echo you will need to look in file @next to determine the correct next range,
        echo and say '`cycle its-start-line,its-end-line`' to continue
        echo the "update-ranges" function starts you out with the range selected
    } >&2
    ncommits=`wc -l @checklist | cut -d\  -f1`
    echo  $ncommits commits remain in the checklist >&2
    return $((ncommits==0))
}

update-ranges () {
    start="${range%,*}"
    end="${range#*,}"
    case "$start" in
    */*)    startcmd="1G$start"$'\n' ;;
    *)      startcmd="${start}G" ;;
    esac
    case "$end" in
    */*)    endcmd="$end"$'\n' ;;
    [0-9]*) endcmd="${end}G" ;;
    +[0-9]*) endcmd="${end}j" ;;
    *) endcmd="echohl Search|echo "can\'t" get to '${end}'\"|echohl None" ;;
    esac
    vim -c 'set buftype=nofile|let @m=":|q'$'\n"' -c "norm!${startcmd}V${endcmd}z.o" @next
}

cycle () {
    sed -n q1 @checklist && { echo "No more commits to check"; return 1; }
    range="${1:-$range}"
    git blame "-ln${range:+L$range}" $next -- "$file" >@latest || echo >@checklist
    echo >>@blames
    cat @latest >>@blames
    check-cycle
}

auto () {
    while cycle; do true; done
}

prettyblames () {
cat >@pretty <<-\EOD
BEGIN {
    RS=""
    colors[0]="\033[0;30m"
    colors[1]="\033[0;34m"
    colors[2]="\033[0;32m"
    colors[3]="\033[0;33m"
    getline commits < "@changes"
    split(commits,commit,/\n/)
}
NR!=1 { print "" }
{
    thiscommit=gensub(/ .*/,"",1,commit[NR])
    printf "%s\n","\033[0;31m"commit[NR]"\033[0m"
    split($0,line,/\n/)
    for ( n=1; n<=length(line); ++n ) {
        color=0
        split(line[n],key,/[1-9][0-9]*)/)
        if ( NR!=1 && !seen[key[1]] ) color+=1
        seen[key[1]]=1;
        linecommit = gensub(/ .*/,"",1,line[n])
        if (linecommit==thiscommit) color+=2
        printf "%s%s\033[0m\n",colors[color],line[n]
    }
}
EOD
awk -f @pretty @blames | less -R
}
Incoherent answered 13/1, 2013 at 8:58 Comment(3)
I think this is it, but I have to test, since you provided no example. Hope you can get the bounty, but it's ending pretty soon and there's an answer voted 3 (although it doesn't answer the challenge at all!)Wulfe
Sorry, I just checked it: it's not really automatic and it depends on vim. I'll go with a simpler answer above, which doesn't take varying line numbers into consideration but is much simpler and works perfectly well for the problem statement.Wulfe
@JoaoTavora On a second look the manual checklist update step above (and all the complexity) is useless, the initial checklist is already correct. The answer I get after correcting for that looks a lot like his, except for allowing for tracking the drift. It turns out you can do a fairly useful job automating that tracking, but the right answer there is to just use regexp boundaries -- line-number-based tracking comes unmoored when additions cross the range boundary, because only regexps have any hope of automatically finding the new boundary within the added lines.Incoherent
T
1

Please refer to the answer posted here: List all commits for a specific file. It's exactly what you need.

Trantham answered 13/8, 2015 at 10:0 Comment(1)
Granted, this answers my question, though I don't think it was available when I posted the question. Still, would it be flexible enough to track a moving language-specific construct across history? Moving in the sense that the starting and ending lines of the range are not static.Wulfe
K
0

A few thoughts..

This sounds similar to this post, and it looks like you might get close with something like this:

git blame -L '/variable_name *= */',+1 -- foo.rb

As long as you know the definition to match against (for the regex).

There is a thread discussion here, about using tig and git gui (which apparently might handle this). I haven't tried this myself yet, so can't verify it (I'll give this a try later).

Kamkama answered 12/1, 2013 at 5:33 Comment(0)