git log history simplification
Let's say I have the following history

        D---E-------F
       /     \       \
      B---C---G---H---I---J
     /                     \
    A-------K---------------L--M

git log --ancestry-path D..M will give me

            E-------F
             \       \
              G---H---I---J
                           \
                            L--M

However, I would like just the following

            E
             \       
              G---H---I---J
                           \
                            L--M

Or

            E-------F
                     \
                      I---J
                           \
                            L--M

Essentially, I would like to traverse down only one path, not two.

Is this possible? And if so, what is the command?

Edit:

I've tried using --first-parent, but this isn't exactly it. git log --first-parent G..M gives me

                    F
                     \
                  H---I---J
                           \
                            L--M

It includes F, because F is the first parent of I. Instead I'd like

                  H---I---J
                           \
                            L--M

Any help would be appreciated

Solution (that worked for me):

As @VonC stated, there isn't a single one-liner that does this. So I ended up using a bash script.

  1. For each commit in 'git log --ancestry-path G..M'
  2. Determine if $commit's parents include the commit we were previously on
  3. If yes, keep it and do something interesting with it.
  4. If no, skip that commit.

For example, git log --first-parent G..M is

H - F - I - J - L - M

However, F's parent is E, not H. So we omit F, giving me

H - I - J - L - M

Yay!
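
The four steps above can be sketched roughly as follows. This is not the script referred to in the question, only a minimal illustration: the function name linear_path is hypothetical, and the sketch assumes that --topo-order together with --reverse lists every parent before its children, so that a greedy walk from the start commit always reaches the end commit.

```shell
#!/bin/bash
# linear_path START END (hypothetical helper, not part of git)
# Prints one linear chain of commits from START (exclusive) to END (inclusive),
# oldest first, by keeping only commits that have the previously kept commit
# as one of their parents.
linear_path() (
  prev=$(git rev-parse --verify --quiet "$1^{commit}") || exit 1
  end=$(git rev-parse --verify --quiet "$2^{commit}") || exit 1
  # --parents prints "<commit> <parent1> <parent2> ..." on each line.
  git rev-list --ancestry-path --topo-order --reverse --parents "$prev..$end" |
  while read -r commit parents; do
    case " $parents " in
      *" $prev "*)          # the last kept commit is a parent: keep this one
        echo "$commit"
        prev=$commit
        ;;
    esac                    # otherwise skip the commit
  done
)
```

Calling linear_path G M in the example history would then print the hashes of H, I, J, L and M, which can be passed on to git show or git log --no-walk.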

Madwort answered 18/7, 2011 at 22:19 Comment(0)

I don't think this is directly possible (unless you know in advance the exact list to include/exclude, which negates the purpose of walking the DAG)

Actually, the OP Ken Hirakawa managed to get the expected linear history by:

git log --pretty=format:"%h%n" --ancestry-path --reverse $prev_commit..$end_commit

And for each commit, making sure it is a direct child of the previous commit.

Here is the script written by Ken Hirakawa.


To test this, I recreated the DAG mentioned in the History Simplification section of the git log man page, for --ancestry-path.

You will find at the end the bash script I used to create a similar history (call it with the name of the root dir, and your username).

I define:

$ git config --global alias.lgg "log --graph --pretty=format:'%Cred%h%Creset -%C(yellow)%d%Creset %s %C(bold blue)<%an>%Creset' --abbrev-commit --date=relative"

I get:

$ git lgg
* d7c4459 - (HEAD, M, fromA) M <VonC>
*   82b011d - (L) Merge commit 'J' into fromA <VonC>
|\
| * 190265b - (J, master) J <VonC>
| *   ef8e325 - (I) Merge commit 'F' <VonC>
| |\
| | * 4b6d976 - (F, fromB) F <VonC>
| * | 45a5d4d - (H) H <VonC>
| * |   834b239 - (G) Merge commit 'E' <VonC>
| |\ \
| | |/
| | * f8e9272 - (E) E <VonC>
| | * 96b5538 - (D) D <VonC>
| * | 49eff7f - (C) C <VonC>
| |/
| * 02c3ef4 - (B) B <VonC>
* | c0d9e1e - (K) K <VonC>
|/
* 6530d79 - (A) A <VonC>

From there, I cannot exclude one of the parents of commit I.

The ancestry-path does return:

$ git lgg --ancestry-path D..M
* d7c4459 - (HEAD, M, fromA) M <VonC>
* 82b011d - (L) Merge commit 'J' into fromA <VonC>
* 190265b - (J, master) J <VonC>
*   ef8e325 - (I) Merge commit 'F' <VonC>
|\
| * 4b6d976 - (F, fromB) F <VonC>
* | 45a5d4d - (H) H <VonC>
* | 834b239 - (G) Merge commit 'E' <VonC>
|/
* f8e9272 - (E) E <VonC>

which is consistent with the log man page:

A regular D..M computes the set of commits that are ancestors of M, but excludes the ones that are ancestors of D.
This is useful to see what happened to the history leading to M since D, in the sense that "what does M have that did not exist in D".
The result in this example would be all the commits, except A and B (and D itself, of course).

When we want to find out what commits in M are contaminated with the bug introduced by D and need fixing, however, we might want to view only the subset of D..M that are actually descendants of D, i.e. excluding C and K.
This is exactly what the --ancestry-path option does.
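
To make the man page's wording concrete, here is a rough sketch that rebuilds the same set by hand: take the plain D..M range and keep only the commits that have D as an ancestor. The function name ancestry_path is hypothetical; in practice git rev-list --ancestry-path does this directly and far more efficiently.

```shell
#!/bin/bash
# ancestry_path START END (hypothetical helper, for illustration only)
# Prints the commits of START..END that are descendants of START,
# i.e. what "git rev-list --ancestry-path START..END" is expected to select.
ancestry_path() (
  start=$1
  end=$2
  git rev-list "$start..$end" |
  while read -r commit; do
    # Exit status 0 means $start is an ancestor of $commit.
    if git merge-base --is-ancestor "$start" "$commit"; then
      echo "$commit"
    fi
  done
)
```

With the history above, ancestry_path D M would drop C and K, because neither of them descends from D.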


#!/bin/bash

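# Create a commit containing one file named after the letter, then tag it.
# Skipped when the tag already exists, so the script can be re-run.
function makeCommit() {
  local letter=$1
  if [[ -z "$(git tag -l "$letter")" ]] ; then
    echo "$letter" > "$root/$letter"
    git add .
    git commit -m "${letter}"
    git tag -m "${letter}" "$letter"
  else
    echo "commit $letter already there"
  fi
}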

# Merge $from into the current branch and tag the resulting merge commit.
function makeMerge() {
  local letter=$1
  local from=$2
  if [[ -z "$(git tag -l "$letter")" ]] ; then
    git merge "$from"
    git tag -m "${letter}" "$letter"
  else
    echo "merge $letter already done"
  fi
}

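# Create the branch from $from and switch to it, or just switch if it exists.
# show-ref checks for the exact branch name (grep would also match substrings).
function makeBranch() {
  local branch=$1
  local from=$2
  if ! git show-ref --verify --quiet "refs/heads/$branch" ; then
    git checkout -b "$branch" "$from"
  else
    echo "branch $branch already created"
    git checkout "$branch"
  fi
}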

root=$1
user=$2
if [[ ! -e "$root/.git" ]] ; then
  git init "$root"
fi
export GIT_WORK_TREE="./$root"
export GIT_DIR="./$root/.git"
git config --local user.name "$user"

makeCommit "A"
makeCommit "B"
makeCommit "C"
makeBranch "fromB" "B"
makeCommit "D"
makeCommit "E"
makeCommit "F"
git checkout master
makeMerge "G" "E"
makeCommit "H"
makeMerge "I" "F"
makeCommit "J"
makeBranch "fromA" "A"
makeCommit "K"
makeMerge "L" "J"
makeCommit "M"
Drapery answered 19/7, 2011 at 8:24 Comment(6)
I don't think he's going to like the answer, but there is a fundamental point that needs to be understood: a linear log doesn't really work with git because it's not a linear development. This is unfortunate for people who want to know all the changes that went into a branch, for example when you attempt to produce a ChangeLog file. Producing ChangeLog files from git with lots of merging really doesn't work well at all, because a ChangeLog file is a linear history, and the development wasn't. - Solidary
@Wes: yes, Git is a content manager, and if the content (from commits accessible between two commits) is the result of a merge, I would be surprised if a log, walking back the DAG, could ignore part of the history which contributed to said content. - Drapery
I totally agree with both of you, and I guess not having a one-liner for this isn't something to be surprised about. I'm going to accept VonC's answer because it answers my question, but I did edit the question to include the solution I came up with. Please take a look. - Madwort
@Ken: interesting. Do you happen to have this script published? (as a gist for instance: gist.github.com) - Drapery
I'm not the best at bash scripting, but here it is: gist.github.com/1096180 - Madwort
@Ken: excellent. I have included this link in my answer for more visibility. - Drapery

I have to admit I didn't understand your solution - it didn't work for my example - but if I understood your use-case correctly (given a pair of commits, you want an arbitrary linear path between them, with no splits), I have the same problem, and the following solution seems to work:

  • Run the log with --ancestry-path, making sure to take note of the children of each commit
  • Iterate through the results, keeping track of the "last child accepted", and updating it every time a commit references an accepted child (or there is no accepted child yet - initial case)
  • Actually print the resulting "accepted" entries in some useful way

A resulting script looks like:

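#!/bin/bash
# Usage: single-ancestry-path.sh RANGE_EXPRESSION [git show options...]
output_set=""; child_to_match=""; # init
# Each rev-list line is "<commit> <child> <child>..."; the offsets below
# assume 40-character SHA-1 hashes.
while read -r; do
  # Accept a commit if nothing has been accepted yet, or if it lists the
  # last accepted commit among its children.
  if { [ -n "$REPLY" ]; } && { [[ "${REPLY:41}" =~ "$child_to_match" ]] || [ -z "$child_to_match" ]; }; then
    child_to_match=${REPLY:0:40}
    output_set="$output_set $child_to_match"
  fi
done <<<  "$(git rev-list --ancestry-path --children "$1")"
if [[ -n $output_set ]]; then
  # $output_set is left unquoted on purpose: each hash must be its own argument.
  git show -s $output_set "${@:2}"
fi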

It can be called like single-ancestry-path.sh RANGE_EXPRESSION DECORATION_ARGS, supporting generally the same decoration arguments as git log (it is in fact git show, invoked once with the whole list of accepted commits). Taking the famous lg2 example from https://mcmap.net/q/12492/-pretty-git-branch-graphs, the call might look like this:

single-ancestry-path.sh master..MyBranch --abbrev-commit --decorate --format=format:'%C(bold blue)%h%C(reset) - %C(bold cyan)%aD%C(reset) %C(bold green)(%ar)%C(reset)%C(bold yellow)%d%C(reset)%n''          %C(white)%s%C(reset) %C(dim white)- %an%C(reset)'

It's been 9 years, so I would have hoped there would be an easier answer, but I can't find one.

Exigent answered 4/12, 2020 at 19:14 Comment(0)

I too dislike the problems that result from merging and have dispensed with having it in my mainstream history. Whenever there is a large merge onto a main branch I will recommit it with identical contents but as a single commit.

    D---E--------F                           Co-Developer
   /               
  B---C---G'---H---I'--J                     Team Leader
 /                       
A-------K----------------L'--M               Main Stream

Here, G', I' and L' would be points where I have re-committed merge results. The branch descriptions simply describe a scenario where I can visualize the problem tree occurring. So the contents of G and G' (similarly I and I') would be the same, the team leader having merged in the work-to-date of the developer. And L' would be the same as L, the feature integrated onto the mainstream.
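
One way to get a commit like G' is git merge --squash, which stages the merged contents without recording the second parent. This is only a sketch of that approach and an assumption about how such a recommit could be done; the function name recommit_merge is hypothetical.

```shell
#!/bin/bash
# recommit_merge BRANCH MESSAGE (hypothetical helper)
# Brings the contents of BRANCH into the current branch as one ordinary
# single-parent commit instead of a merge commit.
recommit_merge() (
  branch=$1
  message=$2
  # --squash stages the merge result but creates no commit and no merge parent.
  git merge --squash "$branch" >/dev/null || exit 1
  git commit -q -m "$message"
)
```

Because git does not record the squashed branch as a parent, later merges from the same branch see its commits as unmerged again, which can resurface conflicts that were already resolved.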

I totally understand that avoiding a problem is not the same as solving it, and sympathize with those facing the problem now.

Arborescent answered 16/4, 2019 at 14:50 Comment(0)
