Finding a branch point with Git?

I have a repository with branches master and A and lots of merge activity between the two. How can I find the commit in my repository when branch A was created based on master?

My repository basically looks like this:

-- X -- A -- B -- C -- D -- F  (master) 
          \     /   \     /
           \   /     \   /
             G -- H -- I -- J  (branch A)

I'm looking for revision A, which is not what git merge-base (--all) finds.

Fronnia answered 6/10, 2009 at 18:22 Comment(2)
See also Find the parent branch of a branch and Branch length: where does a branch start in Git?.Peirsen
With Git 2.36 (Q2 2022): git rev-parse $(git rev-list --exclude-first-parent-only ^main branch_A| tail -1)^. See my answer below.Teran
F
629

I was looking for the same thing, and I found this question. Thank you for asking it!

However, I found that the answers I see here don't seem to quite give the answer you asked for (or that I was looking for) -- they seem to give the G commit, instead of the A commit.

So, I've created the following tree (letters assigned in chronological order), so I could test things out:

A - B - D - F - G   <- "master" branch (at G)
     \   \     /
      C - E --'     <- "topic" branch (still at E)

This looks a little different than yours, because I wanted to make sure that I got (referring to this graph, not yours) B, but not A (and not D or E). Here are the letters attached to SHA prefixes and commit messages (my repo can be cloned from here, if that's interesting to anyone):

G: a9546a2 merge from topic back to master
F: e7c863d commit on master after master was merged to topic
E: 648ca35 merging master onto topic
D: 37ad159 post-branch commit on master
C: 132ee2a first commit on topic branch
B: 6aafd7f second commit on master before branching
A: 4112403 initial commit on master

So, the goal: find B. Here are three ways that I found, after a bit of tinkering:


1. visually, with gitk:

You should visually see a tree like this (as viewed from master):

[image: gitk screen capture from master]

or here (as viewed from topic):

[image: gitk screen capture from topic]

in both cases, I've selected the commit that is B in my graph. Once you click on it, its full SHA is presented in a text input field just below the graph.


2. visually, but from the terminal:

git log --graph --oneline --all

(Edit/side-note: adding --decorate can also be interesting; it adds an indication of branch names, tags, etc. Not adding this to the command-line above since the output below doesn't reflect its use.)

which shows (assuming git config --global color.ui auto):

[image: output of git log --graph --oneline --all]

Or, in straight text:

*   a9546a2 merge from topic back to master
|\  
| *   648ca35 merging master onto topic
| |\  
| * | 132ee2a first commit on topic branch
* | | e7c863d commit on master after master was merged to topic
| |/  
|/|   
* | 37ad159 post-branch commit on master
|/  
* 6aafd7f second commit on master before branching
* 4112403 initial commit on master

in either case, we see the 6aafd7f commit as the lowest common point, i.e. B in my graph, or A in yours.


3. With shell magic:

You don't specify in your question whether you wanted something like the above, or a single command that'll just get you the one revision, and nothing else. Well, here's the latter:

diff -u <(git rev-list --first-parent topic) \
             <(git rev-list --first-parent master) | \
     sed -ne 's/^ //p' | head -1
6aafd7ff98017c816033df18395c5c1e7829960d

Which you can also put into your ~/.gitconfig as (note: trailing dash is important; thanks Brian for bringing attention to that):

[alias]
    oldest-ancestor = !zsh -c 'diff -u <(git rev-list --first-parent "${1:-master}") <(git rev-list --first-parent "${2:-HEAD}") | sed -ne \"s/^ //p\" | head -1' -

Which could be done via the following (convoluted with quoting) command-line:

git config --global alias.oldest-ancestor '!zsh -c '\''diff -u <(git rev-list --first-parent "${1:-master}") <(git rev-list --first-parent "${2:-HEAD}") | sed -ne "s/^ //p" | head -1'\'' -'

Note: zsh could just as easily have been bash, but sh will not work -- the <() syntax doesn't exist in vanilla sh. (Thank you again, @conny, for making me aware of it in a comment on another answer on this page!)

Note: Alternate version of the above:

Thanks to liori for pointing out that the above could fall down when comparing identical branches, and coming up with an alternate diff form which removes the sed form from the mix, and makes this "safer" (i.e. it returns a result (namely, the most recent commit) even when you compare master to master):

As a .git-config line:

[alias]
    oldest-ancestor = !zsh -c 'diff --old-line-format='' --new-line-format='' <(git rev-list --first-parent "${1:-master}") <(git rev-list --first-parent "${2:-HEAD}") | head -1' -

From the shell:

git config --global alias.oldest-ancestor '!zsh -c '\''diff --old-line-format='' --new-line-format='' <(git rev-list --first-parent "${1:-master}") <(git rev-list --first-parent "${2:-HEAD}") | head -1'\'' -'

So, in my test tree (which was unavailable for a while, sorry; it's back), that now works on both master and topic (giving commits G and B, respectively). Thanks again, liori, for the alternate form.


So, that's what I [and liori] came up with. It seems to work for me. It also allows an additional couple of aliases that might prove handy:

git config --global alias.branchdiff '!sh -c "git diff `git oldest-ancestor`.."'
git config --global alias.branchlog '!sh -c "git log `git oldest-ancestor`.."'

Happy git-ing!
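
For shells without process substitution, the same first-parent comparison can be sketched with a temporary file and grep. This is a swapped-in variant for illustration, not the alias above: the function name oldest_ancestor and the temp-file approach are assumptions. It prints the newest commit that lies on the first-parent chains of both refs.

```shell
# Hypothetical helper: newest commit shared by the first-parent chains of
# $1 (default: master) and $2 (default: HEAD). Plain sh, no <() needed.
oldest_ancestor() (
    tmp=$(mktemp) || exit 1
    # Save the first-parent chain of the second ref, one full hash per line.
    git rev-list --first-parent "${2:-HEAD}" -- > "$tmp" || { rm -f "$tmp"; exit 1; }
    # Walk the first ref's chain newest-first; keep the first hash that
    # also appears in the saved chain.
    git rev-list --first-parent "${1:-master}" -- | grep -Fx -f "$tmp" | head -n 1
    rm -f "$tmp"
)
```

Called as `oldest_ancestor master topic`, it should behave like the alias on the demo tree, and comparing a branch with itself simply returns that branch's tip.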

Fiat answered 14/2, 2011 at 11:35 Comment(34)
Thanks lindes, the shell option is great for situations where you want to find the branch point of a long running maintenance branch. When you are looking for a revision that might be a thousand commits in the past, the visual options really aren't going to cut it. *8')Codify
@MarkBooth: Heh, indeed. Glad to help!Fiat
Your third method depends on the diff context showing the first unchanged line. This won't happen in certain edge cases, or if you have slightly different requirements (e.g. I need only one of the histories to be --first-parent, and I am using this method in a script that might sometimes use the same branches on both sides). I found it safer to use diff's if-then-else mode and erase changed/deleted lines from its output instead of counting on having big enough context: diff --old-line-format='' --new-line-format='' <(git rev-list …) <(git rev-list …)|head -1.Donovan
Note the trailing dash at the end of the oldest-ancestor alias. Without it, the positional parameters will be wrong.Habituate
Thanks, @BrianWhite, for pointing that out. Which also took me over the activity threshold to finally take in liori's comment... [mentioning in a separate comment, because only one at-mention is allowed per comment. sigh.]Fiat
Much belated thanks, @liori, for pointing that out! I've added your version, and very thankful for it. That is indeed much better (presuming GNU diff is available, anyway). Thank you for helping to make a well-received answer even better.Fiat
Just curious, isn't it better to use git merge-base?Breathless
Well, in the stated question, it says that what's being looked for is something that's different than what git merge-base finds (the latter finds a recent ancestor, and the questioner is looking for the oldest ancestor). If there's a way to use git merge-base to get at what the questioner is looking for, please add your comments (or a new answer), describing that!Fiat
git log -n1 --format=format:%H $(git log --reverse --format=format:%H master..topic | head -1)~ will work as well, I thinkMcgrody
Does the 3rd solution work for anyone with fast-forward commits? I am trying to use this, but I have some fast-forward commits coming in from the branch in question to master. Thus, oldest-ancestor returns the revision of the last forwarded commit rather than the branch point. Is this just me, or is the solution unable to handle this scenario?Fart
@SalmanA.Kagzi: Are all of the merge commits fast-forwarded? If so, then you essentially don't have any branches recorded in the tree... and then no, this wouldn't work. If it's only some of them, I'd have to dig in to see what happens; in that case, maybe you can say more about what you know of how things happened?Fiat
I think using --first-parent can give incorrect results depending on your merge procedures. Ours are not so clean, and --first-parent does NOT give the right answer, by any stretch, because the "correct" path through master doesn't always follow the first parent in our repo. Is there any advantage to use --first-parent other than efficiency (reduced output)?Kollwitz
@JakubNarębski: Do you have a way for git merge-base --fork-point ... to give commit B (commit 6aafd7f) for this tree? I was busy with other stuff when you posted that, and it sounded good, but I finally just tried it, and I'm not getting it to work... I either get something more recent, or just a silent failure (no error message, but exit status 1), trying arguments like git co master; git merge-base --fork-point topic, git co topic; git merge-base --fork-point master, git merge-base --fork-point topic master (for either checkout), etc. Is there something I'm doing wrong or missing?Fiat
diff -U1 <(git rev-list --first-parent topic) <(git rev-list --first-parent master) | tail -1 . By forcing diff context to have only one line, you don't need to grep/sed through it.Grounder
@JakubNarębski @Fiat --fork-point is based on the reflog, so it will only work if you made the changes locally. Even then, the reflog entries could have expired. It's useful but not reliable at all.Aerometer
diff ... | sed ... | head -1 and also diff ... | tail -1 are 2 to 100 times slower than using comm like this: comm --nocheck-order -1 -2 <(git rev-list --first-parent topic) <(git rev-list --first-parent master) | head -1Pryer
Any alternatives when zsh is not available?Divergence
@konyak: bash should work the same, for I think anything in here. If I've missed something, please let me know what fails and how, and I can look into that.Fiat
In my case, where there are merges back from master into topic, the oldest-ancestor command only gives the most recent merge from master into topic, not where the branch first started. Is there any way around this?Harvester
@Andrei-Neculau comm assumes the inputs are lexically sorted. It has optimizations that cause it to sometimes give wrong results when they are not. The hashes are not in lexicographical order, so comm won't work here. I ran into this in practice.Corelli
@Corelli comm --nocheck-order assumes the inputs are lexically sorted?!Pryer
To avoid the "ambiguous argument" error, add -- after the branch names. So in the .gitconfig : oldest-ancestor = !bash -c 'diff -u <(git rev-list --first-parent \"${1:-master}\" -- ) <(git rev-list --first-parent \"${2:-HEAD}\" -- ) | sed -ne \"s/^ //p\" | head -1' -Kreutzer
@AndreiNeculau man comm: "compare two sorted files line by line". Nocheck-order does precisely what it says, it makes comm not check the order. It still assumes they are ordered to function. At least the version I used ( GNU coreutils 8.25, I think) did.Corelli
Thanks, this is very educating, but... every time I try to get a simple answer for a simple git question, I'm drowned in smart options - none of which work for me. I have a huge, old, complicated repo. Tried your suggestions, I get commits much later than my branch -and don't make any sense. I do not understand --- doesn't git have any notion of the "branching event" ? WHEN was a branch created? Knowing the time it was created, I could look at "master" and see "B" for myself. With over 1000 branches - no graphic description can help me.Stumper
@MottiShneor: No, it doesn't have any notion of that. Commits merely have parents in the "DAG" (see e.g. https://mcmap.net/q/12359/-dag-vs-tree-using-git )... For example, try git cat-file -p HEAD just after a merge. You should see two "parent" commits listed... that (and similar sorts of data) is really all git knows, after the fact... the branch names are just labels pointing to commits, that get moved along when you commit on them. For any (not necessarily you) who are confused by all of this, I recommend reading: sbf5.com/~cduan/technical/gitFiat
And haven't anyone been smart enough to suggest adding a lightweight "original commit" addition to those branch "Labels"? why wasn't this done on the first design? As a user of many source-control systems over the last 35 years, I know how people think of branches, and vast majority of them think of them as "events" where something splits into different ways. What is the "design intelligence" in NOT knowing when and how this happened?Stumper
@MottiShneor: I think for the question of why things were done this way in the first place, you're going to need to ask Linus Torvalds (original author of git; or perhaps he's already written about it somewhere). That said, a "branch" in git very definitely does not correspond to an "event", and with things like rebasing, often what it refers to can change in a way where doing what you're talking about may well be hard to keep track of and/or lead to misleading or confusing results.Fiat
@PhilipRego: branch names are... somewhat ephemeral in git, at least as far as history is concerned. I'm fine with a downvote, given how many upvotes this answer has, but I think your frustration is rather with an aspect of git itself, and how it behaves, than with my answer. And some folks do keep track of commit numbers for some purposes. Definitely not what most folks do most of the time, but... there's no other name for commit 6aafd7f... that's just what it's called in git. (Or rather, that's short for a longer name/SHA - see git-scm.com/book/en/v2/Git-Internals-Git-Objects .)Fiat
@PhilipRego: seems like this maybe is running into the realm of specific tech support. (Want to hire me?) And sometimes there aren't easy answers, though if you think you have a distinct question to ask from the original question here, you could ask it, and maybe get answers that way. I didn't go into the shell magic because why that works is off-topic, but basically: <(some command) runs some command, capturing output to a file whose filename is placed into the command line (for diff to use, in this case). More suggested reading: sbf5.com/~cduan/technical/gitFiat
Awesome! You can simplify a little by replacing --old-line-format='' --new-line-format='' with --changed-group-format=''Zeller
Your solutions ("3. With shell magic" and the alternate version) do not work with the bug_compound branch of the GNU MPFR repository. See my solution.Capitulation
@vinc17: hmm, what do you expect it to return? I get ca8e7f78f51ba4905c05c575d4ba93ce2f203e4a, which is the latest commit on bug_compound, but which is in a direct lineage from master -- i.e. there is no split to discover, at least not from what's visible in the git log. Strange in this is commit 3c08b03b6, though, which purports to merge master into bug_compound (not the other way around), and yet, has the latest bug_compound commit (ca8e7f78f) as an ancestor... I guess someone didn't push their bug_compound changes?? I don't know. Not sure this could do better here??Fiat
@Capitulation - oh, right, looking at your solution, I see it seems you were expecting 259db99b5... which is the parent of b9ad6bc37... which is one of two sides of a branch and merge, where bug_compound is on one side of that, with commit ca8e7f78f -- but you'll notice that that commit also shows up in git rev-list --reverse master, it just doesn't show up in bug_compound because (despite the merge comment in 3c08b03b6), it's not far enough forward. Anyway, I tried your solution on my demo repo, and it gives a different answer there, so... it's just differently brittle, I think.Fiat
@Fiat Yes, I expected 259db99b5, which is the parent of the first commit in the bug_compound branch (63cc8dfaf). The last commit in the bug_compound (just before the merge) is ca8e7f78f. I don't know why master was merged into bug_compound and not the other way around (though bug_compound was created from master). Anyway, on this example, this is not ambiguous and the only correct answer is 259db99b5.Capitulation
U
162

You may be looking for git merge-base:

git merge-base finds best common ancestor(s) between two commits to use in a three-way merge. One common ancestor is better than another common ancestor if the latter is an ancestor of the former. A common ancestor that does not have any better common ancestor is a best common ancestor, i.e. a merge base. Note that there can be more than one merge base for a pair of commits.
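
As a rough illustration of that definition, the sketch below builds a throwaway repository and shows that the merge base equals the fork point only until the branches are merged. The branch names master and topic, the tag fork, and the demo identity are all made up for the example. After master is merged into topic, the best common ancestor moves forward to the tip of master.

```shell
# Demo in a scratch repository; all names here are illustrative.
repo=$(mktemp -d) && cd "$repo" || exit 1
git init -q .
git symbolic-ref HEAD refs/heads/master      # fix the branch name for the demo
git config user.name demo && git config user.email demo@example.com

git commit -q --allow-empty -m "root"
git tag fork                                 # remember where topic starts
git checkout -q -b topic
git commit -q --allow-empty -m "work on topic"
git checkout -q master
git commit -q --allow-empty -m "work on master"

# Before any merge, the merge base is the fork point.
base_before=$(git merge-base master topic)

# Merge master into topic: master's tip is now an ancestor of topic.
git checkout -q topic
git merge -q -m "merge master into topic" master
base_after=$(git merge-base master topic)

echo "fork point:        $(git rev-parse fork)"
echo "merge base before: $base_before"
echo "merge base after:  $base_after"
```

This is why merge-base alone stops answering "where did the branch start?" once merges have gone back and forth.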

Undesigning answered 6/10, 2009 at 18:30 Comment(7)
Note also the --all option to "git merge-base"Goosy
This doesn't answer the original question, but most people are asking the much simpler question for which this is the answer :)Ginetteginevra
He said he didn't want the result of git merge-baseLalo
@TomTanner: I just looked at the question history and the original question was edited to include that note about git merge-base five hours after my answer was posted (probably in response to my answer). Nevertheless, I will leave this answer as is because it may still be useful for somebody else who finds this question via search.Undesigning
I believe you could use the results from merge-base --all and then sift through the list of return commits using merge-base --is-ancestor to find the oldest common ancestor from the list, but this isn't the simplest way.Alemannic
@Zeller - you posted your useful suggestion in the wrong answer. You want this. thank you!Septuagint
Thanks @stason! Moved.Zeller
A
46

I've used git rev-list for this sort of thing. For example, (note the 3 dots)

$ git rev-list --boundary branch-a...master | grep "^-" | cut -c2-

will spit out the branch point. Now, it's not perfect; since you've merged master into branch A a couple of times, that'll spit out a couple of possible branch points (basically, the original branch point and then each point at which you merged master into branch A). However, it should at least narrow down the possibilities.

I've added that command to my aliases in ~/.gitconfig as:

[alias]
    diverges = !sh -c 'git rev-list --boundary $1...$2 | grep "^-" | cut -c2-' -

so I can call it as:

$ git diverges branch-a master
Arbour answered 3/3, 2010 at 18:17 Comment(9)
Note: this seems to give the first commit on the branch, rather than the common ancestor. (i.e. it gives G instead of A, per the graph in the original question.) I have an answer that gets A, that I'll be posting presently.Fiat
@lindes: It gives the common ancestor in every case I've tried it. Do you have an example where it doesn't?Arbour
Yes. In my answer (which has a link to a repo you can clone; git checkout topic and then run this with topic in place of branch-a), it lists 648ca357b946939654da12aaf2dc072763f3caee and 37ad15952db5c065679d7fc31838369712f0b338 -- both 37ad159 and 648ca35 are in the ancestry of the current branches (the latter being the current HEAD of topic), but neither is the point before branching happened. Do you get something different?Fiat
@lindes: I was unable to clone your repo (possibly a permissions issue?).Arbour
Oops, sorry! Thank you for letting me know. I forgot to run git update-server-info. It should be good to go now. :)Fiat
It doesn't work. I tried with the test repository, and --boundary branch_A...master doesn't even show the right commit, so no amount of grepping is going to solve that =/Ginetteginevra
@Arbour I got two commits, one is the ancestor of both branches, and another one is the first commit on my branch-a. It would be ok, since I could always do head/tail, but seems that order of that commits is not determined. Is there any option for rev-list to force specific order, like "master branch commit first".Manifestation
I could not get it working in CI. I get "ambiguous argument 'ci-dev...master': unknown revision or path not in the working tree." Any ideas?Smoothspoken
Oh, I found it. I need to use origin/ci-dev...origin/master.Smoothspoken
P
37

If you like terse commands,

git rev-list $(git rev-list --first-parent ^branch_name master | tail -n1)^^! 

Here's an explanation.

The following command gives you the list of all commits in master that occurred after branch_name was created

git rev-list --first-parent ^branch_name master 

Since you only care about the earliest of those commits you want the last line of the output:

git rev-list ^branch_name --first-parent master | tail -n1

The parent of the earliest commit that's not an ancestor of "branch_name" is, by definition, in "branch_name," and is in "master" since it's an ancestor of something in "master." So you've got the earliest commit that's in both branches.

The command

git rev-list commit^^!

is just a way to show the parent commit reference. You could use

git log -1 commit^

or whatever.
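
The explanation above can be sketched as a small function. The name branch_point is hypothetical, and the empty-output check is an addition: when no first-parent commit on master lies outside the branch, the first command prints nothing, so the sketch reports an error instead of passing an empty string on.

```shell
# Hypothetical helper: print the parent of the earliest first-parent commit
# on $2 (default: master) that is not reachable from branch $1.
branch_point() {
    first_after=$(git rev-list --first-parent "^$1" "${2:-master}" -- | tail -n 1)
    if [ -z "$first_after" ]; then
        echo "no commit on ${2:-master} is outside $1" >&2
        return 1
    fi
    # The parent of that commit is, per the reasoning above, in both branches.
    git rev-parse "${first_after}^"
}
```

For example, `branch_point branch_name` would print the same commit as the terse one-liner whenever that one-liner has something to work with.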

PS: I disagree with the argument that ancestor order is irrelevant. It depends on what you want. For example, in this case

_C1___C2_______ master
  \    \_XXXXX_ branch A (the Xs denote arbitrary cross-overs between master and A)
   \_____/ branch B

it makes perfect sense to output C2 as the "branching" commit. This is when the developer branched out from "master." When he branched, branch "B" wasn't even merged in his branch! This is what the solution in this post gives.

If what you want is the last commit C such that all paths from origin to the last commit on branch "A" go through C, then you want to ignore ancestry order. That's purely topological and gives you an idea of since when you have two versions of the code going at the same time. That's when you'd go with merge-base based approaches, and it will return C1 in my example.

Parsec answered 29/8, 2012 at 19:25 Comment(4)
This is by far the cleanest answer, let's get this voted to the top. A suggested edit: git rev-list commit^^! can be simplified as git rev-parse commit^Arsphenamine
This answer is nice, I just replaced git rev-list --first-parent ^branch_name master with git rev-list --first-parent branch_name ^master because if the master branch is 0 commits ahead of the other branch (fast-forwardable to it), no output would be created. With my solution, no output is created if master is strictly ahead (i.e. the branch has been fully merged), which is what I want.Ginni
This won't work unless I'm totally missing something. There are merges in both directions in the example branches. It sounds like you tried to take that into account, but I believe this will cause your answer to fail. git rev-list --first-parent ^topic master will only take you back to the first commit after the last merge from master into topic (if that even exists).Shockley
@Shockley You are correct, this answer is garbage; for instance, in the case that a backmerge has just taken place (merged master into topic), and master has no new commits after that, the first git rev-list --first-parent command outputs nothing at all.Burrstone
C
28

Purpose: This answer tests the various answers presented in this thread.

Test repository

-- X -- A -- B -- C -- D -- F  (master) 
          \     /   \     /
           \   /     \   /
             G -- H -- I -- J  (branch A)

$ git --no-pager log --graph --oneline --all --decorate
* b80b645 (HEAD, branch_A) J - Work in branch_A branch
| *   3bd4054 (master) F - Merge branch_A into branch master
| |\  
| |/  
|/|   
* |   a06711b I - Merge master into branch_A
|\ \  
* | | bcad6a3 H - Work in branch_A
| | * b46632a D - Work in branch master
| |/  
| *   413851d C - Merge branch_A into branch master
| |\  
| |/  
|/|   
* | 6e343aa G - Work in branch_A
| * 89655bb B - Work in branch master
|/  
* 74c6405 (tag: branch_A_tag) A - Work in branch master
* 7a1c939 X - Work in branch master

Correct solutions

The only solution that works is the one provided by lindes, which correctly returns A:

$ diff -u <(git rev-list --first-parent branch_A) \
          <(git rev-list --first-parent master) | \
      sed -ne 's/^ //p' | head -1
74c6405d17e319bd0c07c690ed876d65d89618d5

As Charles Bailey points out though, this solution is very brittle.

If you merge branch_A into master and then merge master into branch_A without intervening commits, then lindes' solution only gives you the most recent first divergence.

That means that for my workflow, I think I'm going to have to stick with tagging the branch point of long running branches, since I can't guarantee that they can reliably be found later.
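
One way to make that tagging habit routine is sketched below, assuming a `<branch>_tag` naming scheme like the one in the test repository is acceptable. It creates the branch and an annotated tag on the same commit in one step, then reads the branch point back from the tag later. Both helper names are hypothetical.

```shell
# Hypothetical helper: create branch $1 at $2 (default: HEAD) and tag the
# commit it starts from as "<branch>_tag".
start_tagged_branch() {
    start=$(git rev-parse --verify "${2:-HEAD}^{commit}") || return 1
    git branch "$1" "$start" &&
        git tag -a "${1}_tag" -m "Tag branch point of $1" "$start"
}

# Hypothetical helper: print the commit branch $1 was created from,
# as recorded by its tag.
tagged_branch_point() {
    git rev-parse --verify "${1}_tag^{commit}"
}
```

Because the tag never moves, `tagged_branch_point branch_A` keeps returning the same commit no matter how many merges happen afterwards.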

This really all boils down to git's lack of what hg calls named branches. The blogger jhw calls these lineages vs. families in his article Why I Like Mercurial More Than Git and his follow-up article More On Mercurial vs. Git (with Graphs!). I would recommend people read them to see why some Mercurial converts miss having named branches in git.

Incorrect solutions

The solution provided by mipadi returns two answers, I and C:

$ git rev-list --boundary branch_A...master | grep ^- | cut -c2-
a06711b55cf7275e8c3c843748daaa0aa75aef54
413851dfecab2718a3692a4bba13b50b81e36afc

The solution provided by Greg Hewgill returns I:

$ git merge-base master branch_A
a06711b55cf7275e8c3c843748daaa0aa75aef54
$ git merge-base --all master branch_A
a06711b55cf7275e8c3c843748daaa0aa75aef54

The solution provided by Karl returns X:

$ diff -u <(git log --pretty=oneline branch_A) \
          <(git log --pretty=oneline master) | \
       tail -1 | cut -c 2-42
7a1c939ec325515acfccb79040b2e4e1c3e7bbe5

Test repository reproduction

To create a test repository:

mkdir "$1"
cd "$1"
git init
git commit --allow-empty -m "X - Work in branch master"
git commit --allow-empty -m "A - Work in branch master"
git branch branch_A
git tag branch_A_tag     -m "Tag branch point of branch_A"
git commit --allow-empty -m "B - Work in branch master"
git checkout branch_A
git commit --allow-empty -m "G - Work in branch_A"
git checkout master
git merge branch_A       -m "C - Merge branch_A into branch master"
git checkout branch_A
git commit --allow-empty -m "H - Work in branch_A"
git merge master         -m "I - Merge master into branch_A"
git checkout master
git commit --allow-empty -m "D - Work in branch master"
git merge branch_A       -m "F - Merge branch_A into branch master"
git checkout branch_A
git commit --allow-empty -m "J - Work in branch_A branch"

My only addition is the tag which makes it explicit about the point at which we created the branch and thus the commit we wish to find.

I doubt the git version makes much difference to this, but:

$ git --version
git version 1.7.1

Thanks to Charles Bailey for showing me a more compact way to script the example repository.

Codify answered 6/10, 2009 at 18:22 Comment(7)
The solution by Karl is easy to fix: diff -u <(git rev-list branch_A) <(git rev-list master) | tail -2 | head -1. Thanks for providing instructions to create the repo :)Ginetteginevra
I think you mean "The cleaned up variation of the solution provided by Karl returns X". The original worked fine it was just ugly :-)Terrigenous
Nope, your original does not work fine. Granted, the variation works even worse. But adding the option --topo-order makes your version work :)Ginetteginevra
@felipec - See my final comment on the answer by Charles Bailey. Alas our chat (and thus all of the old comments) have now been deleted. I will try to update my answer when I get the time.Codify
Interesting. I'd sort of assumed topological was the default. Silly me :-)Terrigenous
Your replication script does not follow chronological order for the letters. X A B G C H I D F J...Denney
Your solution with the git rev-list --first-parent does not work with the bug_compound branch of the GNU MPFR repository. See my solution.Capitulation
A
13

In general, this is not possible. In a branch history a branch-and-merge before a named branch was branched off and an intermediate branch of two named branches look the same.

In git, branches are just the current names of the tips of sections of history. They don't really have a strong identity.

This isn't usually a big issue as the merge-base (see Greg Hewgill's answer) of two commits is usually much more useful, giving the most recent commit which the two branches shared.

A solution relying on the order of parents of a commit obviously won't work in situations where a branch has been fully integrated at some point in the branch's history.

git commit --allow-empty -m root # actual branch commit
git checkout -b branch_A
git commit --allow-empty -m  "branch_A commit"
git checkout master
git commit --allow-empty -m "More work on master"
git merge -m "Merge branch_A into master" branch_A # identified as branch point
git checkout branch_A
git merge --ff-only master
git commit --allow-empty -m "More work on branch_A"
git checkout master
git commit --allow-empty -m "More work on master"

This technique also falls down if an integration merge has been made with the parents reversed (e.g. a temporary branch was used to perform a test merge into master and then fast-forwarded into the feature branch to build on further).

git commit --allow-empty -m root # actual branch point
git checkout -b branch_A
git commit --allow-empty -m  "branch_A commit"
git checkout master
git commit --allow-empty -m "More work on master"
git merge -m "Merge branch_A into master" branch_A # identified as branch point
git checkout branch_A
git commit --allow-empty -m "More work on branch_A"

git checkout -b tmp-branch master
git merge -m "Merge branch_A into tmp-branch (master copy)" branch_A
git checkout branch_A
git merge --ff-only tmp-branch
git branch -d tmp-branch

git checkout master
git commit --allow-empty -m "More work on master"
Almucantar answered 6/10, 2009 at 18:36 Comment(5)
let us continue this discussion in chatCodify
Thanks Charles, you've convinced me, if I want to know the point at which the branch originally diverged, I'm going to have to tag it. I really wish that git had an equivalent to hg's named branches, it would make managing long lived maintenance branches so much easier.Codify
"In git, branches are just the current names of the tips of sections of history. They don't really have a strong identity" That's a scary thing to say and has convinced me that I need to understand Git branches better - thanks (+1)Mournful
In a branch history a branch-and-merge before a named branch was branched off and an intermediate branch of two named branches look the same. Yep. +1.Wellworn
Yep, Git doesn't consider this important. It's convinced me to continue with Hg for now because this is a very critical part of history in software development.Aiello
T
11

Git 2.36 offers a simpler command. Starting from:

(branch_A_tag)
     |
--X--A--B--C--D--F  (main)
      \   / \   /
       \ /   \ /
        G--H--I--J  (branch A)

vonc@vclp MINGW64 ~/git/tests/branchOrigin (branch_A)
git log -1 --decorate --oneline \
  $(git rev-parse \
     $(git rev-list --exclude-first-parent-only ^main branch_A| tail -1)^ \
   )
 80e8436 (tag: branch_A_tag) A - Work in branch main
  • git rev-list --exclude-first-parent-only ^main branch_A gives you J -- I -- H -- G
  • tail -1 gives you G
  • git rev-parse G^ gives you its first parent: A or branch_A_tag
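
Those three steps can be wrapped in a function, sketched here under the assumption that Git 2.36 or newer is installed. The name branch_origin is hypothetical, and the empty-output check is an addition that covers both a branch with no commits of its own and an older Git that rejects the option.

```shell
# Hypothetical helper: first parent of the oldest commit on branch $1 that
# is not on the first-parent chain of $2 (default: main).
# Needs Git 2.36+ for --exclude-first-parent-only.
branch_origin() {
    oldest=$(git rev-list --exclude-first-parent-only "^${2:-main}" "$1" -- | tail -n 1)
    if [ -z "$oldest" ]; then
        echo "no commits on $1 outside ${2:-main} (or Git older than 2.36)" >&2
        return 1
    fi
    git rev-parse "${oldest}^"
}
```

On the graph above, `branch_origin branch_A main` should print the commit that branch_A_tag points to.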

With the test script:

mkdir branchOrigin
cd branchOrigin
git init -b main
git commit --allow-empty -m "X - Work in branch main"
git commit --allow-empty -m "A - Work in branch main"
git tag branch_A_tag     -m "Tag branch point of branch_A"
git commit --allow-empty -m "B - Work in branch main"
git switch -c branch_A branch_A_tag
git commit --allow-empty -m "G - Work in branch_A"
git switch main
git merge branch_A       -m "C - Merge branch_A into branch main"
git switch branch_A
git commit --allow-empty -m "H - Work in branch_A"
git merge main         -m "I - Merge main into branch_A"
git switch main
git commit --allow-empty -m "D - Work in branch main"
git merge branch_A       -m "F - Merge branch_A into branch main"
git switch branch_A
git commit --allow-empty -m "J - Work in branch_A branch"

Which gives you:

vonc@vclp MINGW64 ~/git/tests/branchOrigin (branch_A)
$ git log --oneline --decorate --graph --branches --all
* a55a87e (HEAD -> branch_A) J - Work in branch_A branch
| *   3769cc8 (main) F - Merge branch_A into branch main
| |\
| |/
|/|
* |   1b29fa5 I - Merge main into branch_A
|\ \
* | | e7accbd H - Work in branch_A
| | * 87a62f4 D - Work in branch main
| |/
| *   7bc79c5 C - Merge branch_A into branch main
| |\
| |/
|/|
* | 0f28c9f G - Work in branch_A
| * e897627 B - Work in branch main
|/
* 80e8436 (tag: branch_A_tag) A - Work in branch main
* 5cad19b X - Work in branch main

Which is:

(branch_A_tag)
     |
--X--A--B--C--D--F  (main)
      \   / \   /
       \ /   \ /
        G--H--I--J  (branch A)

With Git 2.36 (Q2 2022), "git log" and friends learned an option --exclude-first-parent-only to propagate UNINTERESTING bit down only along the first-parent chain, just like --first-parent option shows commits that lack the UNINTERESTING bit only along the first-parent chain.

See commit 9d505b7 (11 Jan 2022) by Jerry Zhang (jerry-skydio).
(Merged by Junio C Hamano -- gitster -- in commit 708cbef, 17 Feb 2022)

git-rev-list: add --exclude-first-parent-only flag

Signed-off-by: Jerry Zhang

It is useful to know when a branch first diverged in history from some integration branch in order to be able to enumerate the user's local changes.
However, these local changes can include arbitrary merges, so it is necessary to ignore this merge structure when finding the divergence point.

In order to do this, teach the "rev-list" family to accept "--exclude-first-parent-only", which restricts the traversal of excluded commits to only follow first parent links.

-A-----E-F-G--main
  \   / /
   B-C-D--topic

In this example, the goal is to return the set {B, C, D} which represents a topic branch that has been merged into main branch.
git rev-list topic ^main will end up returning no commits since excluding main will end up traversing the commits on topic as well.
git rev-list --exclude-first-parent-only topic ^main however will return {B, C, D} as desired.

Add docs for the new flag, and clarify the doc for --first-parent to indicate that it applies to traversing the set of included commits only.

rev-list-options now includes in its man page:

--first-parent

When finding commits to include, follow only the first parent commit upon seeing a merge commit.

This option can give a better overview when viewing the evolution of a particular topic branch, because merges into a topic branch tend to be only about adjusting to updated upstream from time to time, and this option allows you to ignore the individual commits brought in to your history by such a merge.

rev-list-options now includes in its man page:

--exclude-first-parent-only

When finding commits to exclude (with a '^'), follow only the first parent commit upon seeing a merge commit.

This can be used to find the set of changes in a topic branch from the point where it diverged from the remote branch, given that arbitrary merges can be valid topic branch changes.


As noted by anarcat in the comments, if your branch does not derive from master, but from main, or prod, or... any other branch, you can use:

git for-each-ref --merged="$local_ref" --no-contains="$local_ref" \
    --format="%(refname:strip=-1)" --sort='-*authordate' refs/heads

philb also mentions in the comments the --boundary option (Output excluded boundary commits. Boundary commits are prefixed with -):

git rev-list --exclude-first-parent-only --boundary ^main  branch_A | tail -1 

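to get A directly, without needing an additional git rev-parse G^. However, as reported in the comments, with Git 2.42.0 this returns C on the test case above: --boundary lists both A and C as boundary commits, and C comes last.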

Teran answered 20/2, 2022 at 11:21 Comment(11)
I am not able to confirm that this works. I tried it with the test from above. git rev-list... returns J, I, H and G (in that order). git rev-list ...|head -1 returns J. git rev-parse ... gives me F. I assume "A" in the command of this answer to be the "branch A". Tested with git version 2.36.1.windows.1 operating on branch A.Verdict
@Verdict I agree. I have now fully tested that scenario, and fixed the git rev-parse/rev-list commands. Let me know if the revised answer works better for you.Teran
Yes, I can confirm, that the current version outputs the result I expected. Thank you very much for the follow-up! 🙏Verdict
all of the answers here assume you're branching from "master" which is not necessarily a given (you could branch from "prod" or "main" or whatever). i'm using the above with this mouthful instead of "main": git for-each-ref --merged="$local_ref" --no-contains="$local_ref" --format="%(refname:strip=-1)" --sort='-*authordate' refs/headsAnalyze
One could probably add --boundary: git rev-list --exclude-first-parent-only --boundary ^main branch_A | tail -1 to get A directly, without needing an additional git rev-parse G^Scabby
@Scabby I tested with git 2.42.0 and the test case provided in this answer and git rev-list --exclude-first-parent-only --boundary ^main branch_A | tail -1 returns commit C, not AElectro
@GabrielDevillers So you are saying that philb's option does not work. Interesting, not sure what changed.Teran
Indeed, just tested it and with --boundary, commit A is listed as a boundary commit, but so is commit C, and for some reason C is last, so tail -1 returns it. That's also the case with --topo-order, which is surprising to me, since C is a descendant of A...Scabby
@Scabby OK, thank you for your first edit. Could you add your conclusion in a second edit to this answer (last section)?Teran
@Teran I was wondering if this function is commutative. My guess was it was not given how it is defined. Surprisingly my test script found it commutative on all commits from your example. I was able to find a counter example but unfortunately I did not start from your example so I cannot suggest a change to it.Electro
@GabrielDevillers Interesting. It would be best to discuss it with a new question: those comments are a bit too crowded.Teran
T
7

How about something like

git log --pretty=oneline master > 1
git log --pretty=oneline branch_A > 2

git rev-parse `diff 1 2 | tail -1 | cut -c 3-42`^
Terrigenous answered 5/11, 2009 at 10:30 Comment(7)
This works. It's really cumbersome but it's the only thing I've found that actually seems to do the job.Cabbagehead
Git alias equivalent: diverges = !bash -c 'git rev-parse $(diff <(git log --pretty=oneline ${1}) <(git log --pretty=oneline ${2}) | tail -1 | cut -c 3-42)^' (with no temporary files)Discoverer
@conny: Oh, wow -- I'd never seen the <(foo) syntax... that's incredibly useful, thanks! (Works in zsh, too, FYI.)Fiat
this seems to give me the first commit on the branch, rather than the common ancestor. (i.e. it gives G instead of A, per the graph in the original question.) I think I've found an answer, though, which I'll post presently.Fiat
Instead of 'git log --pretty=oneline' you can just do 'git rev-list', then you can skip the cut as well. Moreover, this gives the parent commit of the point of divergence, so just tail -2 | head -1. So: diff -u <(git rev-list branch_A) <(git rev-list master) | tail -2 | head -1Ginetteginevra
Er, looks like both Karl and my cleaned up version need --topo-order to work: diff -u <(git rev-list --topo-order branch_A) <(git rev-list --topo-order master) | tail -2 | head -1Ginetteginevra
AFAICS, the tail -2 relies on a unified diff output of 2 context lines, but for me it is 3 lines, so I am getting the wrong commit (X in the OP example) so I need to do tail -3. Unless this approach is broken in my case with lots of commits and feature branches left and right. One can set it using diff -U 2 to make it less fragile, also remove the leading space using cut: diff -U 2 <(git rev-list --topo-order branch_A) <(git rev-list --topo-order master) | tail -2 | head -1 | cut -c 2-Photo
H
7

A simple way to just make it easier to see the branching point in git log --graph is to use the option --first-parent.

For example, take the repo from the accepted answer:

$ git log --all --oneline --decorate --graph

*   a9546a2 (HEAD -> master, origin/master, origin/HEAD) merge from topic back to master
|\  
| *   648ca35 (origin/topic) merging master onto topic
| |\  
| * | 132ee2a first commit on topic branch
* | | e7c863d commit on master after master was merged to topic
| |/  
|/|   
* | 37ad159 post-branch commit on master
|/  
* 6aafd7f second commit on master before branching
* 4112403 initial commit on master

Now add --first-parent:

$ git log --all --oneline --decorate --graph --first-parent

* a9546a2 (HEAD -> master, origin/master, origin/HEAD) merge from topic back to master
| * 648ca35 (origin/topic) merging master onto topic
| * 132ee2a first commit on topic branch
* | e7c863d commit on master after master was merged to topic
* | 37ad159 post-branch commit on master
|/  
* 6aafd7f second commit on master before branching
* 4112403 initial commit on master

That makes it easier!

Note if the repo has lots of branches you're going to want to specify the 2 branches you're comparing instead of using --all:

$ git log --decorate --oneline --graph --first-parent master origin/topic
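The same first-parent view can also be queried without reading the graph. The following is a rough sketch rather than a Git feature: first_parent_fork is a made-up helper name, and it assumes both branches were always merged "into" (never fast-forwarded over), so that each first-parent chain really is the branch's own line of work.

```shell
# first_parent_fork <a> <b>   (hypothetical helper name)
# Prints the newest commit that lies on the first-parent chains of both refs.
first_parent_fork() {
    # First-parent chain of <a>, one full hash per line.
    chain_a=$(git rev-list --first-parent "$1")
    # Walk <b>'s first-parent chain from its tip and stop at the first
    # commit that also appears in <a>'s chain.
    git rev-list --first-parent "$2" | while read -r commit; do
        case "$chain_a" in
            *"$commit"*) echo "$commit"; break ;;
        esac
    done
}
```

On the repository above, first_parent_fork master origin/topic would be expected to print the full hash of 6aafd7f, the commit where the two columns of the --first-parent graph meet.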
Hexone answered 2/4, 2020 at 12:53 Comment(0)
R
6

Surely I'm missing something, but IMO, all the problems above are caused because we are always trying to find the branch point going back in the history, and that causes all sorts of problems because of the merging combinations available.

Instead, I've followed a different approach, based on the fact that both branches share a lot of history: all the history before branching is exactly the same. So instead of going back, my proposal is to go forward (from the first commit), looking for the first difference between the two branches. The branch point will be, simply, the parent of the first difference found.

In practice:

#!/bin/bash
diff <( git rev-list "${1:-master}" --reverse --topo-order ) \
     <( git rev-list "${2:-HEAD}" --reverse --topo-order) \
--unified=1 | sed -ne 's/^ //p' | head -1

And it's solving all my usual cases. Sure there are border ones not covered but... ciao :-)

Renown answered 1/6, 2012 at 11:47 Comment(2)
diff <( git rev-list "${1:-master}" --first-parent ) <( git rev-list "${2:-HEAD}" --first-parent) -U1 | tail -1Grounder
i found this to be faster (2-100x): comm --nocheck-order -1 -2 <(git rev-list --reverse --topo-order topic) <(git rev-list --reverse --topo-order master) | head -1Pryer
G
5

After a lot of research and discussions, it's clear there's no magic bullet that would work in all situations, at least not in the current version of Git.

That's why I wrote a couple of patches that add the concept of a tail branch. Each time a branch is created, a pointer to the original point is created too, the tail ref. This ref gets updated every time the branch is rebased.

To find out the branch point of the devel branch, all you have to do is use devel@{tail}, that's it.

https://github.com/felipec/git/commits/fc/tail

Ginetteginevra answered 1/10, 2013 at 4:39 Comment(2)
Might be the only stable solution. Did you see if this could get into git? I didn't see a pull request.Photo
@AlexanderKlimetschek I didn't send the patches, and I don't think those would be accepted. However, I tried a different method: an "update-branch" hook which does something very similar. This way by default Git wouldn't do anything, but you could enable the hook to update the tail branch. You wouldn't have devel@{tail} though, but wouldn't be so bad to use tails/devel instead.Ginetteginevra
W
4

I recently needed to solve this problem as well and ended up writing a Ruby script for this: https://github.com/vaneyckt/git-find-branching-point

Whiles answered 29/7, 2012 at 14:57 Comment(1)
It is not working, grit failed in "unpack_object_header_gently" and is not maintained.Honea
L
4

I seem to be getting some joy with

git rev-list branch...master

The last line you get is the first commit on the branch, so then it's a matter of getting the parent of that. So

git rev-list -1 `git rev-list branch...master | tail -1`^

Seems to work for me and doesn't need diffs and so on (which is helpful as we don't have that version of diff)

Correction: This doesn't work if you are on the master branch, but I'm doing this in a script so that's less of an issue

Lalo answered 21/3, 2013 at 9:34 Comment(0)
M
4

Sometimes it is effectively impossible (with some exceptions where you might be lucky enough to have additional data) and the solutions here won't work.

Git doesn't preserve ref history (which includes branches). It only stores the current position for each branch (the head). This means you can lose some branch history in git over time. Whenever you branch for example, it's immediately lost which branch was the original one. All a branch does is:

git checkout branch1    # refs/branch1 -> commit1
git checkout -b branch2 # branch2 -> commit1

You might assume that the first one committed to is the branch. This tends to be the case but it's not always so. There's nothing stopping you from committing to either branch first after the above operation. Additionally, git timestamps aren't guaranteed to be reliable. It's not until you commit to both that they truly become branches structurally.
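To see this for yourself, here is a throwaway demonstration in a scratch repository. The directory, identity and branch names are placeholders chosen for the example.

```shell
# Scratch repository in a temporary directory (all names are illustrative).
repo=$(mktemp -d)
cd "$repo"
git init -q .
git -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "commit1"
git checkout -q -b branch1     # refs/heads/branch1 -> commit1
git checkout -q -b branch2     # refs/heads/branch2 -> commit1, no new commit
# Each line shows a branch name and the commit it points to. The hashes match,
# and nothing in the commit graph records which name existed first.
git for-each-ref --format='%(refname:short) %(objectname)' \
    refs/heads/branch1 refs/heads/branch2
```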

While in diagrams we tend to number commits conceptually, git has no real stable concept of sequence when the commit tree branches. In this case you can assume the numbers (indicating order) are determined by timestamp (it might be fun to see how a git UI handles things when you set all the timestamps to the same).

This is what a human expects conceptually:

After branch:
       C1 (B1)
      /
    -
      \
       C1 (B2)
After first commit:
       C1 (B1)
      /
    - 
      \
       C1 - C2 (B2)

This is what you actually get:

After branch:
    - C1 (B1) (B2)
After first commit (human):
    - C1 (B1)
        \
         C2 (B2)
After first commit (real):
    - C1 (B1) - C2 (B2)

You would assume B1 to be the original branch but it could in fact simply be a dead branch (someone did checkout -b but never committed to it). It's not until you commit to both that you get a legitimate branch structure within git:

Either:
      / - C2 (B1)
    -- C1
      \ - C3 (B2)
Or:
      / - C3 (B1)
    -- C1
      \ - C2 (B2)

You always know that C1 came before C2 and C3, but you never reliably know if C2 came before C3 or C3 came before C2 (because you can set the time on your workstation to anything, for example). B1 and B2 are also misleading, as you can't know which branch came first. You can make a very good and usually accurate guess at it in many cases. It is a bit like a race track: all things being equal with the cars, you can assume that a car that comes in a lap behind started a lap behind. We also have conventions that are very reliable; for example, master will nearly always represent the longest-lived branch, although sadly I have seen cases where even this is not the case.

The example given here is a history preserving example:

Human:
    - X - A - B - C - D - F (B1)
           \     / \     /
            G - H ----- I - J (B2)
Real:
            B ----- C - D - F (B1)
           /       / \     /
    - X - A       /   \   /
           \     /     \ /
            G - H ----- I - J (B2)

Real here is also misleading because we as humans read it left to right, root to leaf (ref). Git does not do that. Where we do (A->B) in our heads git does (A<-B or B->A). It reads it from ref to root. Refs can be anywhere but tend to be leaves, at least for active branches. A ref points to a commit and commits only contain a link to their parent(s), not to their children. When a commit is a merge commit it will have more than one parent. The first parent is always the original commit that was merged into. The other parents are always commits that were merged into the original commit.

Paths:
    F->(D->(C->(B->(A->X)),(H->(G->(A->X))))),(I->(H->(G->(A->X))),(C->(B->(A->X)),(H->(G->(A->X)))))
    J->(I->(H->(G->(A->X))),(C->(B->(A->X)),(H->(G->(A->X)))))

This is not a very efficient representation, rather an expression of all the paths git can take from each ref (B1 and B2).

Git's internal storage looks more like this (note that A as a parent appears twice):

    F->D,I | D->C | C->B,H | B->A | A->X | J->I | I->H,C | H->G | G->A

If you dump a raw git commit you'll see zero or more parent fields. If there are zero, it means no parent and the commit is a root (you can actually have multiple roots). If there's one, it means there was no merge and it's not a root commit. If there is more than one, it means that the commit is the result of a merge and all of the parents after the first are the commits that were merged in.
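The parent fields can be inspected with git cat-file -p, which prints the raw commit object. A small sketch, where count_parents is a made-up helper name:

```shell
# count_parents <commit>   (hypothetical helper name)
# Prints how many "parent" header lines the raw commit object has:
# 0 for a root commit, 1 for an ordinary commit, 2 or more for a merge.
count_parents() {
    # The headers end at the first blank line; the commit message follows it,
    # so sed stops there to avoid counting text inside the message.
    git cat-file -p "$1" | sed '/^$/q' | grep -c '^parent '
}
```

For example, count_parents HEAD on a merge commit of two branches prints 2, and the order of the parent lines is the parent order described above.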

Paths simplified:
    F->(D->C),I | J->I | I->H,C | C->(B->A),H | H->(G->A) | A->X
Paths first parents only:
    F->(D->(C->(B->(A->X)))) | F->D->C->B->A->X
    J->(I->(H->(G->(A->X)))) | J->I->H->G->A->X
Or:
    F->D->C | J->I | I->H | C->B->A | H->G->A | A->X
Paths first parents only simplified:
    F->D->C->B->A | J->I->H->G->A | A->X
Topological:
    - X - A - B - C - D - F (B1)
           \
            G - H - I - J (B2)

When both hit A their chain will be the same; before that their chain will be entirely different. The first commit that two other commits have in common is their common ancestor, the point from which they diverged. There might be some confusion here between the terms commit, branch and ref. You can in fact merge a commit. This is what merge really does. A ref simply points to a commit, and a branch is nothing more than a ref in the folder .git/refs/heads; the folder location is what determines that a ref is a branch rather than something else such as a tag.
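As a quick illustration of "the location decides what a ref is", the full name of a ref can be asked for with git rev-parse --symbolic-full-name. The helper name ref_kind below is invented for this sketch, and it only distinguishes the two namespaces mentioned here:

```shell
# ref_kind <name>   (hypothetical helper name)
# Prints "branch", "tag" or "other" depending on where the ref lives.
ref_kind() {
    # Expands a short name such as "topic" to e.g. "refs/heads/topic".
    full=$(git rev-parse --symbolic-full-name "$1") || return 1
    case "$full" in
        refs/heads/*) echo branch ;;
        refs/tags/*)  echo tag ;;
        *)            echo other ;;
    esac
}
```

Apart from that prefix, a branch and a lightweight tag pointing at the same commit hold exactly the same information: one commit hash.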

Where you lose history is that merge will do one of two things depending on circumstances.

Consider:

      / - B (B1)
    - A
      \ - C (B2)

In this case a merge in either direction will create a new commit with the first parent as the commit pointed to by the current checked out branch and the second parent as the commit at the tip of the branch you merged into your current branch. It has to create a new commit as both branches have changes since their common ancestor that must be combined.

      / - B - D (B1)
    - A      /
      \ --- C (B2)

At this point D (B1) now has both sets of changes from both branches (itself and B2). However the second branch doesn't have the changes from B1. If you merge the changes from B1 into B2 so that they are synchronised then you might expect something that looks like this (you can force git merge to do it like this, however, with --no-ff):

Expected:
      / - B - D (B1)
    - A      / \
      \ --- C - E (B2)
Reality:
      / - B - D (B1) (B2)
    - A      /
      \ --- C

You will get that even if B1 has additional commits. As long as there aren't changes in B2 that B1 doesn't have, the two branches will be merged. It does a fast forward which is like a rebase (rebases also eat or linearise history), except unlike a rebase as only one branch has a change set it doesn't have to apply a changeset from one branch on top of that from another.

From:
      / - B - D - E (B1)
    - A      /
      \ --- C (B2)
To:
      / - B - D - E (B1) (B2)
    - A      /
      \ --- C
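The difference between the two behaviours can be reproduced in a scratch repository. This is only a demonstration under assumed names: B1 and B2 follow the diagrams, while the temporary directory, the demo identity and the g wrapper are placeholders.

```shell
# Scratch repository; branch names B1/B2 follow the diagrams above.
repo=$(mktemp -d)
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/B1      # name the initial branch B1
g() { git -c user.name=demo -c user.email=demo@example.com "$@"; }

g commit -q --allow-empty -m A
git branch B2                            # B2 starts at A
g commit -q --allow-empty -m B           # B1 moves to B
git checkout -q B2
g commit -q --allow-empty -m C           # B2 moves to C: real divergence
git checkout -q B1
g merge -q -m D B2                       # both sides changed: merge commit D

# B2 has nothing that B1 lacks, so merging B1 into B2 only moves the ref.
git checkout -q B2
g merge -q B1
ff_tips=$(git rev-parse B1 B2 | sort -u | wc -l)
echo "distinct tips after fast-forward: $ff_tips"

# With --no-ff the merge is recorded as a commit of its own.
git checkout -q B1
g commit -q --allow-empty -m E
git checkout -q B2
g merge -q --no-ff -m "merge B1 into B2" B1
noff_words=$(git rev-list --parents -n 1 B2 | wc -w)
echo "parents of B2's tip with --no-ff: $((noff_words - 1))"
```

After the fast-forward both names point at D, so the count of distinct tips is 1; after the --no-ff merge, the tip of B2 is a new commit with two parents.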

If you cease work on B2 then things are largely fine for preserving history in the long run. Typically only B1 (which might be master) will advance, so the location of B2 in B1's history successfully represents the point at which it was merged into B1. This is what git expects you to do: branch B from A, then merge A into B as much as you like as changes accumulate; however, once you merge B back into A, it's not expected that you will work on B any further. If you carry on working on your branch after fast-forward merging it back into the source branch, then you're erasing B's previous history each time. You're really creating a new branch each time: after the fast-forward, you commit to the source and then commit to the branch. What you end up with is lots of branches/merges that you can see in the history and structure, but without the ability to determine what the name of each branch was, or whether what looks like two separate branches is really the same branch.

         0   1   2   3   4 (B1)
        /-\ /-\ /-\ /-\ /
    ----   -   -   -   -
        \-/ \-/ \-/ \-/ \
         5   6   7   8   9 (B2)

1 to 3 and 5 to 8 are structural branches that show up if you follow the history for either 4 or 9. There's no way in git to know which of these unnamed and unreferenced structural branches belongs to which of the named and referenced branches at the end of the structure. You might assume from this drawing that 0 to 4 belongs to B1 and 5 to 9 belongs to B2, but apart from 4 and 9 we can't know which belongs to which branch; I've simply drawn it in a way that gives the illusion of that. 0 might belong to B2 and 5 might belong to B1. There are 16 different possibilities in this case for which named branch each of the structural branches could belong to. This is assuming that none of these structural branches came from a deleted branch or as a result of merging a branch into itself when pulling from master (the same branch name on two repos is in fact two branches; a separate repository is like branching all branches).

There are a number of git strategies that work around this. You can force git merge to never fast forward and always create a merge commit. A horrible way to preserve branch history is with tags and/or branches (tags are really recommended) according to some convention of your choosing. I really wouldn't recommend a dummy empty commit in the branch you're merging into. A very common convention is to not merge into an integration branch until you want to genuinely close your branch. This is a practice that people should attempt to adhere to, as otherwise you're working around the point of having branches. However, in the real world the ideal is not always practical, meaning that doing the right thing is not viable for every situation. If what you're doing on a branch is isolated, that can work, but otherwise you might be in a situation where multiple developers working on something need to share their changes quickly (ideally you might want to be working on one branch, but not all situations suit that either, and generally two people working on a branch is something you want to avoid).

Mccue answered 9/11, 2017 at 14:17 Comment(1)
"Git doesn't preserve ref history" It does, but not by default and not for extensive time. See man git-reflog and the part about dates: "master@{one.week.ago} means "where master used to point to one week ago in this local repository"". Or the discussion on <refname>@{<date>} in man gitrevisions. And core.reflogExpire in man git-config.Tattan
G
3

Here's an improved version of my previous answer. It relies on the commit messages from merges to find where the branch was first created.

It works on all the repositories mentioned here, and I've even addressed some tricky ones that spawned on the mailing list. I also wrote tests for this.

find_merge ()
{
    local selection extra
    test "$2" && extra=" into $2"
    git rev-list --min-parents=2 --grep="Merge branch '$1'$extra" --topo-order ${3:---all} | tail -1
}

branch_point ()
{
    local first_merge second_merge merge
    first_merge=$(find_merge $1 "" "$1 $2")
    second_merge=$(find_merge $2 $1 $first_merge)
    merge=${second_merge:-$first_merge}

    if [ "$merge" ]; then
        git merge-base $merge^1 $merge^2
    else
        git merge-base $1 $2
    fi
}
Ginetteginevra answered 30/5, 2012 at 17:51 Comment(0)
A
3

The following command will reveal the SHA1 of Commit A

git merge-base --fork-point A

Alessandro answered 3/5, 2017 at 12:1 Comment(2)
This won't work if parent and child branches have intermediate merges of each other in between.Euphonium
Original poster specifies that this won't work and that he's looking for something else.Killingsworth
C
3

Not quite a solution to the question but I thought it was worth noting the approach I use when I have a long-living branch:

At the same time I create the branch, I also create a tag with the same name but with an -init suffix, for example feature-branch and feature-branch-init.
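To avoid having to remember the second step, the two commands can be bundled. A minimal sketch, assuming a lightweight tag is enough; the function name new_branch_with_init_tag is invented here and the -init suffix is just the convention described above:

```shell
# new_branch_with_init_tag <branch> [<start-point>]   (hypothetical helper)
# Creates <branch> and a lightweight tag <branch>-init at the same commit.
new_branch_with_init_tag() {
    start=${2:-HEAD}
    git branch "$1" "$start" &&
    git tag "$1-init" "$start"
}
```

Later on, git rev-parse feature-branch-init gives the branch point no matter how many merges have happened in between, as long as nobody moves or deletes the tag.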

(It is kind of bizarre that this is such a hard question to answer!)

Cobham answered 31/1, 2018 at 13:14 Comment(2)
Considering the sheer mind-boggling stupidity of designing a concept of "branch" without any notion of When and where it was created... plus the immense complications of other suggested solutions - by people trying to out-smart this way-too-smart thing, I think I'd prefer your solution. Only it burdens one with the need to REMEMBER to do this every time you create a branch - a thing git users are making very often. In addition - I read somewhere that 'tag's have a penalty of being 'heavy'. Still, I think that's what I think I'll do.Stumper
Is there a way to automate this? A way to tell git to do it automatically for you? I think that would definitely be the best approachCentimeter
B
2

Seems like using reflog solves this: git reflog <branchname> shows all the commits of the branch, including branch creation.

This is from a branch that had 2 commits before it was merged back to master.

git reflog june-browser-updates
b898b15 (origin/june-browser-updates, june-browser-updates) june-browser-updates@{0}: commit: sorted cve.csv
467ae0e june-browser-updates@{1}: commit: browser updates and cve additions
d6a37fb june-browser-updates@{2}: branch: Created from HEAD
Booking answered 17/6, 2021 at 20:17 Comment(1)
git reflog <branchname> | tail -n 1 | cut -f1 -d' ' would give you the short hash of the parent the branch came fromBooking
R
1

To find commits from the branching point, you could use this.

git log --ancestry-path master..topicbranch
Rupee answered 11/7, 2014 at 16:32 Comment(1)
This command does not work for me on the given example. Please what would you provide as parameters for the commit range?Fabriane
C
1

A solution is to find the first divergence, then take the parent. With zsh, this can be done as follows (EDIT: compared to my first answer, I've added the missing --topo-order; I forgot it as I did tests on a repository where all commits had the same date, as generated by a script):

git rev-parse "$(diff <(git rev-list --topo-order --reverse master) \
                      <(git rev-list --topo-order --reverse branch_A) | \
                 sed -n '2{s/^. //p;q}')^"

The sed selects the first commit that appears in only one of the branches. Then the git rev-parse "$(...)^" outputs its parent.

Notes:

  • This works whether branch_A has been merged with master or not (in case of a merge, the head of branch_A corresponds to the last commit in this branch before the merge).
  • In some cases, the notion of branch point is ambiguous. So one may not get what one actually wants. This case also occurs in the GNU MPFR repository (4.0 branch).
  • The diff is not optimal since one just needs the first difference, but this seems to be the simplest solution with standard Unix utilities.

EDIT:

In case of a merge somewhere, it is possible that the first commits before the merge in one of the two branches are not part of the divergence seen by using diff (note that the result may still make sense in some cases, though). So it is better to look at the first divergence on both sides, i.e. for diff, the first insertion and the first deletion (when available).

Another possible issue with the above solution is that in case of a merge commit, ^ selects the first parent, while one would like merges to be regarded as symmetrical. So, ^@ is preferred in order to select both parents.

Finally, among the possible branch points obtained with the above remarks, one chooses the oldest one: this is done by git rev-list --topo-order --reverse --no-walk ... | head -n 1 below (note that because of --reverse, one cannot use -1 or -n 1 as a git rev-list option instead of the pipe to head -n 1).

So, here is a complete solution, usable as a script, with master and HEAD as defaults for the branches:

git rev-list --topo-order --reverse --no-walk \
  $(diff -u <(git rev-list --topo-order --reverse "${1:-master}") \
            <(git rev-list --topo-order --reverse "${2:-HEAD}") | \
      perl -e 'while (<>) { /^([-+])([^-+].*)/ and $h{$1} //= "$2^@\n";
                            last if %h > 1 }
               print values %h') | head -n 1

The --topo-order description is not very detailed, but this appears to work as expected on various complex examples. But it might also be possible to have even more complex examples where the branch point is not well defined.

Capitulation answered 5/8, 2023 at 2:23 Comment(3)
Interesting... I think this only works in limited circumstances. At the very least, it fails to work on the test repo I set up and link to in my answer -- it there gives 37ad159, which isn't on topic, instead of 6aafd7f, which is the divergence point. I think this basically only works because of a weird artifact in the state of the mpfr tree... though I concede it does give a useful result there.Fiat
@Fiat I need to revise my answer. First --topo-order needs to be added to both git rev-list instances (I did some tests where all the commits had the same date, so that I didn't see that --topo-order was needed). Then I've found how to avoid the issue with the ubf2 example, but I need to find how I can simplify the command in this case.Capitulation
@Fiat I have corrected my solution by adding --topo-order, and I have proposed a second solution, which works as expected on all the examples I have tested. In particular, with your test repo, it gives 6aafd7f as wanted.Capitulation
W
0

The problem appears to be to find the most recent, single-commit cut between both branches on one side, and the earliest common ancestor on the other (probably the initial commit of the repo). This matches my intuition of what the "branching off" point is.

That in mind, this is not at all easy to compute with normal git shell commands, since git rev-list -- our most powerful tool -- doesn't let us restrict the path by which a commit is reached. The closest we have is git rev-list --boundary, which can give us a set of all the commits that "blocked our way". (Note: git rev-list --ancestry-path is interesting but I don't know how to make it useful here.)

Here is the script: https://gist.github.com/abortz/d464c88923c520b79e3d. It's relatively simple, but due to a loop it's complicated enough to warrant a gist.

Note that most other solutions proposed here can't possibly work in all situations for a simple reason: git rev-list --first-parent isn't reliable in linearizing history because there can be merges with either ordering.

git rev-list --topo-order, on the other hand, is very useful -- for walking commits in topological order -- but doing diffs is brittle: there are multiple possible topological orderings for a given graph, so you are depending on a certain stability of the orderings. That said, strongk7's solution probably works damn well most of the time. However it's slower than mine as a result of having to walk the entire history of the repo... twice. :-)

Whittle answered 1/2, 2016 at 0:7 Comment(1)
Your intuition is reasonable, but histories can exist with no such single cut (even with a single root). Consider the union of the linear histories ABCDEFG, BHIJKG, DLMN, and IOPN: the heads G and N diverged at D and I with perfect symmetry (disregarding parent order).Nonstriated
M
0

The following implements git equivalent of svn log --stop-on-copy and can also be used to find branch origin.

Approach

  1. Get head for all branches
  2. Collect the mergeBase of the target branch with each other branch
  3. git.log and iterate
  4. Stop at first commit that appears in the mergeBase list

Just as all rivers run to the sea, all branches run to master, and therefore we find a merge base even between seemingly unrelated branches. As we walk back from the branch head through its ancestors, we can stop at the first potential merge base, since in theory it should be the origin point of this branch.
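
The steps above can be sketched in plain shell rather than JGit. This is only an illustration under assumptions: the helper name branch_origin is made up, and only local branches under refs/heads are considered.

```shell
# Hypothetical helper: walk back from branch "$1" and print the first
# commit that is a merge base between "$1" and any other local branch.
branch_origin() {
    target=$1
    # Steps 1 and 2: collect the merge base of the target with every other
    # local branch. Unrelated histories have no merge base, hence "|| true".
    bases=$(
        git for-each-ref --format='%(refname:short)' refs/heads |
        while read -r other; do
            [ "$other" = "$target" ] && continue
            git merge-base "$target" "$other" || true
        done
    )
    # Steps 3 and 4: iterate over the target's history, newest first, and
    # stop at the first commit that appears in the merge-base list.
    git rev-list "$target" |
    while read -r commit; do
        if printf '%s\n' "$bases" | grep -qxF "$commit"; then
            echo "$commit"
            break
        fi
    done
}
```

Calling branch_origin topic prints one hash, or nothing if no commit in the branch's history is a merge base with another branch.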

Notes

  • I haven't tried this approach where sibling and cousin branches merged between each other.
  • I know there must be a better solution.

details: https://mcmap.net/q/12360/-what-39-s-the-jgit-equivalent-for-quot-git-merge-base-fork-point-brancha-branchb-quot

Monohydroxy answered 12/2, 2016 at 0:53 Comment(0)
G
0

Why not use

git log master..enter_your_branch_here --oneline | tail -1

This lists all of the commits that Branch A has and master doesn't (the function of ..); tail -1 returns the last line of output, which is the first commit of the specified branch (Branch A).

Then, with that commit's SHA

git log enter_the_sha_here^1 --oneline | head -1

This lists the first parent of the specified commit (the function of ^1) followed by its ancestors; head -1 returns the first line of output, which is "one commit prior" to the earliest commit in Branch A, aka the "branch point".


As a single, executable command:

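# Take the oldest commit that is on HEAD but not on master...
for COMMIT in $(git log --format=format:%H master..HEAD | tail -1) ; do
    # ...and print its first parent, the candidate branch point.
    git log "$COMMIT^1" --oneline | head -1
done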

Run the above from within Branch A (the function of HEAD)

Gavage answered 14/5, 2022 at 0:0 Comment(0)
L
0

Simple Answer

Run merge-base on the two branches to find their common ancestor(s).

git merge-base master branch_a
# =>  commit_1234

# now do what you want with the commit id

git show commit_1234

# => commit commit_1234
# Author: BenKoshy
# Date:   Today
# etc


Lunar answered 9/8, 2023 at 2:54 Comment(0)
S
-2

You can examine the reflog of branch A to find from which commit it was created, as well as the full history of which commits that branch pointed to. Reflogs are in .git/logs.
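
For example, a minimal sketch that reads the oldest surviving reflog entry of a branch. The helper name branch_creation_commit is made up, and the result is only meaningful if the reflog still exists locally and has not been pruned.

```shell
# Hypothetical helper: print the commit recorded by the oldest surviving
# reflog entry of branch "$1". "git log -g" walks the reflog, newest
# entry first, so the last line is the oldest entry.
branch_creation_commit() {
    git log -g --format='%H' "$1" | tail -1
}
```

Running git reflog show branch_A by hand shows the same entries with their messages; the oldest one typically reads "branch: Created from ...".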

Scaler answered 6/10, 2009 at 23:42 Comment(1)
I don't think this works in general because the reflog can be pruned. And I don't think (?) reflogs get pushed either, so this would only work in a single-repo situation.Cabbagehead
M
-2

You could use the following command to return the oldest commit in branch_a that is not reachable from master:

git rev-list branch_a ^master | tail -1

Perhaps with an additional sanity check that the parent of that commit is actually reachable from master...
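
A minimal sketch of that sanity check, assuming a made-up helper name branch_point_candidate. It uses git merge-base --is-ancestor to confirm that the parent is reachable from the base branch, and it still has the limitation that it only sees commits not yet merged into master.

```shell
# Hypothetical helper: print the parent of the oldest commit that is on
# "$2" (branch) but not reachable from "$1" (base), provided that parent
# is itself reachable from the base.
branch_point_candidate() {
    base=$1
    branch=$2
    # Last line of rev-list output: the oldest commit unique to the branch.
    oldest=$(git rev-list "$branch" "^$base" | tail -1)
    if [ -z "$oldest" ]; then
        echo "no commits on $branch that are not on $base" >&2
        return 1
    fi
    # A root commit has no parent, so "$oldest^" may not resolve.
    parent=$(git rev-parse --verify --quiet "$oldest^") || {
        echo "$oldest has no parent" >&2
        return 1
    }
    # Sanity check: the parent must be an ancestor of the base branch.
    if git merge-base --is-ancestor "$parent" "$base"; then
        echo "$parent"
    else
        echo "$parent is not reachable from $base" >&2
        return 1
    fi
}
```

Usage would be branch_point_candidate master branch_a.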

Mu answered 16/4, 2012 at 13:5 Comment(1)
This doesn't work. If branch_a gets merged into master once, and then continues, the commits on that merge would be considered part of master, so they wouldn't show up in ^master.Ginetteginevra
G
-2

I believe I've found a way that deals with all the corner-cases mentioned here:

branch=branch_A
merge=$(git rev-list --min-parents=2 --grep="Merge.*$branch" --all | tail -1)
git merge-base $merge^1 $merge^2

Charles Bailey is quite right that solutions based on the order of ancestors have only limited value; at the end of the day you need some sort of record of "this commit came from branch X". But such a record already exists: by default, 'git merge' uses a commit message such as "Merge branch 'branch_A' into master", which tells you that all the commits from the second parent (commit^2) came from 'branch_A' and were merged into the first parent (commit^1), which is 'master'.

Armed with this information you can find the first merge of 'branch_A' (which is when 'branch_A' really came into existence), and find the merge-base, which would be the branch point :)

I've tried with the repositories of Mark Booth and Charles Bailey and the solution works; how couldn't it? The only way this wouldn't work is if you have manually changed the default commit message for merges so that the branch information is truly lost.

For usefulness:

[alias]
    branch-point = !sh -c 'merge=$(git rev-list --min-parents=2 --grep="Merge.*$1" --all | tail -1) && git merge-base $merge^1 $merge^2'

Then you can do 'git branch-point branch_A'.

Enjoy ;)

Ginetteginevra answered 27/5, 2012 at 11:50 Comment(3)
Relying on the merge messages is more fragile than hypothesising about parent order. It's not just a hypothetical situation either; I frequently use git merge -m to say what I've merged in rather than the name of a potentially ephemeral branch (e.g. "merge mainline changes into feature x y z refactor"). Suppose I'd been less helpful with my -m in my example? The problem is simply not soluble in its full generality because I can make the same history with one or two temporary branches and there is no way to tell the difference.Almucantar
@CharlesBailey That is your problem then. You shouldn't remove those lines from the commit message, you should add the rest of the message below the original one. Nowadays 'git merge' automatically opens an editor for you to add whatever you want, and for old versions of git you can do 'git merge --edit'. Either way, you can use a commit hook to add a "Committed on branch 'foo'" to each and every commit if that's what you really want. However, this solution works for most people.Ginetteginevra
Did not work for me. branch_A was forked from a master that already had a lot of merges. This logic did not give me the exact commit hash where branch_A was created.Haunt
