Merging branch back with clean history

I am working on a branch (with others) off of master.

A - B - C - F - G    (master)
         \
          D - E      (branch-a)

Periodically, we merge master into the branch, to minimize conflicts later.

A - B - C - F - G    (master)
         \       \
          D - E - H  (branch-a)

Eventually, we will want to merge back.

A - B - C - F - G - I - K - M - N  (master)
         \       \       \     /
          D - E - H - J - L - O    (branch-a)

What is the cleanest way I can merge back into master?

  1. I want to preserve the individual commits (i.e. no squash)
  2. I will not use branch-a any longer, so commit hashes can change.
  3. I would like to not include merge commits (e.g. H, L) for merges without conflicts, if possible.

Ideally, it would look like this (assuming there were no conflicts):

A - B - C - F - G - I - K - M - N  (master)
         \                     /
          D  -  E   -   J  -  O

Any ideas on how this should be done?

(FYI, this is a question about my workflow. If I should have done something else earlier, that is a legitimate answer too.)


UPDATE:

After thinking about this some more, I realized this is often not possible.

For example, if J changed a line that G changed, there would not be a way to get that history.

The next best choice would be to have this history:

A - B - C - F - G - I - K - M - D - E - J - O  (master)

Essentially, this is a rebase, but omitting unnecessary merge commits.

Shippee answered 10/3, 2014 at 15:42 Comment(9)
Hm.. in the last graph, shouldn't you remove H and L completely, as they were the merge commits? - Audio
@quetzalcoatl, you are correct. - Shippee
Anyway, regarding #3: the merge commits are what allow both you and Git to reduce the amount of history that has to be scanned. They may not seem very important, especially when there were no conflicts, but they mark the "timepoints" where the branches were synced. When you remove them as in the last example (without H and L), note how the graph gets "squashed" to the left. It is no longer visible that commit "O" came between K and M. - Audio
Sure, it may be possible to deduce that information from timestamps, but given clock desyncs between machines, and the fact that someone may work on that branch remotely and may not have everything committed, it gets quite complex. Of course, nowadays it's quite hard to get a relevant clock desync bigger than a few seconds :)) - Audio
@quetzalcoatl, good point, though I am willing to lose that information about the relative order of commits. - Shippee
Anyway.. note that non-conflicting merges are not without effect. The mere fact that you have merge H could have made merge L automatic and conflict-free, too. That's because the 'H' state has been marked as verified, and some lines were, um, merged. If you erase that information, and the lines were changed again between the now-missing H and the incoming L, then I'd guess there will be a much higher chance of a conflict at L. If you now multiply that by N not-very-important non-conflicting merges, it will scale to a sure conflict. - Audio
Please take care: I'm not posting this as an answer. I'm not a highly experienced Git user and I may be talking nonsense, but that's how I view the quick, small merges that are usually recommended. Maybe Git's history/diff algorithms are far superior, and maybe removing those commits doesn't hurt. Surely, when they are "disjoint" in the sense of files/lines changed, it won't hurt. But I just think it could hurt in the general sense, when there are more of them. So, why risk it? - Audio
Yeah, your edit about G and J perfectly summarizes what I was trying to say :) - Audio
H & L are very important. Resolving merges is part of your changes, and even if the merges were trivial, the empty set of changes carries important metadata. They need to be included. Also, you want to merge M down before you merge up, so that the merge resolution is part of your branch and not part of the main branch; that makes it easier to backport your changes to a parallel branch later. - Cahoon

You can do this by hand by branching off E and cherry-picking the other commits into the new branch, then merging that into master. Rebasing is also possible, i.e. do

git rebase C

on branch-a. However, that will take commits from master and put them on your feature branch.

git rebase -i C

has the same effect, but lets you skip the commits that you know came from master and are not required on branch-a. Git cannot in general know whether commits interact in any way (e.g. a change to one file might require a change to a different file that was made on master), so there's no fail-safe, fully automatic solution to this problem.
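
For illustration, here is a rough sketch of the cherry-pick route in a throwaway repository. It rebuilds the graph from the question with one file per commit, so nothing can conflict, which is the easy case. The helper function, the file names, the tags and the branch name "clean" are all assumptions made for the demo and not part of the original setup.

```shell
#!/bin/sh
# Sketch only: rebuilds the question's graph in a scratch repo, then applies
# the cherry-pick approach. Tags, file names and "clean" are demo assumptions.
set -e
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # force the branch name used in the question

# Hypothetical helper: one file per commit (so nothing conflicts), tagged by name.
commit() {
    echo "$1" > "$1.txt"
    git add "$1.txt"
    git commit -q -m "$1"
    git tag "$1"
}

commit A; commit B; commit C
git checkout -q -b branch-a;  commit D; commit E
git checkout -q master;       commit F; commit G
git checkout -q branch-a;     git merge -q -m H master
commit J
git checkout -q master;       commit I; commit K
git checkout -q branch-a;     git merge -q -m L master
commit O
git checkout -q master;       commit M

# The approach itself: branch off E (which already contains D),
# replay only the branch's own later commits, then merge the result.
git checkout -q -b clean E
git cherry-pick J O >/dev/null
git checkout -q master
git merge -q --no-ff -m N clean

git log --graph --format=%s master
merges=$(git rev-list --merges --count master)   # merge commits reachable from master
replayed=$(git rev-list --count C..clean)        # commits on the cleaned-up branch
```

In this conflict-free setup the only merge commit left on master is N, and the side branch holds just D, E and copies of J and O. With real changes, the cherry-picks can stop on conflicts that the merges H and L had already resolved. If the linear history from the question's UPDATE is acceptable, running git rebase master on branch-a should give a similar result, since a plain rebase drops merge commits and skips what master already contains.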

Ungainly answered 10/3, 2014 at 16:9 Comment(0)

Instead of merging master into your working branch, you could run

git pull --rebase

once your commits are done on your working branch and you want to pull the latest code from master into it. You could also make this the default behaviour for certain branches with

git config branch.master.rebase true

This ensures that your commits are always reapplied on top, so your history stays linear. That is, your commits will be at the top when you eventually raise a pull request, and the history will be clean, with no unnecessary merge commits polluting it. Another advantage is that it is easier to cherry-pick if you want to keep two or more branches in sync.
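
As a minimal sketch of what this looks like in practice, the script below sets up a throwaway bare repository and two clones. The names alice and bob, the helper function and the file names are assumptions made for the demo.

```shell
#!/bin/sh
# Sketch only: a scratch "origin" plus two clones, to show that
# pull --rebase leaves no merge commit. All names are demo assumptions.
set -e
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)
cd "$work"
git init -q --bare origin.git
git --git-dir=origin.git symbolic-ref HEAD refs/heads/master

# Hypothetical helper: commit a new file named after the message.
commit() {
    echo "$1" > "$1.txt"
    git add "$1.txt"
    git commit -q -m "$1"
}

# alice publishes the first commit on master.
git clone -q origin.git alice 2>/dev/null
cd alice
git symbolic-ref HEAD refs/heads/master
commit base
git push -q -u origin master
cd ..

# bob clones, then alice pushes another commit in the meantime.
git clone -q origin.git bob
cd alice
commit upstream
git push -q origin master
cd ../bob

# bob has local work and the remote has moved on: the histories diverge.
commit local
git pull -q --rebase      # or: git config branch.master.rebase true, then a plain git pull

git log --graph --format=%s
merges=$(git rev-list --merges --count HEAD)   # merge commits in bob's history
top=$(git log -1 --format=%s)                  # subject of the newest commit
```

Here bob's local commit ends up replayed on top of alice's, with no merge commit in between. Note that the replayed commit gets a new hash, which is what makes this awkward on a branch that others have already pulled.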

Procrustes answered 10/3, 2014 at 16:26 Comment(2)
I cannot do that during development since I am sharing this branch with others during that time. Rebasing would change the commit hashes, causing problems when others tried to pull on that branch.Shippee
Oh I see. Well, I am going to let the answer remain here with the hope that someone else searching for a better workflow finds it useful. Do you want me to remove it?Procrustes
