Git workflow to audit commits without rewriting history
Asked Answered
B

7

7

I am working with a developer that is new to git, and I would like to setup a git workflow that would let me auditing the commits made by this developer (and possibly reject them) without forcing him to rebase his work (which is error prone for new users).

Here is the scenario :

  • the master branch contains only audited code
  • devel branch is created by forking master branch
  • a developer is working on the devel branch and push his code for me to audit
  • I detect some issues with his code, so I ask him to make more commits on devel branch to fix the issue
  • Once I'm happy with the code, I cherry-pick (or merge with squash) the developers commits on the master branch, add my own comments and mark the commit as authored by the developer
  • The developer merge back the master branch into his devel branch

Here is a visual illustration of this scenario :

Visual scenario

After this sequence, how can I be 100% sure that mistakes made by the developer during the "make mistakes / redo" sequence will not show up again when the developer will push his devel branch again after some other commits ?

The best solution would be to ask the developer to rebase his devel branch against the master one, but this is a hard task to new git users.

If this workflow is wrong in any way, please suggest me another where I would be able to audit the commits before merging them into the master branch.

EDIT

A friend suggested me a direction that looks promising : the --fixup option when doing commit.

--fixup=<commit>

Construct a commit message for use with rebase --autosquash. The commit message will be the subject line from the specified commit with a prefix of "fixup! ". See git-rebase for details.

This option could be used by the developer to properly mark his second and third commits as a fix for his first one. A good solution to explore...

Beanie answered 8/4, 2014 at 15:53 Comment(1)
Side note, it should be unnecessary to "mark the commit as authored by the developer", since commits already have an "author date" and a "commit date". You should be marked as the "committer", not the author, automatically.Guardrail
A
8

Remark 1: By trying to turn the separated commits on devel into a single commit on master, you are rewriting history. I see the point in trying to keep a clean, linear history on master, however a less linear history with merges would be more easily integrated with git basic commands.

Remark 2: If your workflow explicitly includes members which are not familiar with the basic tools you use to share code, you should live with the consequences. At some point, I think the developer should learn git.

Here are two suggestions :

  1. Drop the linear history on master, and "validate" the changes by running, from master :

    git merge --no-ff devel
    

    This is the most straightforward way to inform git that the modifications on devel are taken into account on master.

  2. Do the merge with devel yourself, and have your developer pull the modifications from origin/devel.
    If the single commit on master really only consists in squashing the commits on devel, this shouldn't trigger any conflicts -- the merge commit master->devel should introduce 0 modifications.
    If you have actually modified the squashed commit, and the developer can have local modifications of his own, you have to face the possibility of triggering conflicts on his side anyway ...

Antiphlogistic answered 15/4, 2014 at 14:42 Comment(1)
Thanks for your remarks and suggestions LeGEC. I definitely will merge the devel branch myself (as suggested in 2), but in master, not in devel. Then, the developer would just have to merge master into devel, which should go without conflict because I would have already resolved it. I will use your suggestion to just do a straight merge --no-ff if I could not find a better way to make commits nicer.Beanie
C
2

If you're going to just squash the dev branch instead of merging it normally, it's not a good idea to then merge that squashed commit back into the dev branch, because then you'll get a ton of confusing conflicts, because git thinks you're applying new code, when in fact it's actually the same code.

It would be better to either just do a simple merge into master, then a simple merge back into devel, or otherwise throw-away the devel branch and make a new one off of the most recent commit of master. As long as the code that you merge into master is "good", according to you, then you probably won't get any regressions.

On the other hand, if you go with your original plan of merging in squashed commits back into devel, that will increase your chances of a regression occurring, because of the confusing conflicts.

Update

I deleted parts of my answer above because I actually just tested doing the squash commit on master and then immediately merging it back into devel, and it didn't cause any conflicts, so that part of my answer was incorrect.

I turned this into answer into a community wiki because it still contains a lot of information about the problem in the comments.

Cotillion answered 8/4, 2014 at 15:53 Comment(7)
OK, thanks for the explanation. +1 because this sounds a good reason for not using the workflow I described in my question. But then, which one to use ?Beanie
It's up to you. Use whichever method is easiest for you and your developer? If you do the squash merge, your other developer has to know how to delete or hard reset his local devel branch to the updated master branch. Wait...why couldn't you just rebase devel for your other developer and have him just hard reset his local branch to that? That's effectively the same as doing the squash merge.Guardrail
I cannot do that because I am not aware of the possible unpushed changes that the developer has in his local repo. A hard reset will make the developer loose those changes unless he knows how to put them on another branch ... which gets very complicated.Beanie
@FabienQuatravaux please clarify in your question how you have your repos and branches set up. A typical workflow for many people is to have each developer have their own remote fork, in addition to a local repo. When they're done with a branch, they just pushes it to their remote fork for review. If you don't have unique remote forks available per developer, you could just have your developer push their work to their own branch on your shared remote, waiting for review. You could even namespace the branch, like call it john/devel on the remote and local repos.Guardrail
Let's make it simple : I have only one developer, so no namespace is needed, and one remote repo. The developer pushes the changes on the devel branch on the remote repo for review. My issue is : after the review (that perhaps needed several other commits) how to integrate the changes into the master branch without getting all the mistakes made in between.Beanie
Hi @FabienQuatravaux, I'm not sure that I can provide enough expertise to help you solve your problem. I think your question is going to become available for bounty soon, so you should consider putting a bounty on it so that it receives more attention.Guardrail
Sorry @FabienQuatravaux, haven't had time to take a close look at your edit yet. However, I did test out your original plan of doing a squash commit, and it actually didn't produce any conflicts, so that part of my original answer was wrong.Guardrail
E
2

Just use gerrit. With gerrit you will have the history of your comments, commits etc. You can also abandon change. The workflow I use as a developer:

Prepare

Copy gerrit commit-message hook to your repo/.git/hooks. Make sure it's executable and doesn't contains crlf.

Make remote review for pushing, or better shell alias if you work on *nix.

Make alias for rebasing in ~/.gitconfig (about fork-point)

[alias]
        reb = !"git rebase -i $(git merge-base --fork-point origin/master HEAD)"

Workflow

while unapproved do
    git commit -a --amend
    git push review
    # reviewers comment, reject, approve etc.
    # If rejected, I change commit as required, then
end

Squashing

When pushing to gerrit there should (probably) be only one commit.

So to squash commits on your branch into one:

git reb

Editor will open. Change pick to squash for every line except first (oldest commit). Save and quit. Editor will open with commit messages. Delete everything except oldest commit message and the line with Commit-ID (it's needed for gerrit).

Rebasing

If merging your branch fails in gerrit, you can rebase it onto origin/master

git rebase origin/master

If there are many conflicts, you can avoid pain of resolving many of them just by using cherry-pick

git checkout -b topic2 origin/master
git cherry-pick topic
git branch -D topic
git branch -m topic

git review

You can also use git review to simplify some operations when working with gerrit.

Electuary answered 15/4, 2014 at 14:54 Comment(5)
Interesting, but far too complex for the workflow I want to put in place. As I said to Dennis Poort, the steps you are describing here will force the developer to be very careful about where he is in his local repo before commiting. And if there are two subjects ongoing at the same time, he will have to use one branch per subject. For a git beginner, this is too hard. The workflow I need on the developer side is 1) develop, 2) test, 3) submit for review, then go back to 1).Beanie
I think one branch per subject is a must. I can't imagine working without it now. You and your developer will have much greater view on what's going on if you use gerrit and branches.Electuary
Also you can add branch name to prompt, so that developer will know what branch he is at #15883916Electuary
I agree this is great ... but this is not for beginners. Constantly switching branch depending on the topic your are working on can be very difficult for somebody that was used to commit in svn. And when a topic needs another one to work but this other one still needs fixes ... things begin to be horrible !Beanie
Keep topic small. Modularize your code so you can work independently on topics. Rebase your topic branch if there is new code on master and you need it. Resolve conflicts if any. git add resolved_conflict_file; git rebase --continueElectuary
O
0

If you make sure your other developer always creates changes from a certain branch, say develop, you can have main branch setup that always exist for example:

master, develop

If your developer only creates feature branches from the develop branch he can create commits all he wants. Then when you think it is ready you can cherry-pick/rebase and do your thing with it. The feature branch gets deleted making sure rewritten history is gone.

All your developer needs to remember would be to create a branch from the remote develop branch everytime he wants to start a new feature and not touch his local feature branch when you say you're going to merge some stuff into your develop.

Then when it is ready he deletes his myfeature branch locally and you delete it remote when you've merged/rebased/cherry-picked his good changes into your develop.

Only the good commits go into develop branch. Your developer creates new branch from develop like before.

I think some basic branching is the least your developer should know. If he cannot rebase/clean his branch/commits I think you have to agree to follow a certain workflow.

Maybe not perfect, but perhaps a possible workflow nonetheless. Hope it helps.

Oiler answered 11/4, 2014 at 7:48 Comment(2)
I noticed that the main issue encountered by new git users was to make the difference between the HEAD, the index and the working tree. The workflow for those users is 1) make changes to the code 2) test the changes 3) "save" to git. Your proposal would force the developer to think a lot about where he is in his local repo before beginning to work efficiently. I need a workflow that only apply to the third stage : save changes to git.Beanie
In my humble opinion @FabienQuatravaux I question the skill of a developer that would not be able to follow a few basic steps. No offence or bash intended. If you do plan on a long partnership, you should also think about your own extra efforts that might tire you down in the long run. Curious if your --fixup workflow will be acceptable though. Good luck!Oiler
E
0

The easiest way with the least amount of fuss would just be to merge in the changes from the devel branch into master. You just need to regularly update the branch with master so that your branches don't diverge too wildly (depending on what features other people are working on).

If you want to have one single commit for the modification on master. You can rebase the developers commits into one single commit and then cherry-pick that commit onto master. This would lead to complications as you would only be able to do git push -f to update the remote devel branch and if your developer is not familiar with git would be complicated for updating their devel branch. Or you would require that they delete that branch and then use a new branch for any future work.

You will need to decide which option to go on based on whether or not you want to have all the commits from your developer showing up in the log. The easiest and least troublesome way is just to simply merge devel and master.

Elmerelmina answered 15/4, 2014 at 15:26 Comment(1)
I can have one single commit for the modification on master without rebasing the devel branch : I can cherry-pick -n and then commit --amend into master. As you said, git push devel -f is prohibited because the developer will not understand how to put his local changes back into the rewritten devel branch.Beanie
B
0

I will test the following workflow and let you know if things are going well.

Prerequisites

  • developer must be able to commit from the command line
  • developer should always stay on the same branch, so that he does not need to worry about which branch is currently checked out
  • developer should never have to solve conflicts that impact code he does not have written
  • developer will never be in a situation where he could lost local modifications to its working tree

Tools

git commit with the --fixup option :

--fixup=<commit>

Construct a commit message for use with rebase --autosquash. The commit message will be the subject line from the specified commit with a prefix of "fixup! ".

git rebase with --autosquash, --autostash and --interactive options

--interactive

Make a list of the commits which are about to be rebased. Let the user edit that list before rebasing. This mode can also be used to split commits.

--autosquash

When the commit log message begins with "squash! ..." (or "fixup! ..."), and there is a commit whose title begins with the same ..., automatically modify the todo list of rebase -i so that the commit marked for squashing comes right after the commit to be modified, and change the action of the moved commit from pick to squash (or fixup). Ignores subsequent "fixup! " or "squash! " after the first, in case you referred to an earlier fixup/squash with git commit --fixup/--squash.

This option is only valid when the --interactive option is used.

If the --autosquash option is enabled by default using the configuration variable rebase.autosquash, this option can be used to override and disable this setting.

--autostash

Automatically create a temporary stash before the operation begins, and apply it after the operation ends. This means that you can run rebase on a dirty worktree. However, use with care: the final stash application after a successful rebase might result in non-trivial conflicts.

Workflow

  1. The developer fork the master branch into a devel branch. The fork point is marked with a tag (for exemple version_1.1)
  2. The developer work and commit on the devel branch. When the work is done, he pushes his branch on the remote repository so that the auditor can see and validate it.
  3. The auditor fetch the devel branch and look at the code. If something is not correct, he asks the developer to make some modifications.
  4. For each modification done by the developer, the following command should be used to commit the correction : git commit --fixup <commit that was not OK>. Than the branch is pushed again.
  5. Once the auditor is happy with the modifications done, the developer must rebase his branch in order to squash all the fixup commits into one beautiful and error-free commit (other unrelated work could have been commited meanwhile).

GIT_EDITOR=: git rebase tags/version_1.1 --interactive --autosquash --autostash.

The GIT_EDITOR=: is here to avoid to display the editor that present the commit lists due to the --interactive option. This option is needed to make the fixup commit reorder automatically.

The last step is to push the devel branch again (with --force option) so that the auditor can merge it in the master branch. The rebase base for next devel fixup could be the master branch or a new tag that the developer put on his devel branch after merging back the master branch.

Here is the resulting git tree :

gitk illustration of the fixup workflow

Beanie answered 17/4, 2014 at 15:52 Comment(6)
I see this workflow as deeply flawed for what you're trying to accomplish. Your stated goal is to audit commits to make sure that they're "good" before they're merged into master. However, after auditing the commits, you ask the developer to rebase them again before they're merged. Two things that are problematic with that. 1st, you don't verify that the rebase is correct before merging. You should always merge immediately after verifying correctness.Guardrail
2nd, you shift responsibility for rebasing to the developer. If you don't trust the developer to manage branches correctly, why would you trust him to do a rebase correctly, even if you automate most of the steps? If the developer doesn't really understand how to use rebase, there will always be a chance that there will be mistakes made, no matter how much you try to minimize it. You're better off executing the rebase yourself in that case.Guardrail
I think you did not understand the process I propose. The rebase is not done against the master branch, the rebase is done locally against the developer own branch. With this technique, there is no conflict and no error can be made : it's just a matter of reorganizing and squashing commits. A proof of that is : if you try to use this rebase command without the --interactive option, nothing will happen. Of course, once the rebase is done by the developer, he pushes the devel branch in order for me to audit the final changes. Then, if I am happy, I merge the devel branch myself.Beanie
I understood that the developer was merely squashing commits on the devel branch, there was no misunderstanding of your proposed process. I definitely recommend that the final rebased results be verified again before merging though. It should be a simple git diff to make sure that there are no unexpected changes.Guardrail
@FabienQuatravaux, "developer should never have to solve conflicts that impact code he does not have written" It seems like wishful thinking. Conflicts occure because git can't easily merge your code, because you modfied the same portion of code that someone else. Git uses some algorithm how to merge. As I said before: use smaller topics, modular design.Electuary
@FabienQuatravaux, you want your developer to always use devel branch. What if you want him to do some hotfix? I agree that it's nice to concentrate on one task, but this won't protect you from merge/rebase conflicts. Also you focused to much on this fixup thing. What if developer forgets to use this? With my method he just run git reb and see what will be rebased. He can delete unneeded commits or even reedit them.Electuary
G
0

Well, if you have one beginner developer and you are the only reviewer, i suggest this workflow:

  1. the developer create a local branch from master to implement some feature (branch name: i-123)
  2. the developer add some code to branch i-123
  3. the developer create merge request to merge i-123 on master branch
  4. the reviewer find some mistake and notifies the problem to developer
  5. the developer add commits to solve problem and notifies to reviewer to check change of i-123 branch
  6. the reviewer verifies the source code and accept merge request (merge i-123 on master branch as result i-123 was destroyed)
  7. the developer sync local repository (as result: master branch contain change relative to i-123 and i-123 was destroyed)
  8. the developer can continue with step 1

This approach allow that one developer work on one or various features at same time, and eliminate complicate logic about mantein updated develop branch. It's easy approach to the beginner developer.

Goble answered 17/4, 2014 at 17:59 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.