I'm not a developer. On one of our projects, since a lot of tickets take time to complete, we have been cherry-picking our commits, and now we have to do it very often. I was told by a developer that cherry-picking should be avoided as it makes the repo unstable. What does that mean? How does it make the repo unstable? In other words, what are the negative consequences of cherry-picking?
A typical use case for Git cherry-picking is bringing a commit from one branch into another branch, because the usual means of doing this, e.g. via a merge or rebase, are not available. The usual merge option might not be available because a feature branch is not yet ready to be merged to its source. However, a certain commit might be needed in the source branch immediately, e.g. for a bug or other hot fix, and so cherry-picking is one means to do this.
Cherry-picking does not really make the repository unstable, but what it does do is run the risk of duplicating commits all over the place. Going back to the example of a feature branch having a commit which needs to go back to the source immediately, if we later do merge this feature branch, then we might be bringing in the same commit which was cherry-picked. As a result, the source branch could end up with multiple commits, each of which was functionally supposed to do the same thing. This doesn't really make the repo unstable, but it does leave the history hard to read. In the future, readers of the history may find it difficult to figure out what the contributors were doing such that duplicated commits arose.
At the root of this problem is that Git cherry-pick actually creates a new commit, with a new SHA-1 hash, each time it is used. So, cherry-picking a single commit from a feature branch into the source actually leaves the repository with two commits, functionally identical, but with completely different SHA-1 hashes.
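To see this for yourself, here is a minimal sketch that builds a throwaway repository and cherry-picks one commit. The file names, branch names, and commit messages are made up for illustration. It prints the two hashes, compares the patch IDs that Git computes from the diffs, and asks `git cherry` which feature commits already have an equivalent on master.

```shell
#!/bin/sh
# Throwaway demo; every file, branch, and message name here is hypothetical.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # name the first branch "master"
git config user.name "Demo"
git config user.email "demo@example.com"

echo base > app.txt
git add app.txt
git commit -q -m "A: initial commit"

# Feature branch: one unrelated commit, then the bug fix.
git checkout -q -b feature
echo work > feature.txt
git add feature.txt
git commit -q -m "1: start feature"
echo fixed > fix.txt
git add fix.txt
git commit -q -m "bugfix"
original=$(git rev-parse HEAD)

# Copy only the bug fix onto master.
git checkout -q master
git cherry-pick "$original" >/dev/null
copy=$(git rev-parse HEAD)

echo "original commit: $original"
echo "copied commit:   $copy"

# The patch ID is derived from the diff, not from the commit itself.
original_patch=$(git show "$original" | git patch-id | cut -d' ' -f1)
copy_patch=$(git show "$copy" | git patch-id | cut -d' ' -f1)
echo "original patch-id: $original_patch"
echo "copied patch-id:   $copy_patch"

# git cherry prefixes commits that already have an equivalent upstream with "-".
git cherry -v master feature
already_upstream=$(git cherry master feature | grep -c '^-')
```

The two commit hashes differ while the patch IDs match, which is exactly the "same change, two commits" situation described above.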
To be clear, cherry-picking won't harm your repository. Git is fine with cherry-picking. Cherry-picking might make your code unstable.
A cherry-pick is basically copying a commit to another branch. Used carefully this is a very useful tool. Used sloppily and you're copying untested code around. If you find yourself having to use cherry-pick a lot there's probably something sub-optimal about your process.
A typical example is when you have a large feature branch which also fixed a bug. That feature is taking a long time to finish, but you need that bug fix now. (The deeper question is why is that feature branch taking so long? Is it too big? Can it be chopped up into a series of smaller features?)
Your repository looks like this.
A - B - C - D - E [master]
 \
  1 - 2 - bugfix - 3 - 4 - 5 [feature]
What happens next depends on your workflow. You could cherry pick it straight onto master.
git cherry-pick bugfix
A - B - C - D - E - bugfix' [master]
 \
  1 - 2 - bugfix - 3 - 4 - 5 [feature]
This has all the problems with committing untested code straight to master. It might depend on some other piece of feature. It might just not work. It might introduce more subtle bugs. It might be incomplete. This is probably what they're referring to by "making the code unstable".
Better is to follow a "feature branch" work flow. No direct commits to master are allowed. Everything must be done in a branch. Branches go through QA before being merged. This ensures master is always kept in a known good state and nobody is sharing untested, low quality code.
You'd open a new branch for the bug fix and cherry pick it in.
git checkout -b fix/bug master
git cherry-pick bugfix
                  bugfix' [fix/bug]
                 /
A - B - C - D - E [master]
 \
  1 - 2 - bugfix - 3 - 4 - 5 [feature]
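An optional refinement, not something the answer above relies on: `git cherry-pick -x` appends a "(cherry picked from commit ...)" line to the new commit's message, so anyone reading the history later can trace the copy back to the original. A minimal sketch in a throwaway repository, with hypothetical file and branch contents:

```shell
#!/bin/sh
# Throwaway demo; file names and commit messages are hypothetical.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo"
git config user.email "demo@example.com"

echo base > app.txt
git add app.txt
git commit -q -m "A"

git checkout -q -b feature
echo work > feature.txt
git add feature.txt
git commit -q -m "1"
echo fixed > fix.txt
git add fix.txt
git commit -q -m "bugfix"
bugfix=$(git rev-parse HEAD)

# Cut the fix branch from master and copy the fix, recording where it came from.
git checkout -q -b fix/bug master
git cherry-pick -x "$bugfix" >/dev/null

# Show the full message of the copied commit, including the added trailer line.
message=$(git log -1 --format=%B fix/bug)
echo "$message"
```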
Then fix/bug is run through the normal QA process. Any problems are fixed. When it passes QA it is merged into master. Let's say there was a problem, so there's another commit.
git checkout master
git merge fix/bug
git branch -d fix/bug
                  bugfix' - F
                 /           \
A - B - C - D - E ----------- G [master]
 \
  1 - 2 - bugfix - 3 - 4 - 5 [feature]
Now feature should update itself from master to make sure it has the complete bugfix. There might be conflicts between master's version of the bugfix and its own. Fix them as normal.
git checkout feature
git merge master
                  bugfix' ---- F
                 /              \
A - B - C - D - E -------------- * [master]
 \                                \
  1 - 2 - bugfix - 3 - 4 - 5 ----- * [feature]
Then once feature is complete it can be merged into master as normal. Git doesn't care that there are two versions of the bugfix in the history; any issues were already resolved in the update merge.
git checkout master
git merge feature
git branch -d feature
                  bugfix' ---- F
                 /              \
A - B - C - D - E -------------- * --------- * [master]
 \                                \         /
  1 - 2 - bugfix - 3 - 4 - 5 ----- * - 6 - 7
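The whole sequence above can be replayed in a scratch repository. This is a simplified sketch under a few assumptions: the follow-up commit F is left out, the history is shorter than in the diagrams, and `commit_file` is a hypothetical helper defined only for this demo. It ends by counting how many commits titled "bugfix" are reachable from master.

```shell
#!/bin/sh
# Throwaway demo; names are hypothetical and the history is shortened.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo"
git config user.email "demo@example.com"

# Hypothetical helper: write CONTENT to FILE and commit it with MESSAGE.
commit_file() {
    echo "$2" > "$1"
    git add "$1"
    git commit -q -m "$3"
}

commit_file app.txt base "A"
git checkout -q -b feature
commit_file feature.txt one "1"
commit_file fix.txt fixed "bugfix"
bugfix=$(git rev-parse HEAD)
commit_file feature.txt two "3"

# Cherry-pick the fix onto its own branch cut from master.
git checkout -q -b fix/bug master
git cherry-pick "$bugfix" >/dev/null

# Once it passes QA, merge the fix branch into master and delete it.
git checkout -q master
git merge -q --no-ff -m "Merge fix/bug" fix/bug
git branch -q -d fix/bug

# Update feature from master, then merge the finished feature.
git checkout -q feature
git merge -q -m "Merge master into feature" master
git checkout -q master
git merge -q --no-ff -m "Merge feature" feature

git log --graph --oneline master
bugfix_commits=$(git log --format=%s master | grep -c '^bugfix$')
echo "commits titled bugfix on master: $bugfix_commits"
```

Both merges go through cleanly here because the two copies of the fix make the identical change; the duplicate only shows up as a second "bugfix" entry in the log.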
Side note: if instead of merging you're using rebase to update your branches, my preference, Git might even remove the bugfix commit entirely if it thinks it's redundant.
git checkout feature
git rebase master
                  bugfix' - F
                 /           \
A - B - C - D - E ----------- * [master]
                               \
                                1 - 2 - 3 - 4 - 5 [feature]
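Here is a small sketch of that rebase behaviour, again in a throwaway repository with hypothetical names. With default settings, `git rebase` skips a commit whose change is already present upstream, so after the rebase the feature branch should carry only its own two commits on top of master.

```shell
#!/bin/sh
# Throwaway demo; names are hypothetical and the history is shortened.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo"
git config user.email "demo@example.com"

# Hypothetical helper: write CONTENT to FILE and commit it with MESSAGE.
commit_file() {
    echo "$2" > "$1"
    git add "$1"
    git commit -q -m "$3"
}

commit_file app.txt base "A"
git checkout -q -b feature
commit_file feature.txt one "1"
commit_file fix.txt fixed "bugfix"
bugfix=$(git rev-parse HEAD)
commit_file feature.txt two "3"

# Get the fix onto master through its own branch.
git checkout -q -b fix/bug master
git cherry-pick "$bugfix" >/dev/null
git checkout -q master
git merge -q --no-ff -m "Merge fix/bug" fix/bug
git branch -q -d fix/bug

# Rebase feature onto master; the already-applied bugfix is skipped.
git checkout -q feature
git rebase -q master

git log --oneline master..feature
feature_only=$(git rev-list --count master..feature)
leftover_bugfix=$(git log --format=%s master..feature | grep -c '^bugfix$' || true)
echo "commits only on feature: $feature_only"
```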
"and you're copying untested code around" ... why would cherry-picking have anything to do with whether or not the code is tested?