What is a dangling commit and a blob in a Git repository and where do they come from?
L

5

243

I'm looking for the basic information on dangling commits and blobs.

My repository seems fine. But I ran git fsck for the first time to see what it did and I have a long list of 'dangling blobs' and a single 'dangling commit'.

What are these things? Where did they come from? Do they indicate anything unusual (good or bad) about the state of my repository?

Learn answered 29/8, 2013 at 15:8 Comment(0)
A
148

While working with your Git repository, you may back out of operations or make other moves that leave intermediary blobs behind. Git also creates some of these objects itself to help you avoid losing information.

Eventually (conditionally, according to the git gc man page) it will perform garbage collection and clean these things up. You can also force it by invoking the garbage collection process, git gc.

For more information about this, see Maintenance and Data Recovery on the git-scm site.

By default, a manual run of git gc keeps unreachable objects created in the two weeks before the command runs, as a safety net. Running the GC occasionally is in fact encouraged to help keep your Git repository performant. As with anything, though, you should understand what it does before destroying things that may be important to you.
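
To make the safety net concrete, here is a minimal sketch run in a throwaway repository. The file name, contents, and identity settings are made up for illustration, and it assumes the default gc.pruneExpire of two weeks has not been changed in your configuration.

```shell
set -eu
repo=$(mktemp -d)          # throwaway repository, not your real one
cd "$repo"
git init -q
git config user.email "you@example.com"
git config user.name "Example User"

# Stage one version, then stage and commit a different one.
# The blob for the first version is left unreachable.
echo "v1" > file.txt
git add file.txt
echo "v2" > file.txt
git add file.txt
git commit -q -m "commit v2"
old_blob=$(echo "v1" | git hash-object --stdin)   # id of the orphaned blob

# A plain gc keeps the young unreachable blob.
git gc --quiet
if git cat-file -e "$old_blob" 2>/dev/null; then after_default_gc=kept; else after_default_gc=pruned; fi

# Pruning with no grace period removes it.
git gc --quiet --prune=now
if git cat-file -e "$old_blob" 2>/dev/null; then after_prune_now=kept; else after_prune_now=pruned; fi

echo "default gc: $after_default_gc, --prune=now: $after_prune_now"
```

The grace period exists so that objects belonging to an operation still in progress are not deleted out from under it, which is one reason to prefer the default over --prune=now.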

Alexis answered 29/8, 2013 at 15:29 Comment(7)
So it is fair to say that 1) unless I think there is something wrong with my repo it's safe to remove these with git gc, and 2) I don't need to worry about this at all because these dangling bits are normal and git already handles them?Learn
That would be a fair assessment.Alexis
Also, any time you 'git add' a file, but don't commit that exact version of the file, you end up with a dangling blob. Nothing to be worried about.Continually
doub1ejack - Generally speaking you shouldn't be running garbage collection manually. It is a bad habit to get into and git does garbage collection when needed anyway. The disadvantage to running it manually is that you lose the ability to recover dangling blobs and commits that you may not want now but might want in the future. Once you run garbage collection you take away some pretty powerful revert functionality from git. Use with caution and as the exception, not the rule. --- Just let git do its thing.Ormiston
Sorry to say, but I have 2 dangling blobs, that are very persistent, and I don't know what they are - and which 'git gc', even 'git gc --aggressive' didn't remove. Obviously the answer probably covers SOME scenarios for the creation of dangling blobsDennie
@MottiShneor I was hoping that "even some things that git does for you to help avoid loss of information" would cover the rest of the blob creation. But you may be right, that may still cover only some of the ways to create these dangling blobs.Alexis
FYI, to run garbage collection INCLUDING anything from the past 2 weeks, run git gc --prune="0 days"Ploce
O
143

Dangling blob = A change that made it to the staging area/index, but never got committed. One thing that is amazing with Git is that once a change gets added to the staging area, you can get it back (until garbage collection prunes it), because these blobs behave like commits in that they have a hash too!

Dangling commit = A commit that isn't directly linked to by any child commit, branch, tag or other reference. You can get these back too!
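
As a rough illustration of the dangling blob case, the sketch below stages a file twice in a throwaway repository and then reads the orphaned version back by its hash. The file name, contents, and identity settings are hypothetical.

```shell
set -eu
repo=$(mktemp -d)          # throwaway repository
cd "$repo"
git init -q
git config user.email "you@example.com"
git config user.name "Example User"

echo "first draft" > notes.txt
git add notes.txt          # writes a blob for "first draft"
echo "second draft" > notes.txt
git add notes.txt          # index now points at a new blob; the old one dangles
git commit -q -m "Add notes"

# git fsck prints lines such as "dangling blob <id>"; take the id from the first one.
dangling=$(git fsck --no-progress | awk '/^dangling blob/ {print $3; exit}')

# The staged-but-never-committed content is still readable.
recovered=$(git cat-file -p "$dangling")
echo "$recovered"
```

In a real repository with many dangling blobs, git cat-file -p or git show on each reported id is a simple way to find the one you are after.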

Ormiston answered 6/3, 2014 at 13:44 Comment(4)
However, the correct answer would be to remove "or by any of its *scendants" completely. A commit is dangling (git-scm.com/docs/gitglossary/…) only if it has no direct references to it at all, so it cannot have descendants. If you add "or by any of its descendants" you're defining unreachable (git-scm.com/docs/gitglossary/#def_unreachable_object).Compensatory
Strictly speaking, if I've got a detached HEAD, then create a commit, is that new commit already considered dangling or only as soon as I move HEAD to another commit/branch?Victoria
How is a "Stash" considered? a commit? a blob?Dennie
@MottiShneor A stash is not dangling. It has a name (the stash pointer), and it is a set of commits, just like any other commit.Fulgurous
A
82

HOWTO remove all dangling commits from your Git repository from https://web.archive.org/web/20210116144915/https://tekkie.ro/news/howto-remove-all-dangling-commits-from-your-git-repository/:

git reflog expire --expire=now --all
git gc --prune=now

Make sure you really want to remove them, as you might decide you need them after all.
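
To see what these two commands do, here is a sketch in a throwaway repository; the branch name scratch, the file, and the identity settings are made up. It relies on the fact that git fsck treats reflog entries as reachable by default, so --no-reflogs is needed to list a commit that only the reflog still remembers.

```shell
set -eu
repo=$(mktemp -d)          # throwaway repository
cd "$repo"
git init -q
git config user.email "you@example.com"
git config user.name "Example User"

echo one > a.txt
git add a.txt
git commit -q -m "base"

# Make a commit on a side branch, then delete the branch without merging.
git checkout -q -b scratch
echo two > a.txt
git commit -q -am "throwaway work"
lost=$(git rev-parse HEAD)
git checkout -q -
git branch -q -D scratch

# HEAD's reflog still protects the commit, so ask fsck to ignore reflogs.
before=$(git fsck --no-progress --no-reflogs | grep -c '^dangling commit' || true)

# The two commands from the answer: drop reflog entries, then prune.
git reflog expire --expire=now --all
git gc --prune=now --quiet

if git cat-file -e "$lost" 2>/dev/null; then state=present; else state=gone; fi
echo "dangling commits before: $before, lost commit now: $state"
```

After this, the commit can no longer be recovered from the local repository.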

Attar answered 28/10, 2016 at 11:42 Comment(2)
In reality, most users should never need this and if they do it is probably for a programmatic use case. The disk space saved or speed increased by removing dangling commits isn't worth the effort in my opinion.Ormiston
This answers a different question.Ormiston
K
29

A dangling commit is a commit that is not associated with any reference, i.e., there is no way to reach it from a branch, tag, or HEAD.

For example, consider the diagram below. Suppose we delete the branch featureX without merging its changes, then commit D will become a dangling commit because there is no reference associated with it. Had it been merged into master, then HEAD and master references would have pointed to commit D and it would not be dangling anymore, even if we deleted featureX. Read the note after the diagram to understand this better.

Git automatically garbage collects, i.e., disposes of, dangling commits. We can use git reflog to recover a branch (of dangling commits) that was deleted without being merged. We can recover deleted commits only if they are still present in the local object store. Once they have been garbage collected, we can't recover them.

[Image: commit diagram showing the featureX, master and HEAD references]

NOTE that a branch name, i.e., a branch label, is actually a reference to the latest commit on a branch, or the tip of the branch. In the diagram above, featureX, master and HEAD are just references to specific commits. The featureX and master labels refer to the latest commits on their respective branches. HEAD generally refers to the tip of the currently checked out branch (master in this case). If you check out an older commit on your current branch, then HEAD will be in a detached state, i.e., it will point to the older commit instead of the latest one. Also note that HEAD is called a symbolic reference because it actually points to the current branch label, and any branch label always points to the tip of the branch. So, under normal circumstances, HEAD indirectly points to the latest commit.

As an aside, note that Git represents its commit graph/history as a directed acyclic graph. Each commit has a reference to its parent. Hence, the arrows in a commit diagram point from child commit to parent commit. We need a reference to the latest child commit in order to reach the older commits on a branch.
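
The recovery via git reflog described above can be sketched as follows, in a throwaway repository. The commit messages A and D and the file name are stand-ins chosen to mirror the example, not the contents of the original diagram.

```shell
set -eu
repo=$(mktemp -d)          # throwaway repository
cd "$repo"
git init -q
git config user.email "you@example.com"
git config user.name "Example User"

echo base > app.txt
git add app.txt
git commit -q -m "A"

# Create commit D on featureX, then delete the branch without merging it.
git checkout -q -b featureX
echo feature > app.txt
git commit -q -am "D"
git checkout -q -
git branch -q -D featureX  # D is now reachable only through HEAD's reflog

# Look up D in the reflog by its entry "commit: D" and re-create the branch there.
lost=$(git reflog --format='%H %gs' | awk '/commit: D$/ {print $1; exit}')
git branch featureX "$lost"

restored_subject=$(git log -1 --format=%s featureX)
restored_content=$(git show featureX:app.txt)
echo "$restored_subject: $restored_content"
```

Because the new branch label points at D again, every older commit on that line of history becomes reachable through D's parent links.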

PS - The above diagram and understanding was obtained from this free course. Even though the course is quite old, the knowledge is still relevant.

Kofu answered 21/1, 2020 at 23:43 Comment(0)
O
11

A dangling commit also arises if you 'amend' a commit. For example you do a lot of work, test it and commit all the files and then remember you forgot to update the README file. So you quickly change that, add it, then use "git commit --amend". This creates a new commit that is linked into the history of commits, and the original commit is left dangling.
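
A small sketch of the amend scenario, in a throwaway repository with made-up file names and identity settings. One caveat the sketch assumes: the pre-amend commit is still recorded in the reflog, so a plain git fsck does not list it, and --no-reflogs is used to make it show up.

```shell
set -eu
repo=$(mktemp -d)          # throwaway repository
cd "$repo"
git init -q
git config user.email "you@example.com"
git config user.name "Example User"

echo code > main.txt
git add main.txt
git commit -q -m "Add feature"
original=$(git rev-parse HEAD)

# Forgot the README: add it and fold it into the previous commit.
echo docs > README
git add README
git commit -q --amend --no-edit
amended=$(git rev-parse HEAD)

# The branch now points at a new commit; the original is left behind.
report=$(git fsck --no-progress --no-reflogs | grep '^dangling commit' || true)
echo "$report"
```

The original commit can still be inspected with git show "$original" until its reflog entry expires and garbage collection prunes it.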

Omalley answered 10/8, 2021 at 15:18 Comment(0)
