How can I fix a Git error broken link from tree to tree?

A transaction was interrupted, and when I tried again I got errors about objects that were empty or corrupted. Following another question, I deleted all the empty files, and when I ran

git fsck --full

I got this error:

Checking object directories: 100% (256/256), done.
Checking objects: 100% (48774/48774), done.
error: d193ccbc48a30e8961e9a2515a708e228d5ea16d: invalid sha1 pointer in cache-tree
error: df084ac4214f1a981481b40080428950865a6b31: invalid sha1 pointer in cache-tree
broken link from    tree 4bf4869299b294be9dee4ecdcb45d2c204ce623b
          to    tree df084ac4214f1a981481b40080428950865a6b31
broken link from    tree 4bf4869299b294be9dee4ecdcb45d2c204ce623b
          to    tree d193ccbc48a30e8961e9a2515a708e228d5ea16d
missing tree df084ac4214f1a981481b40080428950865a6b31
missing blob a632281618ca6895282031732d28397c18038e35
missing tree d193ccbc48a30e8961e9a2515a708e228d5ea16d
missing blob 70aa143b05d1d7560e22f61fb737a1cab4ff74c6
missing blob c21c0545e08f5cac86ce4dde103708a1642f23fb
missing blob 9f341b8a9fcd26af3c44337ee121e2d6f6814088
missing blob 396aaf36f602018f88ce985df85e73a71dea6f14
missing blob 87b9d1933d37cc9eb7618c7984439e3c2e685a11

How can I fix this problem?

Git

Warmup answered 26/4, 2016 at 21:56 Comment(2)
With Git 2.10 (Q3 2016), git fsck --name-objects can help. See my answer below. - Bogoch
I found this because of a broken link from a recent push in main, and I couldn't cleanly pull my local main to it. I found this article: blog.pterodactylus.net/2020/10/18/… which helped me recover the missing/corrupt packs, fixing the broken links. - Beekman
B
18

To address the broken link from tree to tree error in Git, you need to identify the missing objects and attempt to recover them or remove references to them.

Git Repository Structure
├─📂 .git/
│ ├─📁 objects/
│ │ ├─💔 broken link from tree 4bf48692... to tree df084ac4...
│ │ └─💔 broken link from tree 4bf48692... to tree d193ccbc...
│ └─📁 refs/
└─📄 other repository files

With Git 2.10 (Q3 2016), you can learn more about the origin of those broken links.

git fsck --name-objects

See commit 90cf590, commit 1cd772c, commit 7b35efd, commit 993a21b (17 Jul 2016) by Johannes Schindelin (dscho).
(Merged by Junio C Hamano -- gitster -- in commit 9db3979, 25 Jul 2016)

fsck: optionally show more helpful info for broken links

When reporting broken links between commits/trees/blobs, it would be quite helpful at times if the user would be told how the object is supposed to be reachable.

With the new --name-objects option, git-fsck will try to do exactly that:
name the objects in a way that shows how they are reachable.

For example, when some reflog got corrupted and a blob is missing that should not be, the user might want to remove the corresponding reflog entry.
This option helps them find that entry: git fsck --name-objects will now report something like this:

  broken link from    tree b5eb6ff...  (refs/stash@{<date>}~37:)
                to    blob ec5cf80...

If those broken links don't come from a local stash but a remote repo, fetching those pack objects can then solve the situation.
See also "How to recover Git objects damaged by hard disk failure?".


Git 2.31 (Q1 2021) fixes "git fsck --name-objects", which apparently had not been used by anybody motivated enough to report the breakage.

See commit e89f893, commit 8c891ee (10 Feb 2021) by Johannes Schindelin (dscho).
(Merged by Junio C Hamano -- gitster -- in commit 9e634a9, 17 Feb 2021)

fsck --name-objects: be more careful parsing generation numbers

Signed-off-by: Johannes Schindelin

In 7b35efd (fsck_walk(): optionally name objects on the go, 2016-07-17, Git v2.10.0-rc0 -- merge listed in batch #7), the fsck machinery learned to optionally name the objects, so that it is easier to see what part of the repository is in a bad shape, say, when objects are missing.

To save on complexity, this machinery uses a parser to determine the name of a parent given a commit's name: any ~<n> suffix is parsed and the parent's name is formed from the prefix together with ~<n+1>.

However, this parser has a bug: if it finds a suffix <n> that is not ~<n>, it will mistake the empty string for the prefix and <n> for the generation number.
In other words, it will generate a name of the form ~<bogus-number>.

Let's fix this.


With Git 2.40 (Q1 2023), "git hash-object" now checks that the resulting object is well formed, using the same code as "git fsck".

See commit 8e43090 (19 Jan 2023), and commit 69bbbe4, commit 35ff327, commit 34959d8, commit ad5dfea, commit 61cc4be, commit 6e26460 (18 Jan 2023) by Jeff King (peff).
(Merged by Junio C Hamano -- gitster -- in commit abf2bb8, 30 Jan 2023)

hash-object: use fsck for object checks

Signed-off-by: Jeff King

Since c879daa ("Make hash-object more robust against malformed objects", 2011-02-05, Git v1.7.5-rc0 -- merge), we've done some rudimentary checks against objects we're about to write by running them through our usual parsers for trees, commits, and tags.

These parsers catch some problems, but they are not nearly as careful as the fsck functions (which make sense; the parsers are designed to be fast and forgiving, bailing only when the input is unintelligible).
We are better off doing the more thorough fsck checks when writing objects.
Doing so at write time is much better than writing garbage only to find out later (after building more history atop it!) that fsck complains about it, or hosts with transfer.fsckObjects reject it.

This is obviously going to be a user-visible behavior change, and the test changes earlier in this series show the scope of the impact.
But I'd argue that this is OK:

  • the documentation for hash-object is already vague about which checks we might do, saying that --literally will allow "any garbage[...] which might not otherwise pass standard object parsing or git-fsck checks".
    So we are already covered under the documented behavior.
  • users don't generally run hash-object anyway.
    There are a lot of spots in the tests that needed to be updated because creating garbage objects is something that Git's tests disproportionately do.
  • it's hard to imagine anyone thinking the new behavior is worse.
    Any object we reject would be a potential problem down the road for the user.
    And if they really want to create garbage, --literally is already the escape hatch they need.

Note that the change here is actually in index_mem(), which handles the HASH_FORMAT_CHECK flag passed by hash-object.
That flag is also used by "git-replace --edit" to sanity-check the result.
Covering that with more thorough checks likewise seems like a good thing.

Besides being more thorough, there are a few other bonuses:

  • we get rid of some questionable stack allocations of object structs.
    These don't seem to currently cause any problems in practice, but they subtly violate some of the assumptions made by the rest of the code (e.g., the "struct commit" we put on the stack and zero-initialize will not have a proper index from alloc_commit_index()).

  • likewise, those parsed object structs are the source of some small memory leaks

  • the resulting messages are much better.
    For example:

    [before]
    $ echo 'tree 123' | git hash-object -t commit --stdin
    error: bogus commit object 0000000000000000000000000000000000000000
    fatal: corrupt commit
    
    [after]
    $ echo 'tree 123' | git.compile hash-object -t commit --stdin
    error: object fails fsck: badTreeSha1: invalid 'tree' line format - bad sha1
    fatal: refusing to create malformed object
    

After identifying the broken links with git fsck --name-objects, you can proceed with the repair process:

  1. Restore Missing Objects: If you have access to a backup or a remote copy of the repository that is free from these errors, you can try to restore the missing or corrupted objects by fetching them:

    git fetch origin
    
  2. Clone a New Repository: Sometimes, cloning a fresh copy of the repository is the simplest way to resolve corruption issues:

    git clone <repository-url>
    

    Then try to recover the missing objects from that fresh clone.

  3. Manual Cleanup: After trying restoration or re-cloning, run git fsck --full again to verify the repository's integrity. Address any identified issues as suggested by the output.

  4. Remove any dangling objects: once the repository passes git fsck, git gc --prune=now will discard them.

  5. Use Advanced Repair Tools: In more severe cases, where standard recovery methods fail, consider using advanced tools like git-repair. That tool attempts to fix the repository by applying more aggressive repair strategies:

    git-repair
    

    (Installation may be required.)

    git-repair starts by deleting all corrupt objects, and retrieving all missing objects that it can from the remotes of the repository.

    If that is not sufficient to fully recover the repository, it can also reset branches back to commits before the corruption happened, delete branches that are no longer available due to the lost data, and remove any missing files from the index. It will only do this if run with the --force option, since that rewrites history and throws out missing data.
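To act on the fsck report, it can help to reduce it to a plain list of the objects that are missing. This is a minimal sketch, assuming a POSIX shell with awk and sort. The function name missing_objects is made up for illustration and is not part of Git:

```shell
# missing_objects (hypothetical helper, not a Git command):
# reads "git fsck" output on standard input and prints one
# "<type> <id>" line per object reported as missing, without duplicates.
missing_objects() {
    # fsck reports missing objects as: "missing <type> <id>"
    awk '$1 == "missing" && NF >= 3 { print $2, $3 }' | sort -u
}
```

Usage would look like git fsck --full 2>&1 | missing_objects. For the report in the question, that prints two tree lines and six blob lines, which are the IDs to look for in a backup or in another clone.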

Bogoch answered 26/7, 2016 at 19:7 Comment(3)
Knowing more about broken links does not answer how to fix them. - Briquette
@Briquette True, thank you for your feedback. I have edited the answer to add the missing process. - Bogoch
sudo apt install git-repair && git-repair worked for me. - Incapacitate
A
8

I had a very similar problem: a broken link from tree, which caused the error fatal: bad tree object on some git commands.

It was fixed by running these commands:

Fixes Issue

  1. git stash clear (optional; removes all stashes, including any that were broken by rebasing or similar)
  2. git reflog expire --expire-unreachable=now --all (expires reflog entries that point to unreachable commits)
  3. git gc --prune=now (prunes the objects that are now unreachable)

Check It's Fixed

  1. git fsck --full --name-objects (checks integrity, and should return no dangling commits or bad trees)

After that, the error message fatal: bad tree object was gone!
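Since these commands permanently discard stashes, reflog entries and unreachable objects, it may be worth copying the .git directory before running them. This is a rough sketch, assuming a POSIX shell. backup_git_dir is a hypothetical helper name, and the timestamped git-backup-... directory name is only an illustrative choice:

```shell
# backup_git_dir REPO [DEST_PARENT]  (hypothetical helper, not a Git command)
# Copies REPO/.git to DEST_PARENT/git-backup-<timestamp> and prints the
# path of the copy. DEST_PARENT defaults to the parent directory of REPO.
backup_git_dir() {
    repo=${1:-.}
    dest_parent=${2:-$repo/..}
    backup="$dest_parent/git-backup-$(date +%Y%m%d-%H%M%S)"
    # In linked worktrees and submodules .git is a file, not a directory;
    # this sketch only handles the plain-directory case.
    if [ ! -d "$repo/.git" ]; then
        echo "no .git directory in $repo" >&2
        return 1
    fi
    # -R copies recursively, -p preserves modes and timestamps.
    cp -Rp "$repo/.git" "$backup" && echo "$backup"
}
```

Running backup_git_dir . from the top of the working tree before step 1 leaves a copy next to the repository that can be moved back if the cleanup removes something you still needed.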

Aholla answered 20/10, 2021 at 1:2 Comment(0)
A
6

What worked for me to fix this "broken link" error was the answer from sehe listed here in response to a question about how to fix an unable to find <insert sha1 code here> error.

Like Adam said, recover the object from another repository/clone.

  1. On a 'complete' Git database:

    git cat-file -p a47058d09b4ca436d65609758a9dba52235a75bd > tempfile
    
  2. and on the receiving end:

    git hash-object -w tempfile
    

One important addition: between steps 1 and 2, transfer the file directly from one location to the other. In my experience, moving the tempfile with Git push and pull did not work.
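The same recovery can be sketched as a small shell function. It is an illustration under assumptions, not sehe's exact method: copy_object is a hypothetical name, and both repositories are assumed to be reachable on the same machine. It uses git cat-file <type> rather than -p, because -p pretty-prints trees in a form that git hash-object cannot read back:

```shell
# copy_object SRC_REPO DST_REPO OBJECT_ID  (hypothetical helper)
# Copies one object from a healthy repository into a damaged one,
# keeping its type (blob, tree, commit or tag).
copy_object() {
    src=$1
    dst=$2
    id=$3
    # Ask the healthy repository what kind of object this is.
    type=$(git -C "$src" cat-file -t "$id") || return 1
    tmp=$(mktemp) || return 1
    # "cat-file <type> <id>" writes the raw object content.
    if ! git -C "$src" cat-file "$type" "$id" > "$tmp"; then
        rm -f "$tmp"
        return 1
    fi
    # Reading from stdin means no path-based filters (such as end-of-line
    # conversion) are applied. The ID printed should equal OBJECT_ID.
    git -C "$dst" hash-object -t "$type" -w --stdin < "$tmp"
    status=$?
    rm -f "$tmp"
    return $status
}
```

For example, copy_object ../good-clone . a47058d09b4ca436d65609758a9dba52235a75bd would be run from the damaged repository, where ../good-clone is a placeholder path. If the printed ID differs from the one requested, the content was altered on the way and the object is still missing.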

Apropos answered 9/10, 2017 at 22:24 Comment(0)
K
3

git gc --aggressive will clean up unnecessary files and optimize the local repository.

You can verify that the problem is fixed with:

git fsck --full
Kapellmeister answered 26/4, 2016 at 23:1 Comment(1)
This does not work when there are broken links: "error: Could not read xxxxxxxxx", "fatal: Failed to traverse parents of commit yyyyyyyyyy", "error: failed to run repack". - Paraesthesia
H
0

Note: my repository brambor-fork is a fork of the official repository, origin.

  1. I got this whenever I ran git fetch:

    C:\GitHub\Cataclysm-DDA>git fetch brambor-fork
    Auto packing the repository in background for optimum performance.
    See "git help gc" for manual housekeeping.
    Enumerating objects: 17718, done.
    Counting objects: 100% (1818/1818), done.
    fatal: unable to read ff4619146ebcafe968cdfcd1689eec7740b68b43
    fatal: failed to run repack
    error: task 'gc' failed
    
  2. I followed VonC's answer. After some experimenting, I got:

    C:\GitHub\Cataclysm-DDA>git fsck --name-objects
    Checking object directories: 100% (256/256), done.
    Checking objects: 100% (754983/754983), done.
    Checking connectivity: 755270, done.
    broken link from  commit 9654131119da18822995291049e541b15b5fb372 (refs/remotes/brambor-fork/gh-pages~7)
                  to  commit ff4619146ebcafe968cdfcd1689eec7740b68b43 (refs/remotes/brambor-fork/gh-pages~8)
    
  3. I tried and failed to fetch the commit from origin:

    C:\GitHub\Cataclysm-DDA>git fetch origin ff4619146ebcafe968cdfcd1689eec7740b68b43
    remote: Total 0 (delta 0), reused 0 (delta 0), pack-reused 0
    fatal: bad object ff4619146ebcafe968cdfcd1689eec7740b68b43
    error: https://github.com/CleverRaven/Cataclysm-DDA.git did not send all necessary objects
    
  4. Since I didn't need brambor-fork/gh-pages anyway, I simply deleted it:

    C:\GitHub\Cataclysm-DDA>git push brambor-fork -d gh-pages
    To https://github.com/Brambor/Cataclysm-DDA.git
     - [deleted]               gh-pages
    
  5. Now git fetch brambor-fork works without any errors!

    • I got fatal: Unable to create 'C:/GitHub/Cataclysm-DDA/.git/objects/info/commit-graph.lock': File exists. but I simply deleted that file and all was fine™.
Hampstead answered 29/3 at 14:23 Comment(0)
H
-1

$ git push -f origin <last_good_commit>:<branch_name>

Hausa answered 3/3, 2023 at 21:48 Comment(0)
