Remove a file from git history using git-filter-repo on a fresh clone

I'm following this answer to remove a single file containing credentials from git history. I have git 2.35.1 and filter-repo 22826b5a68b6. The command I need is apparently:

git-filter-repo --path auth.json --invert-paths

If I try to apply this to my working repo, I get this error:

Aborting: Refusing to destructively overwrite repo history since
this does not look like a fresh clone.
(expected freshly packed repo)

So I check out a fresh copy with git clone, and the command runs successfully:

Parsed 861 commits
New history written in 0.69 seconds; now repacking/cleaning...
Repacking your repo and cleaning out old unneeded objects
HEAD is now at 7212384 Update app.css
Enumerating objects: 8203, done.
Counting objects: 100% (8203/8203), done.
Delta compression using up to 24 threads
Compressing objects: 100% (2310/2310), done.
Writing objects: 100% (8203/8203), done.
Total 8203 (delta 5630), reused 8196 (delta 5623), pack-reused 0
Completely finished after 2.85 seconds.

And I can see that the file has been removed. But when I go to push:

git push --force
fatal: No configured push destination.

for some reason it's lost the remote it cloned from, so I add it back in manually:

git remote add origin git@git.example.com:abc/xyz.git

Pushing again then fails with:

fatal: The current branch master has no upstream branch.

so I add that with

git push --set-upstream origin master

but this fails too:

To git.example.com:abc/xyz.git
 ! [rejected]        master -> master (fetch first)
error: failed to push some refs to 'git.example.com:abc/xyz.git'
hint: Updates were rejected because the remote contains work that you do
hint: not have locally. This is usually caused by another repository pushing
hint: to the same ref. You may want to first integrate the remote changes
hint: (e.g., 'git pull ...') before pushing again.
hint: See the 'Note about fast-forwards' in 'git push --help' for details.

but I know that nothing has been pushed to this repo since I cloned it. Repeating the process gives the same result. If I do a git pull to update it, filter-repo then fails again with the "this does not look like a fresh clone" error, and I'm right back where I started.

I went round and round this a few times, eventually getting past all the errors, only to find that it had made no difference to my repo – the file is still there.

So my question is what are the exact steps I should do to make this filtering process work on a freshly cloned repo?

Etom answered 22/3, 2022 at 18:33 Comment(5)
Did you use push with --force after setting upstream? – Heti
You have rewritten the history of your branches, so you need to force push at the end: git push --force origin master – Chiton
Also: from this GitHub issue, you probably just need to run git gc in your initial repo to make git filter-repo happy – Chiton
"for some reason it's lost the remote it cloned from": filter-repo does this on purpose, on the theory that you shouldn't overwrite the original repository, but rather make a new one. If you dislike that theory, you just do what you did (plus TTT's answer). – Altissimo
I don't know how to use git filter-repo, but it seems it doesn't actually remove the file, only the committed history? I had to manually remove the file first, then use this to remove the committed history (so, yes, providing a path to an already-removed file). I have no idea whether I did it right. The history seems gone now. Though if I kept a link to the history from before it was removed, can I still access the file in the "removed" history? – Pochard

You were so close...

The reason you need git push --force is that you are going to blow away commits on the remote and replace them with your new ones. Since your remote is gone, skip the first force push command and simply add --force to your final push command:

git push --set-upstream origin master --force

Side note: I almost always prefer --force-with-lease over --force. It is slightly safer because it errors out if someone added commits to the remote branch, which you haven't seen yet, between the time you last fetched (or in this case, cloned) and the time you push. It might be rude to just blow them away. If you get that error with --force-with-lease, run git fetch, look at the new commits, and decide whether you're OK with deleting them. If you are, use --force-with-lease again and it will work (unless more commits appeared since you fetched).
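The behaviour described in the side note can be tried out safely against a throwaway local "remote". This is a minimal sketch, not part of the original workflow: the directory names (remote.git, seed, alice, bob), the commit helper, and the demo identity are all hypothetical, and a local bare repository stands in for the real server.

```shell
#!/usr/bin/env bash
set -euo pipefail
work=$(mktemp -d)
cd "$work"

# Hypothetical helper so commits work without a global git identity.
commit() { git -c user.name=Demo -c user.email=demo@example.com commit -q "$@"; }

# A local bare repository stands in for the real remote.
git init -q --bare remote.git
git init -q seed
(cd seed && echo one > file.txt && git add file.txt && commit -m "c1" \
  && git push -q "$work/remote.git" HEAD:refs/heads/master)

# Two clones: alice will rewrite history, bob pushes in the meantime.
git clone -q -b master "$work/remote.git" alice
git clone -q -b master "$work/remote.git" bob
(cd bob && echo two >> file.txt && commit -am "c2" && git push -q origin master)

cd alice
commit --amend -m "c1 rewritten"

# alice's origin/master is stale, so the lease should refuse the push.
if git push -q --force-with-lease origin master 2>/dev/null; then
  first=accepted
else
  first=rejected
fi

# After fetching (and, in real life, inspecting bob's commit), the lease holds.
git fetch -q origin
if git push -q --force-with-lease origin master; then
  second=accepted
else
  second=rejected
fi
echo "before fetch: $first, after fetch: $second"
```

Note that the second push still discards bob's commit; the lease only guarantees that you had a chance to see it first.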

In this particular case, where you are re-adding your remote, you must fetch first or --force-with-lease will not work, because the rewritten repository has no origin/master remote-tracking ref to compare against. If it were me, I would do this whenever new commits could have appeared on the remote between the time you cloned and the time you force push your rewritten repo. In that case I would change your final command to these steps:

git fetch
# inspect origin/master to see if new commits appeared after your clone
git push --set-upstream origin master --force-with-lease
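As the comments on this answer point out, the lease can also be given an explicit expected hash in the form --force-with-lease=<name>:<hash>, which does not need an origin/master tracking ref. The following is a rough sketch under assumptions: a throwaway local bare repository plays the remote, the directory and file names are hypothetical, and the expected hash is read with git ls-remote rather than from a fetch.

```shell
#!/usr/bin/env bash
set -euo pipefail
work=$(mktemp -d)
cd "$work"

# Hypothetical helper so commits work without a global git identity.
commit() { git -c user.name=Demo -c user.email=demo@example.com commit -q "$@"; }

# Throwaway bare repository standing in for the real remote, seeded with one commit.
git init -q --bare remote.git
git init -q seed
(cd seed && echo one > file.txt && git add file.txt && commit -m "original history" \
  && git push -q "$work/remote.git" HEAD:refs/heads/master)

# A brand-new repository with unrelated history, similar to a rewritten repo:
# the remote is re-added by hand, so no origin/master tracking ref exists.
git init -q rewritten
cd rewritten
echo two > file.txt
git add file.txt
commit -m "rewritten history"
git remote add origin "$work/remote.git"

# Ask the remote what master points at right now, then lease against that hash.
expected=$(git ls-remote origin refs/heads/master | cut -f1)
git push -q --force-with-lease=master:"$expected" origin HEAD:refs/heads/master

remote_head=$(git ls-remote origin refs/heads/master | cut -f1)
local_head=$(git rev-parse HEAD)
echo "remote master now at $remote_head"
```

If someone pushes between the ls-remote and the push, the hash no longer matches and the push is refused, which is the same protection the fetch-based steps give.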

Or, perhaps in your case, as soon as you decide you're going to rewrite a branch, temporarily lock the branch (or remove permissions to it), and unlock it after your force push. Then you know for sure no one is adding commits until you're done.

Because answered 23/3, 2022 at 4:57 Comment(6)
Unfortunately this doesn't work either. Immediately after cloning, git remote -v shows no remotes at all, but git remote add origin... fails saying remote origin already exists. Later when I try to force push to it with set-upstream, I get fatal 'origin' does not appear to be a git repository. None of this makes sense to me! – Etom
Also this specific repo is working fine in other contexts. FWIW it's served from gitlab 14.8. – Etom
Force-with-lease won't work after filter-repo because origin/master doesn't exist. (The filtered repository is a brand-new one, created with git init and filled in using git fast-import.) – Altissimo
@Altissimo I'm pretty sure --force-with-lease still works with new branches. (Tested, but outside of using filter-repo.) I'm not sure what the error would be? And I suppose if it errors, git fetch first then try again would still fix it? – Because
The lease option requires knowing the old commit hash ID. Normally this comes from the origin/foo remote-tracking name. You can specify it at git push time, --force-with-lease=<name>:<hash>, but I don't know of anyone who does that. Of course you can also run git fetch to create origin/foo first. – Altissimo
@Altissimo ah...I misread your statement. The branch does exist, but the tracking branch doesn't. Great point. – Because

TTT's answer helped, in particular the comment about filter-repo doing a git init – it was the order of operations that was the problem. I did this so many times before it worked, I turned it into a script to make it clear exactly what's needed and in what order:

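#!/usr/bin/env bash
set -xv
# Start from a fresh clone so filter-repo agrees to rewrite history
git clone git@git.example.com:abc/xyz.git project
cd project
# Drop auth.json from every commit
git filter-repo --path auth.json --invert-paths
# filter-repo removes the origin remote, so add it back before pushing
git remote add origin git@git.example.com:abc/xyz.git
# Overwrite the remote branch with the rewritten history
git push --set-upstream origin main --force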

After doing this, I ran into lots of issues updating existing clones, but generally they were solved by accepting all changes from the remote.
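To check that the rewrite actually took effect (both the question and a later comment mention the file still showing up), one option is to ask git whether any commit on any ref still touches the path. This is a minimal sketch: path_in_history is a hypothetical helper name, and the throwaway repository exists only so the example runs on its own.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: succeeds if any commit reachable from any ref touches the path.
path_in_history() {
  [ -n "$(git log --all --oneline -- "$1")" ]
}

# Throwaway repository so the sketch is self-contained; in practice, run the
# helper inside a fresh clone of the remote you pushed to.
demo=$(mktemp -d)
cd "$demo"
git init -q .
echo secret > auth.json
git add auth.json
git -c user.name=Demo -c user.email=demo@example.com commit -q -m "add auth.json"

if path_in_history auth.json; then in_history=yes; else in_history=no; fi
if path_in_history missing.json; then missing=yes; else missing=no; fi
echo "auth.json in history: $in_history"
echo "missing.json in history: $missing"
```

Running the check in a fresh clone of the remote, rather than in the rewritten working copy, tests what the server actually holds. Other clones made before the rewrite still contain the old commits.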

Etom answered 23/3, 2022 at 15:10 Comment(1)
Hey, so I kind of had the same issues as you, and followed the steps. After I push everything to the newly set remote, if I check my remote repo, I can still see the history of the file. There's nothing to push: "Everything up-to-date". Is that normal? – Wilhite
