How to remove a too-large file from a commit when my branch is ahead of master by 5 commits
Context

I'm working alone on a project, and until now I have used GitHub to keep a copy of my work somewhere other than my computer. Unfortunately, I added a very large file to the local repository: 300 MB (which exceeds GitHub's limit).

What I did

Here is a history of what I did:

  1. I (dumbly) added everything to the index:

     git add *
    
  2. I committed the changes:

     git commit -m "Blablabla"
    
  3. I tried to push to origin master:

     git push origin master
    

It took a while, so I just hit CTRL+C and repeated steps 2 and 3 four times, until I realised that a file was too large to be pushed to GitHub.

  4. I made the terrible mistake of deleting my large file (I don't remember whether I did a git rm or a simple rm).

  5. I followed the instructions at https://help.github.com/articles/remove-sensitive-data

  6. When I try to run git filter-branch, I get the following error: "Cannot rewrite branches: You have unstaged changes."

Brent answered 15/11, 2013 at 13:48 Comment(1)
Possible duplicate of "Update a development team with rewritten Git repo history, removing big files" - Ichinomiya
A
57

Deleting the file is itself a change, and that is the unstaged change git is complaining about. If you run git status you should see the file listed as deleted. To undo this change, run git checkout -- <filename>; the file will be back and your branch should be clean. You can also run git reset --hard, which brings your repo back to the state of your last commit.

I am assuming that it is the last commit that contains the very large file you want to remove. You can do a git reset HEAD~ and then redo the commit (without adding the large file). Then you should be able to git push without a problem.

Since the file is not in the last commit then you can do the final steps without a problem. You just need to get your changes either committed or removed.
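
As a minimal sketch of the restore step described above, the following script builds a throwaway repository, deletes a tracked file with a plain rm, and brings it back with git checkout. The file name large_file.bin and the identity values are hypothetical stand-ins, and the file is tiny so the script runs quickly.

```shell
#!/bin/sh
# Sketch only: runs in a temporary repo; "large_file.bin" is a hypothetical name.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "you@example.com"   # placeholder identity for the demo
git config user.name "Your Name"

echo "pretend this is 300 MB" > large_file.bin
git add large_file.bin
git commit -q -m "Add large file by mistake"

rm large_file.bin                  # plain rm: the deletion is an unstaged change
git status --porcelain             # prints " D large_file.bin"

git checkout -- large_file.bin     # restore the file from the index
status_after=$(git status --porcelain)
echo "status after restore: '${status_after}'"   # empty means the tree is clean
```

Once git status reports nothing, commands that refuse to run on a dirty tree (such as git filter-branch) should no longer complain about unstaged changes.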

http://git-scm.com/book/en/Git-Tools-Rewriting-History

Anent answered 15/11, 2013 at 14:27 Comment(5)
Thanks very much for your answer! The problem is that I deleted the large file, but not in the latest commit... So when I run git status I do not see my large file in the list. - Brent
The error you are getting is caused by modifications that are not committed, so you will want to git reset --hard to get rid of these changes. - Anent
OK! I guess I should back up the work I did in these 5 commits? Do I just run git reset --hard, or do I use git reset --hard origin master? - Brent
You don't need to back up anything within the 5 commits. As long as you don't delete the repo, that info is there. Your problem is that you have files listed as changed; that is what git status is showing you. If you want to keep the changes, make another commit; if you don't need them, you can run git reset --hard, which sets the state of your repo to the latest commit you have. - Anent
Thanks! It worked! Thank you so much! I will vote for your answer as soon as I have enough reputation :) - Brent
M
116

A simple solution I used:

  1. Run git reset HEAD^ once for each commit you want to undo. It keeps your changes and the current state of your files, and only discards the commits themselves.

  2. Once the commits are undone, you can then think about how to re-commit your files in a better way, e.g.: removing/ignoring the huge files and then adding what you want and then committing again. Or use Git LFS to track those huge files.
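
The two steps above can be sketched as follows, under the assumption that three local commits need to be undone and that the large file is called huge.bin (both are hypothetical). The sketch uses HEAD~3 to undo all three commits in one go, which should be equivalent to running git reset HEAD^ three times.

```shell
#!/bin/sh
# Sketch only: temporary repo; file names and commit messages are made up.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "you@example.com"   # placeholder identity for the demo
git config user.name "Your Name"

echo "v0" > notes.txt
git add notes.txt
git commit -q -m "Base commit (stands in for what is already pushed)"
base=$(git rev-parse HEAD)

# Three local commits; the first one drags in the large file.
echo "big" > huge.bin
echo "v1" > notes.txt
git add huge.bin notes.txt
git commit -q -m "Work 1"
echo "v2" > notes.txt
git commit -q -am "Work 2"
echo "v3" > notes.txt
git commit -q -am "Work 3"

git reset -q HEAD~3              # mixed reset: drops 3 commits, keeps the files
echo "huge.bin" >> .gitignore    # make sure the file is not picked up again
git add -A                       # huge.bin is skipped because it is ignored
git commit -q -m "Work 1-3 without the large file"

git rev-list --objects "$base"..HEAD   # objects reachable from the new commit
```

The working tree still holds the latest notes.txt and huge.bin, but the rewritten history between the base commit and HEAD no longer references huge.bin.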


Edit: this approach also works if, for instance, your commits needed the proper credentials (e.g. username and email) and you only set them after having committed. You can undo things the same way.

Question: does someone have a way to pick out just the bad commit and change it directly? I'm asking especially for the case of someone who only needs to fix the credentials on their commits, like here, where the files do not need to change, only the commits.

Mou answered 29/9, 2016 at 10:54 Comment(3)
Nice - this is a very decent solution! For speed I recommend just adding large files to your .gitignore after a reset :) - Celiaceliac
Use ~n instead of ^, where n is the number of commits you are ahead, in case it is more than one commit. Also, it seems git has problems recognizing ^ depending on the localization: for me it doesn't work in a French command prompt, so when I was 2 commits ahead I had to use ~2. - Inaccessible
This is a cool answer. The one thing I didn't understand was the use of ^ versus ~. I found this post pretty helpful: #2222158 for anyone else who had similar questions. - Embrasure
L
20

The GitHub solution is pretty neat. I had made a few commits before pushing, so it was harder to undo. GitHub's solution is:

Removing the file added in an older commit

If the large file was added in an earlier commit, you will need to remove it from your repository history. The quickest way to do this is with The BFG (a faster, simpler alternative to git-filter-branch):

bfg --strip-blobs-bigger-than 50M
# Git history will be cleaned - files in your latest commit will *not* be touched
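
Before running a size-based cleanup, it can help to see which blobs in the history are actually over the threshold. The sketch below uses plain git plumbing (it is not part of BFG); the 1024-byte limit and the file names are hypothetical and chosen only to keep the demo small.

```shell
#!/bin/sh
# Sketch only: temporary repo; a 2 KiB file stands in for a "large" one.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "you@example.com"   # placeholder identity for the demo
git config user.name "Your Name"

head -c 2048 /dev/zero > big.dat
echo "small" > small.txt
git add big.dat small.txt
git commit -q -m "Add files"
git rm -q big.dat
git commit -q -m "Delete big file (it is still in history)"

limit=1024   # bytes; hypothetical threshold
# List every object in history, ask git for type and size, keep large blobs.
large=$(git rev-list --objects --all |
  git cat-file --batch-check='%(objecttype) %(objectsize) %(rest)' |
  awk -v limit="$limit" '$1 == "blob" && $2 > limit { print $2, $3 }')
echo "$large"   # size in bytes followed by the path
```

Even though big.dat was deleted in the second commit, it still shows up here, which is why deleting a file in a later commit does not make the push smaller.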

https://help.github.com/articles/working-with-large-files/

https://rtyley.github.io/bfg-repo-cleaner/

Ladawnladd answered 10/3, 2015 at 9:1 Comment(0)
Q
9

This is in reference to the BFG post above, I would comment directly, but I have no idea how to do so as a low reputation new user.

You may want to do a 'git gc' to repack first.

I had issues getting BFG to work until I did so, this appears to be a common issue if you've only been working in a local repo and are prepping stuff to put up on a remote for the first time.

Relevant google hit which twigged me to it: https://github.com/rtyley/bfg-repo-cleaner/issues/65

Quarry answered 8/5, 2017 at 16:14 Comment(1)
I just had to do this (my .gitignore wasn't set up correctly) and bfg said this in the output for the original try: "Warning : no large blobs matching criteria found in packfiles - does the repo need to be packed?" That might be a new(er) thing since this question is quite old. - Farlay
B
6

I keep running into this problem over and over again, and I don't seem to learn not to do it. The solutions offered here have worked for me before but, for some reason, not this time. Here is what did work (from https://medium.com/analytics-vidhya/tutorial-removing-large-files-from-git-78dbf4cf83a):

to remove the large file

git rm --cached <filename>

Then, to edit the commit

git commit --amend -C HEAD

Then you can push your amended commit with

git push
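
A minimal sketch of the two commands above, assuming the large file was added in the most recent commit (amending only rewrites that one commit, so it should not help if the file entered the history earlier). The names big.bin and app.txt are hypothetical, and the final git push is left out because the demo repo has no remote.

```shell
#!/bin/sh
# Sketch only: temporary repo; file names are made up for illustration.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "you@example.com"   # placeholder identity for the demo
git config user.name "Your Name"

echo "base" > README
git add README
git commit -q -m "Base"

echo "big" > big.bin
echo "code" > app.txt
git add big.bin app.txt
git commit -q -m "Add app and, by mistake, big.bin"

git rm -q --cached big.bin       # untrack the file but keep it on disk
git commit -q --amend -C HEAD    # rewrite the last commit, reusing its message

git ls-tree -r --name-only HEAD  # files recorded in the amended commit
```

After the amend, big.bin is still in the working directory but is no longer part of the latest commit; adding it to .gitignore keeps a later git add from staging it again.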
Blandishment answered 23/9, 2021 at 18:19 Comment(1)
I've found this whole too-large-file issue to be confounding. Thanks for the solution. For me, the large size comes from output embedded in Jupyter notebooks. The output has to be cleared first, before running your commands. Don't be like me and forget to do a git add <filename> for each reduced-size file before the git push. - Toting
P
2

It seems your only problem is having unstaged changes. You didn't give any detail as to what was actually out of sync, so it's a shot in the dark, but assuming you deleted the file with a plain rm in step 4, you can bring it back from the index with:

git checkout large_file

If not, you're on your own. Your goal is to make sure both your index and your working tree are in the same state. This shows as git status reporting nothing to commit, working directory clean.

The nuclear option to ensure a clean tree would be git reset --hard. If you want to try that, do backup your tree+repo beforehand.

Once your working copy is clean, you can proceed with your steps 5 and 6.

Prelature answered 15/11, 2013 at 14:26 Comment(1)
Thanks for your answer! I rm'd my large file and then committed 4 times... The large file is not important to me. I just don't want it to be uploaded to GitHub. - Brent
C
1

Here is what worked for me:

  1. Download and install BFG Repo-Cleaner (BFG), which is available here. My download was bfg-1.13.0.jar.
  2. A potentially helpful location to move the downloaded jar file, in my case bfg-1.13.0.jar, to is your ${JAVA_HOME}/lib. That is what I did because I want the Java specific libraries like these in a somewhat sensible location since they are not like ordinary Windows installations. You may wish to rename the jar file simply as bfg.jar to keep things simple - so below, where I use bfg.jar, I actually mean bfg-1.13.0.jar in my case.
  3. Run java -jar ${JAVA_HOME}/lib/bfg.jar --delete-files <file_name> --no-blob-protection .; you should replace the whole of <file_name> with the specific name of the file that is causing the issue. Note that the path to the file is not needed, only the file name itself.
  4. Run git reflog expire --expire=now --all && git gc --prune=now --aggressive to complete the BFG cleaning job
  5. Finally, run git push origin main --force to complete pushing any outstanding local commits as you desire.
  6. If you have done everything up until this point successfully then your problem has been solved
  7. Going forward, always check that you do not inadvertently add very large files in directories to Git if you wish to avoid this problem reoccurring.
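
For step 7, one possible way to check for oversized files before committing is sketched below. It looks only at what is staged and compares each staged blob against a size limit; the 1024-byte limit and the file names are hypothetical, and the simple for loop assumes file names without spaces.

```shell
#!/bin/sh
# Sketch only: temporary repo; limit and file names are made up.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "you@example.com"   # placeholder identity for the demo
git config user.name "Your Name"

echo "base" > README
git add README
git commit -q -m "Base"

limit=1024                         # bytes; pick a value below your host's limit
head -c 4096 /dev/zero > dump.bin  # stands in for an accidental large file
echo "ok" > small.txt
git add dump.bin small.txt

too_big=""
# Walk the staged (added or modified) paths and measure the staged blobs.
for f in $(git diff --cached --name-only --diff-filter=AM); do
  size=$(git cat-file -s ":$f")    # ":path" names the blob in the index
  if [ "$size" -gt "$limit" ]; then
    too_big="$too_big $f"
  fi
done
echo "over limit:$too_big"
```

A check like this could be run by hand before git commit, or placed in a pre-commit hook that exits with a non-zero status when the list is not empty.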
Cetology answered 16/10, 2020 at 10:28 Comment(0)
A
1

Copy newest Repo state

cp -r original_repo repo_tmp

Reset Original Repo to the state before the large file was committed

cd original_repo && git reset --hard {commit_before_large_file}

Remove .git from repo_tmp, so we only get the contents (also delete the large file from repo_tmp at this point, otherwise it is copied back in the next step)

cd .. && rm -rf repo_tmp/.git

Copy & Replace the contents of repo_tmp (newest repo state) into the original_repo folder

cp -r repo_tmp/. original_repo/

Now Add, Commit & Push from inside original_repo and you are good to go

cd original_repo && git add . && git commit -m "be gone large file" && git push

Argentic answered 22/12, 2021 at 18:15 Comment(0)
T
0

If you need to remove a file from an earlier commit, you can do git rebase -i (interactive), and edit the commit you need to change.

OR, you can create a new commit with all the changes you want (ie. remove the large file), then start git rebase -i and reorder commits, so that your "repair" commit is directly after the one where you committed the large file. In the rebase script, replace pick with squash. That will merge two commits into one.

See Git - Rewriting history for details.
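
The second variant (a separate "repair" commit that is then merged into the offending one) can be sketched without hand-editing the rebase script. Note that this swaps in git commit --fixup together with git rebase --autosquash instead of manually reordering lines and replacing pick with squash; the effect should be the same, except that the repair commit's message is discarded. File names and messages are hypothetical, and git rm here also deletes the working copy of the large file, so back it up first if you still need it.

```shell
#!/bin/sh
# Sketch only: temporary repo; names and messages are made up.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "you@example.com"   # placeholder identity for the demo
git config user.name "Your Name"

echo "base" > README
git add README
git commit -q -m "Base"

echo "big" > big.bin
echo "v1" > app.txt
git add big.bin app.txt
git commit -q -m "Add app (and big.bin by mistake)"
bad=$(git rev-parse HEAD)          # the commit that introduced the large file

echo "v2" > app.txt
git commit -q -am "Update app"

git rm -q big.bin                  # repair: remove the file (also from disk)
git commit -q --fixup="$bad"       # mark the repair as a fixup of the bad commit

# --autosquash moves the fixup right after the bad commit and squashes it;
# GIT_SEQUENCE_EDITOR=true accepts the generated rebase script unchanged.
GIT_SEQUENCE_EDITOR=true git rebase -q -i --autosquash "$bad"~1

git log --oneline
```

After the rebase the history has the same three commits as before, but none of them contains big.bin.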

Twomey answered 17/3, 2023 at 11:23 Comment(0)
S
0

The best way I know is:

  1. Run git reset HEAD^
  2. Remove or change the files you don't want to commit
  3. Run git commit again
  4. Push to the remote repo
Spiny answered 17/4 at 2:38 Comment(0)
