How can I split a single file from a git repo into a new repo?

I have a git repo with several directories, and a single file, MyFile.ext.

/
  LargeDir1/
  LargeDir2/
  LargeDir3/
      .
      .
      .
  MyFile.ext

I'd like to start a new repo with just MyFile.ext in it, and keep all the history pertaining to it, but ignore everything else (all the LargeDirs). How can I do this?

For directories, I've successfully used this answer, but I tried that on a single file, and it doesn't work.

I've also tried this answer, which does delete everything except my file, but it also seems to leave all the history around.

Rockabilly answered 13/9, 2016 at 21:20 Comment(6)
see if you can 'git mv' the file into a sub directory, then use 'git subtree'. - Malita
@Malita I will try it, but I'm pretty sure it won't work because when you move a file, you lose the git history. I've had issues with that before when using git subtree split on a renamed directory. - Rockabilly
@Malita Yeah just tried it. The only commit that comes into the new repo is the commit where the file was moved to the new directory. - Rockabilly
I have this exact same problem and have tried both git subtree split ... and git filter-branch ... solutions without success. Those basically only work for subdirectories where everything in it was never altered outside that directory. What I want is a commit & log history that is what you see when you run git log MyFile.ext. - Kalgoorlie
@Kalgoorlie Yeah, I don't know if it's possible. I eventually gave up and lost the history. - Rockabilly
Actually, I just figured out a (rather laborious but working) solution. I was just compiling a set of steps for the solution, but it's based on the accepted solution to: stackoverflow.com/questions/16930919/… - Kalgoorlie
M
24

Use git fast-export.

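First, export the history of the file to a fast-import stream. Do this with the master branch checked out: the stream refers to the branch you export from by name, and that is the branch you will check out after importing.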

cd oldrepo
git fast-export HEAD -- MyFile.ext >../myfile.fi

Then you create a new repo and import.

cd ..
mkdir newrepo
cd newrepo
git init
git fast-import <../myfile.fi
git checkout
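
To try the whole round trip on a throwaway repository, here is a minimal sketch. The demo repository, its file contents, the commit messages, and the scratch directory are all made up for illustration; only the fast-export / fast-import / checkout sequence comes from the steps above. It assumes both repositories are created with the same default branch name, and it checks out the exported branch by name rather than relying on a bare git checkout.

```shell
#!/bin/sh
# Sketch: export one file's history from a scratch repo and import it elsewhere.
set -eu

work=$(mktemp -d)            # hypothetical scratch area
cd "$work"

# Build a small "old" repo: one file to keep, one directory to drop.
git init -q oldrepo
cd oldrepo
git config user.name "Demo User"
git config user.email "demo@example.com"
mkdir LargeDir1
echo "v1" > MyFile.ext
echo "big" > LargeDir1/data
git add .
git commit -q -m "Add MyFile.ext and LargeDir1"
echo "bigger" > LargeDir1/data
git commit -q -am "Change only LargeDir1"
echo "v2" > MyFile.ext
git commit -q -am "Update MyFile.ext"

# Remember which branch is being exported; the stream refers to it by name.
branch=$(git symbolic-ref --short HEAD)
git fast-export HEAD -- MyFile.ext > ../myfile.fi

# Import into a fresh repo and populate the working tree.
cd ..
git init -q newrepo
cd newrepo
git fast-import --quiet < ../myfile.fi
git checkout -q "$branch"

git log --oneline            # only the commits that touched MyFile.ext
ls                           # only MyFile.ext
```

The commit that changed only LargeDir1 should be absent from the new history, which is a quick way to confirm the path limit took effect.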
Midweek answered 16/3, 2017 at 22:14 Comment(11)
I just tried this, but the file doesn't appear in the new repo. - Rockabilly
@KrisHarper fast-import creates the commits, but not a working copy. See updated answer. - Midweek
I get "fatal: You are on a branch yet to be born". - Rockabilly
What version of git are you using? This works fine with git 2.11. On some older versions of git, you might have to create an initial commit first to create the master branch. - Midweek
Git version 2.7.4 - Rockabilly
What branch were you on when you ran fast-export? In my tests I was on master, and I can confirm this works. - Midweek
Ah, you are correct. I was on a branch other than master. - Rockabilly
Excellent. Much simpler than what I figured out, and cleaner. Wish you'd answered this 2 days ago before I spent time trying to roll my own solution. - Kalgoorlie
@Kalgoorlie It was actually your answer that made me see this question. :-) When I originally ran into this problem in 2015, the first thing I tried was reposurgeon. That pointed me to fast-import streams, and that led me to git fast-export. I'm not sure where I found this exact usage; it's not in the manual page. The point here is that it's not blindingly obvious that this is a good way to do such things. - Midweek
For future reference: one can add multiple filenames or even directory names instead of MyFile.ext, very useful for splitting a repo in several pieces. - Paternity
@KrisHarper You have to git checkout <the-branch-name-you-were-on-in-oldrepo>. - Offering
K
0

I had this same issue and I finally figured it out. I had an old old directory of scripts - so old, they had originally been under RCS control. Years ago, I made it into a git repo (without really knowing what I was doing), converted the RCS log, and updated the git log. But I picked up development of one of the scripts and decided it needed its own repo. The various solutions out there (subtree and filter-branch) depend on the part you're splitting out being a directory. You can put the file in a directory and split it out that way, but you don't get the revision history with it. So here's how I extracted the revision history of a single file and created a new repo with it:

  1. Create a brand new repo [I did it at the same level as the source-repo]

    git init <new-repo>
    
  2. Now go into your source repo and create a file that we're going to use later to cherry-pick the file's commits:

    cd <source-repo>
    # One "pick <abbreviated-hash>" line per commit that touched the file, oldest first.
    git log --reverse --format='pick %h' -- <target-file.ext> > ../commits-to-keep.txt
    
  3. Create a temporary branch and push it to your new repo (then delete it)

    git checkout -b tmpbranch
    git push ../<new-repo> tmpbranch
    git checkout master
    git branch -d tmpbranch
    
  4. Now go to your new repo and create an empty commit off of which we will rebase:

     cd ../<new-repo>
     git commit --allow-empty -m 'root commit'
     git rebase --onto master --root tmpbranch -i
    
  5. [The only manual step] In the editor that comes up from the last command above, remove all the contents and paste in the contents of the file you created earlier: ../commits-to-keep.txt

  6. Now you can switch back to the master branch, merge, and then clean up the temporary branch:

     git checkout master
     git merge tmpbranch
     git branch -d  tmpbranch
    

The only drawback here is that you end up with the extra empty root commit. I found that there are ways to remove it, but for my purposes, this was good enough.

Kalgoorlie answered 16/3, 2017 at 21:33 Comment(1)
Even though @RolandSmith's answer is the way to go, I thought I'd add that my manual step can be automated by inserting this above the rebase command: setenv GIT_EDITOR 'vim +"%d | r ../commits-to-keep.txt | wq"'. This makes it possible to skip step 5, though you might not want to lose what might already be in GIT_EDITOR. In light of @RolandSmith's answer, I thought about taking my answer down, but I figured I'd leave it up just to show another way to sort of do it. - Kalgoorlie
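
A shell-agnostic variant of that automation, as a minimal sketch: Git also honors GIT_SEQUENCE_EDITOR, which is used only for the rebase todo list, so GIT_EDITOR stays untouched. This swaps the vim one-liner for a plain cp. The scratch repository, file names, and todo contents below are made up for illustration and are not the setup from the answer.

```shell
#!/bin/sh
# Sketch: replace the interactive-rebase todo list without opening an editor.
set -eu

demo=$(mktemp -d)            # hypothetical scratch area
cd "$demo"
git init -q repo
cd repo
git config user.name "Demo User"
git config user.email "demo@example.com"

# Three commits, each adding one file.
for f in a b c; do
    echo "$f" > "$f.txt"
    git add "$f.txt"
    git commit -q -m "Add $f.txt"
done

# Todo list that keeps only the last commit (plays the role of commits-to-keep.txt).
todo="$demo/commits-to-keep.txt"
echo "pick $(git rev-parse --short HEAD)" > "$todo"

# Git runs the sequence editor with the todo file path appended as an argument,
# so "cp <our list>" overwrites the generated todo with ours.
GIT_SEQUENCE_EDITOR="cp $todo" git rebase -q -i HEAD~2

git log --oneline            # "Add a.txt" followed by the re-applied "Add c.txt"
```

Because the variable is set only for that one command, nothing needs to be restored afterwards.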
V
0
  1. Clone the repo.
  2. Filter out everything but that one file.

Cloning can be done normally with git clone. It works fine on a local directory, e.g. git clone /path/to/the/repo. Then remove the remote pointing back to the original repo.

git clone /path/to/the/repo
cd repo
git remote rm origin

Then use git filter-branch to filter out everything but that one file. This is easiest to accomplish with an index filter that deletes all files and then restores just the one.

git rm --cached -qr -- . && git reset -q $GIT_COMMIT -- YOURFILENAME

An index filter runs once per commit against that commit's index, without checking out the working tree. The command first removes every file from the index, then restores that one file to its state in that commit; the result is then recommitted. $GIT_COMMIT is the commit being rewritten. YOURFILENAME is the file you want to keep.

If you're doing all branches and tags with --all, add a tag filter which ensures the tags are rewritten. That's as simple as --tag-name-filter cat. It will not change the content of the tags, but it will ensure they're moved to the rewritten commits.

Finally, you'll want --prune-empty to remove any now empty commits that didn't involve that file. There will be a lot of them.

Here it is all together.

git filter-branch \
    --index-filter 'git rm --cached -qr -- . && git reset -q $GIT_COMMIT -- YOURFILENAME' \
    --tag-name-filter cat \
    --prune-empty \
    -- --all
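
As a rough end-to-end check, the sketch below runs the same filter on a throwaway repository. The demo repository, its contents, and the scratch directory are made up for illustration; FILTER_BRANCH_SQUELCH_WARNING=1 only silences the deprecation notice that recent Git versions print before filter-branch starts.

```shell
#!/bin/sh
# Sketch: keep only MyFile.ext (and its history) in a scratch repo.
set -eu

scratch=$(mktemp -d)         # hypothetical scratch area
cd "$scratch"
git init -q repo
cd repo
git config user.name "Demo User"
git config user.email "demo@example.com"

# One file to keep, one directory to drop.
mkdir LargeDir1
echo "v1" > MyFile.ext
echo "big" > LargeDir1/data
git add .
git commit -q -m "Add MyFile.ext and LargeDir1"
echo "bigger" > LargeDir1/data
git commit -q -am "Change only LargeDir1"
echo "v2" > MyFile.ext
git commit -q -am "Update MyFile.ext"

# Empty the index for every commit, restore just the one file,
# and drop commits that end up with no changes.
FILTER_BRANCH_SQUELCH_WARNING=1 git filter-branch \
    --index-filter 'git rm --cached -qr -- . && git reset -q $GIT_COMMIT -- MyFile.ext' \
    --tag-name-filter cat \
    --prune-empty \
    -- --all

git log --oneline            # the LargeDir1-only commit is gone
git ls-files                 # only MyFile.ext remains tracked
```

The original refs are kept under refs/original/ after the rewrite, so the old history is still reachable in this clone until those refs are deleted.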
Viera answered 16/3, 2017 at 23:47 Comment(0)
L
0

Git now recommends using git filter-repo instead (you get a message about it when using filter-branch). Another answer on one of the questions you linked has a long explanation, but here's a short example.

To remove everything except src/README.md and move it to the root:

pip install git-filter-repo
# Must use a fresh clone to avoid losing local history.
git clone --no-local project extracted
cd extracted/
git filter-repo --path src/README.md
git filter-repo --subdirectory-filter src/

--path selects the single file, and --subdirectory-filter moves the contents of that directory to the root. I can't find a way to do this in a single pass, but the second pass is much faster since the first eliminates most of the history.

Lissa answered 28/8, 2022 at 18:50 Comment(0)
