Migrate from CVS to Git without losing history
S

9

44

Is there a way to migrate my code from CVS source control to Git?

If yes, what about my history of commits?

Sedge answered 1/1, 2014 at 14:37 Comment(4)
Here is a good answer to your problem :) https://mcmap.net/q/276616/-is-there-a-migration-tool-from-cvs-to-git Basrhin
possible duplicate of Is there a migration tool from CVS to Git?Telegram
@kolen: Seems like it.Mulch
I couldn't find a good solution to this on Stack Overflow, but I did find a great one at superuser.com/a/1451527/120672 .Skilling
H
23

I've not personally done a conversion from CVS to Git, but I believe Eric Raymond's cvs-fast-export is the tool to use. He has the man page posted here. cvsps is another tool maintained by Eric, but it has recently been deprecated in favor of cvs-fast-export. cvs2git is another tool which is built on some of the same machinery as cvs2svn. The latter was extremely adept, and so I have high hopes that cvs2git is equally good.

One thing to note: CVS is a pretty broken RCS. It may contain content that can't be reflected exactly in Git. In other words, there is some impedance mismatch, but the tools try very hard to preserve as much as possible. Check your conversion and make sure you're happy with the results. You may need to fix up part of the Git history to get something more acceptable, but I doubt you'll need to.
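One way to check a conversion, sketched here under assumptions: compare a fresh CVS checkout of a branch with the Git working tree of the same branch. The helper name compare_trees and the directory names are hypothetical, and the -x option is available in GNU and BSD diff but is not part of POSIX. Files that use CVS keyword expansion (such as $Id$) may legitimately differ.

```shell
# compare_trees CVS_CHECKOUT GIT_WORKTREE
# Hypothetical helper: recursively compares two directory trees while
# ignoring the CVS/ and .git/ bookkeeping directories.
# Exit status 0 means the trees are identical; differing or missing
# files are reported on stdout.
compare_trees() {
    diff -r -q -x CVS -x .git "$1" "$2"
}

# Example usage (paths and MODULE are placeholders):
#   cvs -d /path/to/cvsroot checkout -d cvs-checkout MODULE
#   compare_trees cvs-checkout gitexport && echo "trees match"
```

This only checks the tip of one branch. Older revisions and tags would need the same comparison after checking out the matching revision on both sides.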

Hurtless answered 1/1, 2014 at 14:45 Comment(2)
cvs-fast-export in my opinion has one big flaw ... it uses C. That means it's harder to debug issues, specifically because it outputs stuff on stdout and reads from stdin. It also renumbers the CVS revisions from how they appear in the RCS file which can be confusing if you try to pinpoint issues. Still, this should be the accepted answer. +1Cuckooflower
I tried cvs2git at first, but it depends on anydbm package and a dbm engine other than the default. By the time you've installed all the required Python libraries and gotten it to run, you've done 10x the work of downloading and compiling cvs-fast-export, which doesn't need any external libraries. I found that cvs-fast-export did an excellent job of converting my CVS repositories.Urga
M
32

Here is the process I used to migrate a SourceForge CVS repo to Git using cvs2git (latest stable release is here, but IIRC I used the github dev version), which works on both Windows and Linux without any compilation required since it's just Python.

Also, you don't need to own the repo with this method. You can, for example, migrate SourceForge projects that you don't own (you only need the right to check out, so this works on any public repo).

How to import from sourceforge CVS to git.
First, you need to download/checkout the cvs repo with the whole history (not just checkout the HEAD/Trunk):

rsync -av rsync://PROJECT.cvs.sourceforge.net/cvsroot/PROJECT/\* cvs

then use cvs2git (python script, works on all platforms, no compilation needed):

python cvs2git --blobfile="blob.dat" --dumpfile="dump.dat" --username="username_to_access_repo" --options=cvs2git.options --fallback-encoding utf-8 cvs

This should have generated two files, blob.dat and dump.dat, which together contain your whole CVS history. You can open them in a text editor to check that the content looks correct.
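Besides opening the files in an editor, a rough sanity check is to count the commits and list the refs in the dump file. This is a minimal sketch that assumes the dump is a git fast-import stream, where every commit starts with a line of the form "commit refs/...". The function names are hypothetical, and a commit message that happens to contain such a line would be counted too.

```shell
# count_stream_commits FILE
# Counts the lines that start a commit in a fast-import stream.
count_stream_commits() {
    grep -c '^commit refs/' "$1"
}

# list_stream_refs FILE
# Prints each distinct ref that receives at least one commit.
list_stream_refs() {
    sed -n 's|^commit \(refs/.*\)$|\1|p' "$1" | sort -u
}

# Example usage:
#   count_stream_commits dump.dat
#   list_stream_refs dump.dat
```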

then initialize your git repo inside another folder:

mkdir gitexport/
cd gitexport
git init

then load up the exported cvs history onto git:

cat ../{blob,dump}.dat | git fast-import

and then check out the working tree at the tip of the imported history:

git reset --hard

finally and optionally, you can push to your remote git repository:

git push -u origin master

Of course, before pushing you first need to run git remote add origin https://your_repo_url
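To make the sequence above concrete without a CVS repository, here is a minimal sketch that feeds a tiny hand-written fast-import stream (standing in for blob.dat and dump.dat) through the same Git steps and pushes to a local bare repository. The file names, the author identity, and the temporary directory are illustrative assumptions. The git symbolic-ref line is there because newer Git installations may default to a branch other than master, in which case git reset --hard would find no commit on the current branch.

```shell
workdir=$(mktemp -d)
cd "$workdir"

# A one-commit fast-import stream: one blob, then a commit that uses it.
# The numbers after "data" are exact byte counts, including the newline.
cat > stream.dat <<'EOF'
blob
mark :1
data 6
hello

commit refs/heads/master
mark :2
committer Example Author <author@example.com> 1388586000 +0000
data 15
Initial import

M 100644 :1 README

EOF

git init -q gitexport
cd gitexport
# Make HEAD point at the branch the stream writes to.
git symbolic-ref HEAD refs/heads/master
git fast-import --quiet < ../stream.dat
# Populate the working tree from the imported history.
git reset -q --hard

# The remote must be added before the first push.
git init -q --bare ../remote.git
git remote add origin ../remote.git
git push -q -u origin master
git --no-pager log --oneline
```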

Note: cvs2git.options is a configuration file for cvs2git, written in Python syntax, where you can specify transforms for various things such as author names (so that nicknames are automatically converted to full names during the import). See the documentation here or the included example options file.
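To fill in author transforms you need the list of CVS user names. A rough sketch for collecting them from the rsynced repository follows. It assumes the usual RCS ",v" layout, in which each revision has a line of the form "date ...; author NAME; state ...;". The function name list_cvs_authors is hypothetical, and a matching line inside a log message or file body would show up as a false positive.

```shell
# list_cvs_authors REPO_DIR
# Prints the distinct author names found in the RCS ",v" files under REPO_DIR.
list_cvs_authors() {
    find "$1" -type f -name '*,v' -exec \
        sed -n 's/^date[[:space:]]\{1,\}[^;]*;[[:space:]]*author[[:space:]]\{1,\}\([^;]*\);.*$/\1/p' {} + |
        sort -u
}

# Example usage, with "cvs" being the directory created by rsync:
#   list_cvs_authors cvs
```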

Marika answered 9/3, 2015 at 20:56 Comment(6)
After browsing and trying all the other solutions this one worked for me! :DCarbonate
It worked for me too. I had no rsync option, but I could copy the repository manually. Remember: always copy the repository files manually or use rsync; don't use a checkout.Rhine
On linux you can apt install cvs2svn which includes cvs2gitDiahann
How would I apply your first line, with rsync, when My CVS root is a "pserver"?Mulch
@Mulch I'm sorry but I'm not very much experienced with CVS so I can't answer your question, maybe someone else can? You can also try to play with the command until it works, or use TortoiseCVS that may ease the process.Marika
Even though this answer states that cvs2git works on all platforms, I was unable to get it to work on Windows, because apparently cvs2git uses the locally installed CVS client to perform a local checkout (:local:). The Windows CVSNT client doesn't support this and fails with cvs [checkout aborted]: Couldn't open default trigger library: No such file or directory. Fortunately, the CVS client in Cygwin does support local checkouts and running cvs2git in a Cygwin installation completed successfully.Ephor
G
12

You can use git-cvsimport to import your CVS repository into Git. By default, this will check out every revision, giving you a relatively complete history.

Depending on your operating system, you may need to install support for this separately. For example, on an Ubuntu machine you would need the git-cvs package.

This answer goes into more detail.

Grotius answered 1/1, 2014 at 14:45 Comment(2)
git-cvsimport is more resilient w.r.t. issues in the CVS history, but that yields (silently) incoherent results. I have been reading a lot of answers on SO concerning similar questions and wonder how many people did actual real-world conversions with repositories that have real "scars" and issues.Cuckooflower
In my case, cvs-fast-export imported no history and cvs2git did not import even a single file (error in the example options file). git-cvsimport, however, did the job, with a syntax very close to the "This answer" link above.Purveyor
H
4

I recently (2016) used Eric Raymond's reposurgeon to import a CVS repo from SourceForge into Git. I was very pleasantly surprised: it worked very well. After past experiences with cvs2svn and other tools, I recommend reposurgeon without hesitation for this kind of task.

Eric has incorporated the migration procedure into Reposurgeon documentation under the title "A Guide to Repository Conversion"

Hight answered 4/5, 2017 at 15:3 Comment(0)
O
2

I used Docker to run cvs2git, using the excellent steps above by @gaborous, and based on the https://github.com/mhagger/cvs2svn Dockerfile code for cvs2svn. This has the advantage of having all required tools installed in the image, ready to run.

Follow @gaborous steps above, but replace the python execution with the Docker run.

  1. Clone https://github.com/mhagger/cvs2svn to local directory which we will call $DIR.

    cd $DIR

  2. Edit Dockerfile

Copy Dockerfile to Dockerfile-cvs2git and edit.

ENTRYPOINT ["cvs2git"]

Build the docker image named cvs2git:

docker build --target=run --tag cvs2git . -f Dockerfile-cvs2git

  3. Copy cvs2git-example.options to cvs2git.options.

cvs2git.options provides two things:

  • blob and dump filenames and locations
  • CVS sub-module name

Edit cvs2git.options

..  
blob_filename=r'/tmp/git-blob.dat',  
..  
dump_filename=r'/tmp/git-dump.dat',  
..  
run_options.set_project(  
  r'/cvs/<my-sub-repo>',  
..

  4. Run the Docker image.

Specify Docker volumes (-v) for the CVS root repo location and tmp for the output files location. Note that cvs2git.options provides the run's configuration.

  docker run -it --rm  -v /opt/CVS/<root-repo>/:/cvs -v /opt/tmp:/tmp \
    cvs2git \
    --options=cvs2git.options \
    --fallback-encoding utf-8

  5. Follow @gaborous's instructions above, starting with

    mkdir gitexport/
    cd gitexport
    git init

    cat /opt/tmp/git-{blob,dump}.dat | git fast-import

Outrush answered 12/5, 2022 at 19:12 Comment(0)
D
1

In order to clone a project from sourceforge to github I performed the following steps.

PROJECT=some_sourceforge_project_name
GITUSER=rubo77
rsync -av rsync://a.cvs.sourceforge.net/cvsroot/$PROJECT/\* cvs
svn export --username=guest http://cvs2svn.tigris.org/svn/cvs2svn/trunk cvs2svn-trunk
cp ./cvs2svn-trunk/cvs2git-example.options ./cvs2git.options
vim cvs2git.options # edit run_options.set_project
cvs2svn-trunk/cvs2git --options=cvs2git.options --fallback-encoding utf-8

Create an empty Git repository at https://github.com/$GITUSER/$PROJECT.git

git clone git@github.com:$GITUSER/$PROJECT.git $PROJECT-github
cd $PROJECT-github
cat ../cvs2git-tmp/git-{blob,dump}.dat | git fast-import
git log
git reset --hard
git push
Diahann answered 2/11, 2018 at 17:3 Comment(0)
H
1

gaborous's answer uses git fast-import, which could fail on log messages not encoded in UTF-8.

That will work better with Git 2.23 (Q3 2019): The "git fast-export/import" pair has been taught to handle commits with log messages in encoding other than UTF-8 better.

See commit e80001f, commit 57a8be2, commit ccbfc96, commit 3edfcc6, commit 32615ce (14 May 2019) by Elijah Newren (newren).
(Merged by Junio C Hamano -- gitster -- in commit 66dc7b6, 13 Jun 2019)

fast-export: do automatic reencoding of commit messages only if requested

Automatic re-encoding of commit messages (and dropping of the encoding header) hurts attempts to do reversible history rewrites (e.g. sha1sum <-> sha256sum transitions, some subtree rewrites), and seems inconsistent with the general principle followed elsewhere in fast-export of requiring explicit user requests to modify the output (e.g. --signed-tags=strip, --tag-of-filtered-object=rewrite).
Add a --reencode flag that the user can use to specify, and like other fast-export flags, default it to 'abort'.

That means the Documentation/git-fast-export now includes:

 --reencode=(yes|no|abort)::

Specify how to handle encoding header in commit objects.

  • When asking to 'abort' (which is the default), this program will die when encountering such a commit object.
  • With 'yes', the commit message will be reencoded into UTF-8.
  • With 'no', the original encoding will be preserved.

fast-export: avoid stripping encoding header if we cannot reencode

When fast-export encounters a commit with an 'encoding' header, it tries to reencode in UTF-8 and then drops the encoding header.
However, if it fails to reencode in UTF-8 because e.g. one of the characters in the commit message was invalid in the old encoding, then we need to retain the original encoding or otherwise we lose information needed to understand all the other (valid) characters in the original commit message.

fast-import: support 'encoding' commit header

Since git supports commit messages with an encoding other than UTF-8, allow fast-import to import such commits.
This may be useful for folks who do not want to reencode commit messages from an external system, and may also be useful to achieve reversible history rewrites (e.g. sha1sum <-> sha256sum transitions or subtree work) with Git repositories that have used specialized encodings in their commit history.

The Documentation/git-fast-import now includes:

encoding

The optional encoding command indicates the encoding of the commit message.
Most commits are UTF-8 and the encoding is omitted, but this allows importing commit messages into git without first reencoding them.


See that test, which uses an author with non-ASCII characters in the name, but no special commit message.
It does check that the reencoding into UTF-8 worked, by checking its size:

The commit object, if not re-encoded, would be 240 bytes.

  • Removing the "encoding iso-8859-7\n" header drops 20 bytes.
  • Re-encoding the Pi character π from \xF0 (\360) in iso-8859-7 to \xCF\x80 (\317\200) in UTF-8 adds a byte.

Check for the expected size.
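The arithmetic behind that size check can be reproduced with plain byte counts. This is only a sketch of the calculation described above, not the Git test itself, and the variable names are made up for illustration.

```shell
# Bytes removed when the "encoding" header line is dropped.
header_bytes=$(printf 'encoding iso-8859-7\n' | wc -c | tr -d '[:space:]')

# The Pi character is one byte (octal 360) in ISO-8859-7
# and two bytes (octal 317 200) in UTF-8.
pi_iso_bytes=$(printf '\360' | wc -c | tr -d '[:space:]')
pi_utf8_bytes=$(printf '\317\200' | wc -c | tr -d '[:space:]')

original_size=240
expected_size=$(( original_size - header_bytes + pi_utf8_bytes - pi_iso_bytes ))
echo "$expected_size"   # prints 221
```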


And with Git 2.29 (Q4 2020), the pack header created for import is better managed.

See commit 7744a5d, commit 014f144, commit ccb181d (06 Sep 2020) by René Scharfe (rscharfe).
(Merged by Junio C Hamano -- gitster -- in commit 9b80744, 18 Sep 2020)

fast-import: use write_pack_header()

Signed-off-by: René Scharfe

Call write_pack_header() to hash and write a pack header instead of open-coding this function.
This gets rid of duplicate code and of the magic version number 2 -- which has been used here since c90be46abd ("Changed fast-import's pack header creation to use pack.h", 2006-08-16, Git v1.5.0-rc4 -- merge) and in pack.h (again) since 29f049a0c2 (Revert "move pack creation to version 3", 2006-10-14, Git v1.4.3).

Heaven answered 14/6, 2019 at 19:45 Comment(0)
H
0

Migration from CVS to Git using cvs2svn

Sharing all the steps for migrating from CVS to Git:

1. Create a directory cvsProject in anyDir and rsync your CVS repo into it:
   rsync -av CVSUserName@CVSipAdrress:/CVS_Path/ProjectName/* ~/anyDir/ProjectName
2. cd ../cvs2svn-x.x.0 && ./cvs2git --options=cvs2git-example.options
3. ./cvs2git --blobfile=cvs2git-tmp/git-blob.dat \
       --dumpfile=cvs2git-tmp/git-dump.dat \
       --username=CVS_YOUR_USER_NAME \
       /path_of_step(1)/cvsProject
   Note: if you get an encoding error, add this to the command above: "--encoding=ascii --encoding=utf8 --encoding=utf16 --encoding=latin"
4. mkdir newGitRepo && cd newGitRepo
5. git init --bare
6. git fast-import --export-marks=/x.x.x/cvs2svn-2.5.0/cvs2git-tmp/git-marks.dat \

Now you are done and can push your repository to your Git remote.

Hydroxyl answered 14/11, 2019 at 6:50 Comment(1)
let me know if you face any issueHydroxyl
M
0

I've recently had success and a relatively pleasing experience using the "CVS Remote Access Program", or, well, crap (GitHub).

It can apparently handle various intricacies of CVS repositories that some or all of the other conversion tools cannot, but I'm not well-versed in the details. Like cvs2git, it produces dump files which are then actually imported into Git using git-fast-import.

The reason I'm suggesting it is that when I found a deficiency in it, I was able to add the capability I was missing to the existing code - and it wasn't so terrible. My PR for that is pending as are a bunch of bug reports.

Mulch answered 1/8, 2020 at 22:15 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.