Git/GitHub import from SourceAnywhere

We currently use SourceAnywhere Hosted as our version control server. I'm looking to migrate over to GitHub, and would really like to preserve our 8+ year history.

Has anyone else successfully completed this migration and care to share their tools/process?

Now, assuming that this hasn't been done before, I suppose I'm looking at writing a git fast-import script using the SourceAnywhere SDK or command line client. Being new to git, are there any existing scripts or resources you could direct me towards as a starting point?

Chabazite answered 7/1, 2013 at 5:53 Comment(0)

I've finally cleaned up my project and added it to GitHub. You can find it here: SAWHtoGit.

It does a pretty good job of exporting the history into logical changesets, with a few small limitations:

  • Any file that has been deleted in SourceAnywhere will not be imported into the history, at all, due to limitations in the SourceAnywhere API.
  • Any files that were 'moved' can be imported, as long as you provide a mapping for the old/new directories.
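
As a rough illustration of what such an old/new directory mapping could look like, here is a minimal Python sketch. It is not SAWHtoGit's actual mapping format or code; the function name `remap_path` and the example `$/App/...` paths are hypothetical.

```python
from typing import Dict


def remap_path(path: str, moves: Dict[str, str]) -> str:
    """Rewrite a historical path to its current location.

    `moves` maps an old directory to its new directory (hypothetical format).
    The longest matching old directory wins, and only whole path segments
    match, so "$/App/OldUI" does not capture "$/App/OldUIExtra".
    """
    for old in sorted(moves, key=len, reverse=True):
        old_norm = old.rstrip("/")
        if path == old_norm or path.startswith(old_norm + "/"):
            return moves[old].rstrip("/") + path[len(old_norm):]
    # No mapping applies: the file never moved.
    return path


# Hypothetical example: the OldUI folder was moved under Web/UI.
moves = {"$/App/OldUI": "$/App/Web/UI"}
new_path = remap_path("$/App/OldUI/forms/main.cs", moves)
```

Applying the mapping while the history is being exported means older revisions of a moved file end up at the same path as its current revision.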

Other than that, it worked well for our purposes and we were able to successfully migrate our code and history to GitHub. I hope it will be useful to others, as well!

Chabazite answered 5/2, 2013 at 23:41 Comment(1)
Good feedback, in addition to my answer. +1 - Chert

The import part is easy:
Once you have extracted a coherent set of files from your initial repo, you can add it to the git repo, which will detect any modification/addition/removal.

"Coherent" = a set of files which represents a stable state, for instance one "which compiles". Those points in time are usually represented by a label, especially in a repo working at the file level like SAW (as opposed to git, which works at the repository level, each revision representing the content of the full repo).

Adding a set of files to git is as simple as:

git --work-tree=/path/to/extracted/file --git-dir=/path/to/git/repo/.git add -A
git --work-tree=/path/to/extracted/file --git-dir=/path/to/git/repo/.git commit -m "new revision from SAW import"
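
If there are many extracted snapshots, the same two commands can be repeated in a loop. The following is a minimal Python sketch of that idea, assuming each snapshot has already been extracted into its own directory; the helper names `import_commands` and `import_snapshots` and the commit message format are made up for illustration.

```python
import subprocess
from typing import Iterable, List, Tuple


def import_commands(work_tree: str, git_dir: str, message: str) -> List[List[str]]:
    """Build the two git invocations shown above for one extracted snapshot."""
    base = ["git", f"--work-tree={work_tree}", f"--git-dir={git_dir}"]
    return [
        base + ["add", "-A"],             # stage additions, changes and removals
        base + ["commit", "-m", message],
    ]


def import_snapshots(snapshots: Iterable[Tuple[str, str]], git_dir: str) -> None:
    """Commit each (label, directory) pair in order, oldest first.

    Note: `git commit` exits non-zero when a snapshot is identical to the
    previous one, so check=True will raise in that case.
    """
    for label, work_tree in snapshots:
        message = f"new revision from SAW import: {label}"
        for cmd in import_commands(work_tree, git_dir, message):
            subprocess.run(cmd, check=True)


# Inspect the commands for one snapshot without running them.
cmds = import_commands("/path/to/extracted/file", "/path/to/git/repo/.git", "demo")
```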

The difficulty is to determine what to import.
I would recommend listing all labels and using them to retrieve each project, as in GetProject -label (using the SAW CLI).

Note that each project should be in its own Git repo: that avoids one large, bloated repo that is hard to clone, unlike the centralized SAW model, where you can put all your projects in a single repository.


The OP Dan comments:

I was able to use the SourceAnywhere COM SDK to write a small utility to extract my history (as best as the SDK would allow) and write out a fast-import script to load it all into git.
While not every intermediary changeset is necessarily "coherent", the end result matches our current state, and we preserved the bulk of our history.
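
For readers new to the format, here is a minimal Python sketch of what writing one commit of a fast-import stream can look like. It is not the OP's utility: the function names, the sample user, and the sample file are hypothetical, timestamps are assumed to be UTC, and paths are assumed not to need quoting.

```python
from typing import Dict


def data_block(payload: bytes) -> bytes:
    """A fast-import 'data' command: byte count, then the raw bytes."""
    return b"data %d\n" % len(payload) + payload + b"\n"


def commit_block(ref: str, mark: int, name: str, email: str,
                 when: int, message: str, files: Dict[str, bytes]) -> bytes:
    """Serialize one changeset as a fast-import 'commit' command.

    `when` is a Unix timestamp (UTC assumed). `files` maps repo-relative
    paths to their full content at this changeset. No 'from' line is
    emitted, so later commits on the same ref in the same stream continue
    from the current branch tip.
    """
    ident = f"{name} <{email}> {when} +0000".encode("utf-8")
    out = [
        f"commit {ref}\n".encode("utf-8"),
        f"mark :{mark}\n".encode("utf-8"),
        b"committer " + ident + b"\n",
        data_block(message.encode("utf-8")),
    ]
    for path, content in sorted(files.items()):
        out.append(f"M 100644 inline {path}\n".encode("utf-8"))
        out.append(data_block(content))
    out.append(b"\n")
    return b"".join(out)


# Hypothetical changeset with one file.
blk = commit_block("refs/heads/master", 1, "Dan", "dan@example.com",
                   1357538000, "initial import", {"src/a.txt": b"hello"})
```

Writing the concatenated blocks to a file and feeding that file to `git fast-import` on standard input, from inside an initialized repository, loads the history. The committer line is where SourceAnywhere users and check-in times would be mapped, and the commit message carries the check-in description.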

Chert answered 7/1, 2013 at 8:57 Comment(5)
Thanks! Sounds like you've done this before :) We don't use labels, but I think GetProjectHistory -prj $/ -v will give me 'coherent' timestamps I can use for GetProject -time. I would like to preserve/map users and checkin descriptions, though, and it seems like this approach wouldn't allow that? The comment, while shown in the GUI, doesn't seem to come across with the CLI. - Chabazite
@Chabazite Would this approach allow that? Not easily. Don't forget that git has no authentication (https://mcmap.net/q/21330/-distributed-version-control-systems-and-the-enterprise-a-good-mix-closed), so you attach one name (the committer name) to a commit, which is a revision of the full content of your project. If your GetProjectHistory -prj $/ -v shows you a state composed of multiple contributions, you won't be able to import it with multiple names attached to it. Remember: you import the whole content, not individual files. Git doesn't work file by file (as opposed to other, older centralized VCS: https://mcmap.net/q/13426/-what-are-the-basic-clearcase-concepts-every-developer-should-know-closed). - Chert
Thanks for your help! While I didn't use your solution exactly (GetProject -time didn't work as I thought), it pointed me in the right direction. I was able to use the SourceAnywhere COM SDK to write a small utility to extract my history (as best as the SDK would allow) and write out a fast-import script to load it all into git. While not every intermediary changeset is necessarily "coherent", the end result matches our current state, and we preserved the bulk of our history. I do plan to clean up my code and share it on GitHub in the future. - Chabazite
@Dan: Excellent. I have included your comment in the answer for more visibility. - Chert
Thanks! I finally posted my solution and added an answer with the details. Appreciate the pointers in the right direction! - Chabazite

You can look at how other git-over-XYZ implementations work. For example, here's the code for git-svn, and here's the code for git-cvsimport (both in Perl).

Nottage answered 7/1, 2013 at 6:2 Comment(0)
