Converting big bzr repository to git, what to expect?

I'm trying to convert some old Bazaar repositories to git, and while everything seems to go through smoothly, I'm not sure it really went as well as the tools claim.

My Bazaar repository is structured like so:

  • repo
    • trunk
    • prod
    • feature/feature-branchX
    • feature/feature-branchY

I'm using the fast-export/fast-import method for migrating between bzr and git.

Initially, I migrate the "trunk", with --export-marks, like so:

bzr fast-export --export-marks=../$1/marks.bzr ../$1/trunk | git fast-import --export-marks=../$1/marks.bzr --export-marks=../$1/marks.git

With $1 being the name of the

then iterate all other folders in the "repo" directory and call:

bzr fast-export --marks=../$1/marks.bzr  --git-branch=$nick ../$1/$b/.. | git fast-import --import-marks=../$1/marks.git --export-marks=../$1/marks.git

with $nick being the branch nickname of bzr, and $1/$b being the directory name of the branch.

As I said, it processes all the expected directories, but after completion, when I do a:

git branch

It shows just 20 something branches, where the original Bazaar repository had 80+.

Now, just looking at "master" in git, it seems to be all there, and the missing 60 branches could easily be branches that are already merged into trunk. I'm not sure the fast-export/fast-import tools are clever enough to say "bah - you won't need this", but maybe they are.

Does anyone have any experience with this?

Am I only supposed to be left with "master" and any branches that have unmerged commits in them after migrating from bzr to git?

Finally, for the sake of history, is there any way to force all branches to be converted over, even if they are technically defunct?

Tuberose answered 12/11, 2013 at 13:44 Comment(2)
Do you observe the same result with a script like gist.github.com/bloveridge/624941? - Homager
Maybe I'm not reading it right, but as I see it, that script also only expects to work on one branch. I.e. you can't pass the "repository" directory to it, only a checked out branch. Now, it may be that through some interaction I don't understand it actually checks out and reads all the other branches, but as I read it, it really doesn't. It makes sense too: if you prepare a project for migration, you close off everything else and migrate "trunk", but I can't stop all projects to get that done. - Tuberose

It seems the fast-import/export tools are indeed clever enough to say "bah - you won't need this". It's not rocket science: just as git branch -d knows when it's safe to delete a branch, git fast-import can tell that an incoming branch is a replica.
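To make the git branch -d comparison concrete, here is a minimal sketch of that kind of "is this branch fully contained in another one?" check, using git merge-base --is-ancestor. It only illustrates the ancestry test and is not a claim about what fast-import does internally. The repository, the branch names and the is_merged helper are all made up for the demo.

```sh
#!/bin/sh
# Succeeds (exit 0) when every commit on branch $1 is reachable from branch $2,
# i.e. $1 is fully merged into $2.
is_merged() {
    git merge-base --is-ancestor "$1" "$2"
}

# Throwaway repository; all names below are made up for the demo.
demo=$(mktemp -d) && cd "$demo" || exit 1
git init -q .
git symbolic-ref HEAD refs/heads/trunk
commit() {
    git -c user.name=demo -c user.email=demo@example.com \
        commit -q --allow-empty -m "$1"
}
commit "first"
git branch feature-x    # feature-x stays at "first"
commit "second"         # trunk moves one commit ahead

is_merged feature-x trunk && echo "feature-x is contained in trunk"
is_merged trunk feature-x || echo "trunk has commits that feature-x lacks"
```

In a converted repository, the same helper could be pointed at master and any branch you are unsure about.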

But probably you'd like to be really sure, and I agree. I put together a simple (if inefficient) script to find the list of unique bzr branches:

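#!/bin/sh
# List bzr branches that are not fully merged into any other branch.
# `bzr missing` exits 0 when it finds no missing revisions.

paths=$(bzr branches -R)

for path1 in $paths; do
    merged=
    for path2 in $paths; do
        test "$path1" = "$path2" && continue
        # is path1 part of path2 ? (path1 has no revisions that path2 lacks)
        if bzr missing -d "$path1" "$path2" --mine >/dev/null; then
            # is path2 part of path1 ? (if so, the two are identical)
            if bzr missing -d "$path1" "$path2" --other >/dev/null; then
                echo "# $path1 == $path2"
            else
                merged=1
                break
            fi
        fi
    done
    test "$merged" || echo "$path1"
done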

Run this inside a Bazaar shared repository. It finds all branches, and then compares each branch against all the others. If A is contained in B, there are two possibilities: either B is also contained in A, which means A == B, or A is really redundant.

The script filters out branches that are fully merged into at least one other branch. However, if there are multiple branches that are identical, it prints all of those, with additional lines starting with # to indicate that they are identical.

Your example commands with bzr fast-export ... | git fast-import ... seem to have some unnecessary options. Following the examples at the very end of bzr fast-export -h, I recommend using these steps instead:

  1. Create a brand new Git repo:

    git init /tmp/gitrepo
    
  2. Go inside your Bazaar shared repo:

    cd /path/to/bzr/shared/repo
    
  3. Migrate your main branch (trunk?) to be the master:

    bzr fast-export --export-marks=marks.bzr trunk/ | \
      GIT_DIR=/tmp/gitrepo/.git/ git fast-import --export-marks=marks.git
    
  4. Migrate all branches:

    bzr branches -R | while read -r path; do
        nick=$(basename "$path")
        echo "migrating $nick ..."
        # Append to the log so each iteration doesn't overwrite the previous one.
        bzr fast-export --import-marks=marks.bzr -b "$nick" "$path" | \
          GIT_DIR=/tmp/gitrepo/.git git fast-import --import-marks=marks.git \
          >>/tmp/migration.log 2>&1
    done
    

You may notice that the last step does not skip trunk, which you already migrated. That doesn't matter, as it won't be imported again anyway. Also note that even if branchA is fully merged into branchB, it will be created in Git if it is seen first. If branchB is seen first, then branchA won't be created in Git ("bah - you won't need this").
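One more thing about step 4: basename keeps only the last path component, so a branch stored under feature/feature-branchX ends up as plain feature-branchX in Git. If you would rather keep the directory layout in the branch names, something like the following could replace the basename call. This is a rough sketch; branch_name is a hypothetical helper, and I am assuming the paths may come with a leading "./" or a trailing "/".

```sh
#!/bin/sh
# Hypothetical helper: turn a path printed by `bzr branches -R` into a Git
# branch name, keeping the directory structure instead of only the last part.
branch_name() {
    name=${1#./}      # drop a leading "./" if present
    name=${name%/}    # drop a trailing "/" if present
    printf '%s\n' "$name"
}

branch_name feature/feature-branchX     # prints feature/feature-branchX
basename feature/feature-branchX        # prints feature-branchX
```

In the loop you would then write nick=$(branch_name "$path") instead of nick=$(basename "$path").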

I could not find a way to force creating identical branches when importing to Git. I don't think it's possible.

Mesozoic answered 15/11, 2013 at 21:11 Comment(6)
Thanks, sounds promising. Will try it out. The extra parameters to fast-export / fast-import were something I found somewhere else. Apparently, you must --export-marks from your "main" bzr branch, and then reuse those marks in all subsequent exports so that fast-import doesn't create new commits out of what is really the same blob, or something. Does that make sense? - Tuberose
@Tuberose Yes, using the marks is definitely a good idea, but you were not using them as written in the docs. I amended my answer with steps following the docs more closely. - Mesozoic
I'm getting this error at step 4: migrating . ... bzr: ERROR: unknown command "fast-export". How do I fix it? - Cystotomy
Issue fixed after sudo apt-get install bzr-fastimport. - Cystotomy
A way to preserve all branches while using this answer is to tag them all before the conversion: for d in $(bzr branches); do cd $d; bzr tag b_$d; cd -; done. This makes a tag prefixed with "b_" for each branch. After the conversion, you can turn those tags into git branches with while read tag_name; do git branch ${tag_name:2} $tag_name; done < <(git tag -l | grep ^b_). When recreating the branches from the tags, the ones that already exist from the conversion will just show an error (harmlessly) that they already exist. - Cotenant
If your bzr branches are organized in subdirectories of your bzr repo, in step 4 you might want to change nick=$(basename $path) to nick=$path to maintain that structure for the git branches. - Cotenant
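
The Git half of the tag-based workaround from the comments can also be written without bash-only syntax. The following is a sketch only: the b_ prefix comes from the comment, the function name branches_from_tags is made up, and it assumes the b_ tags survived the conversion as ordinary Git tags.

```sh
#!/bin/sh
# Create a branch for every tag named b_<name>, pointing at the same commit.
# Branches that already exist after the conversion are left untouched.
branches_from_tags() {
    git for-each-ref --format='%(refname)' refs/tags |
    while read -r ref; do
        tag=${ref#refs/tags/}
        case $tag in
            b_*) ;;            # one of the branch marker tags
            *) continue ;;     # any other tag: ignore
        esac
        branch=${tag#b_}
        if git show-ref --verify --quiet "refs/heads/$branch"; then
            echo "exists:  $branch"
        else
            git branch "$branch" "$ref" && echo "created: $branch"
        fi
    done
}
```

Run branches_from_tags inside the converted Git repository. The b_ tags can be deleted afterwards with git tag -d if you don't want to keep them.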
