How to backup a local Git repository?
M

8

171

I am using git on a relatively small project and I find that zipping the .git directory's contents might be a fine way to back up the project. But this is kind of weird because, when I restore, the first thing I need to do is git reset --hard.

Are there any problems with backing up a git repo this way? Also, is there any better way to do it (e.g., a portable git format or something similar?)?

Macey answered 24/1, 2010 at 22:40 Comment(3)
Why did nobody give the obvious answer of using git bundle???Bloom
@Bloom they did. Scroll down.Macey
All upvoted answers contain a wall of text about custom scripts, even the one that starts mentioning git bundleBloom
S
24

I started hacking away a bit on Yar's script and the result is on github, including man pages and install script:

https://github.com/najamelan/git-backup

Installation:

git clone "https://github.com/najamelan/git-backup.git"
cd git-backup
sudo ./install.sh

All suggestions and pull requests are welcome on GitHub.

#!/usr/bin/env ruby
#
# For documentation please see man git-backup(1)
#
# TODO:
# - make it a class rather than a function
# - check the standard format of git warnings to be conform
# - do better checking for git repo than calling git status
# - if multiple entries found in config file, specify which file
# - make it work with submodules
# - propose to make backup directory if it does not exists
# - depth feature in git config (eg. only keep 3 backups for a repo - like rotate...)
# - TESTING



# allow calling from other scripts
def git_backup


# constants:
git_dir_name    = '.git'          # just to avoid magic "strings"
filename_suffix = ".git.bundle"   # will be added to the filename of the created backup


# Test if we are inside a git repo
`git status 2>&1`

if $?.exitstatus != 0

   puts 'fatal: Not a git repository: .git or at least cannot get zero exit status from "git status"'
   exit 2


else # git status success

   # walk up until we find the .git directory or reach the filesystem root
   until        File::directory?( Dir.pwd + '/' + git_dir_name )             \
            or  Dir.pwd == '/'


         Dir.chdir( '..' )
   end


   unless File::directory?( Dir.pwd + '/.git' )

      raise( 'fatal: Directory still not a git repo: ' + Dir.pwd )

   end

end


# git-config --get of version 1.7.10 does:
#
# if the key does not exist git config exits with 1
# if the key exists twice in the same file   with 2
# if the key exists exactly once             with 0
#
# if the key does not exist       , an empty string is sent to stdout
# if the key exists multiple times, the last value  is sent to stdout
# if exactly one key is found once, its value       is sent to stdout
#


# get the setting for the backup directory
# ----------------------------------------

directory = `git config --get backup.directory`


# git config adds a newline, so remove it
directory.chomp!


# check exit status of git config
case $?.exitstatus

   when 1 then directory = Dir.pwd[ /(.+)\/[^\/]+/, 1]

            puts 'Warning: Could not find backup.directory in your git config file. Please set it. See "man git config" for more details on git configuration files. Defaulting to the same directory your git repo is in: ' + directory

   when 2 then puts 'Warning: Multiple entries of backup.directory found in your git config file. Will use the last one: ' + directory

   else     unless $?.exitstatus == 0 then raise( 'fatal: unknown exit status from git-config: ' + $?.exitstatus.to_s ) end

end


# verify directory exists
unless File::directory?( directory )

   raise( 'fatal: backup directory does not exist: ' + directory )

end


# The date and time prefix
# ------------------------

prefix           = ''
prefix_date      = Time.now.strftime( '%F'       ) + ' - ' # %F = YYYY-MM-DD
prefix_time      = Time.now.strftime( '%H:%M:%S' ) + ' - '
add_date_default = true
add_time_default = false

prefix += prefix_date if git_config_bool( 'backup.prefix-date', add_date_default )
prefix += prefix_time if git_config_bool( 'backup.prefix-time', add_time_default )



# default bundle name is the name of the repo
bundle_name = Dir.pwd.split('/').last

# set the name of the file to the first command line argument if given
bundle_name = ARGV[0] if( ARGV[0] )


bundle_name = File::join( directory, prefix + bundle_name + filename_suffix )


puts "Backing up to bundle #{bundle_name.inspect}"


# git bundle will print its own error messages if it fails
`git bundle create #{bundle_name.inspect} --all --remotes`


end # def git_backup



# helper function to call git config to retrieve a boolean setting
def git_config_bool( option, default_value )

   # get the setting for the prefix-time from git config
   config_value = `git config --get #{option.inspect}`

   # check exit status of git config
   case $?.exitstatus

      # when not set take default
      when 1 then return default_value

      when 0 then return !( config_value =~ /(false|no|0)/i )

      # double quotes are needed here so that #{...} is interpolated
      when 2 then puts "Warning: Multiple entries of #{option.inspect} found in your git config file. Will use the last one: " + config_value
               return !( config_value =~ /(false|no|0)/i )

      else     raise( 'fatal: unknown exit status from git-config: ' + $?.exitstatus.to_s )

   end
end

# function needs to be called if we are not included in another script
git_backup if __FILE__ == $0
Serving answered 5/5, 2012 at 3:45 Comment(3)
@Yar Great bundle script, based on the git bundle I advocated for in my answer below. +1.Diazotize
I already installed your app in my local bare repository... how do you use it once it's installed? There's no info about that in the documentation; you should include a section with an example of how to make a backup.Hippodrome
Hi, sorry you don't get it to work. Normally you run sudo install.sh, then configure it (it uses git config system) to set the destination directory (see the readme file on github). Next you run git backup inside your repository. As a sidenote, this was an experiment with git bundle and a response to this question, but git bundle never makes an absolute exact copy (eg. if i recall well, especially concerning git remotes), so personally I actually use tar to backup .git directories.Serving
D
161

The other official way would be using git bundle

That will create a file that supports git fetch and git pull to update your second repo.
Useful for incremental backup and restore.
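
A minimal sketch of such an incremental workflow, assuming a throwaway repository in a temp directory. The tag name last-backup, the file names, and the paths are made up for illustration; using a tag as a bookmark for the last bundled commit follows the approach described in the git bundle documentation.

```shell
#!/bin/sh
set -e

# Throwaway repository with one commit (all names here are illustrative)
work=$(mktemp -d)
git init -q "$work/repo"
cd "$work/repo"
git config user.email "you@example.com"
git config user.name "Example User"
echo one > file.txt
git add file.txt
git commit -q -m "first commit"
branch=$(git symbolic-ref --short HEAD)

# Full bundle once, then remember what has been backed up with a tag
git bundle create "$work/full.bundle" --all
git tag -f last-backup "$branch"

# More work happens...
echo two >> file.txt
git commit -q -am "second commit"

# Incremental bundle: only the commits made since the last backup
git bundle create "$work/incr.bundle" last-backup.."$branch"
git tag -f last-backup "$branch"

# Restore: clone the full bundle, then fast-forward from the incremental one
git clone -q "$work/full.bundle" "$work/restored"
cd "$work/restored"
git pull -q --ff-only "$work/incr.bundle" "$branch"
git log --oneline
```

The incremental bundle only applies to a repository that already contains the commit tagged last-backup, which is why the full bundle has to be restored first.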

But if you need to back up everything (because you do not have a second repo with some older content already in place), the backup is a bit more elaborate, as mentioned in my other answer, after Kent Fredric's comment:

$ git bundle create /tmp/foo master
$ git bundle create /tmp/foo-all --all
$ git bundle list-heads /tmp/foo
$ git bundle list-heads /tmp/foo-all

(It is an atomic operation, as opposed to making an archive from the .git folder, as commented by fantabolous)
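
For a full backup, a rough end-to-end sketch could look like the following. It assumes a scratch repository under a temp directory (the paths and file names are placeholders, not anything from the answer above), bundles all refs, checks the result with git bundle verify, and restores it with git clone.

```shell
#!/bin/sh
set -e

# Scratch repository to back up (illustrative only)
work=$(mktemp -d)
git init -q "$work/repo"
cd "$work/repo"
git config user.email "you@example.com"
git config user.name "Example User"
echo hello > README
git add README
git commit -q -m "initial commit"

# Back up every ref into a single file
git bundle create "$work/repo.bundle" --all

# Check that the bundle is valid and see which refs it contains
git bundle verify "$work/repo.bundle"
git bundle list-heads "$work/repo.bundle"

# Restore: a bundle can be cloned like any other repository
git clone -q "$work/repo.bundle" "$work/restored"

# The clone's origin points at the bundle file
git -C "$work/restored" remote get-url origin
```

Since origin in the restored clone refers to the bundle file, you would probably want to point it somewhere else (git remote set-url) before continuing to work in it.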


Warning: I wouldn't recommend Pat Notz's solution, which is cloning the repo.
Backing up many files is always trickier than backing up or updating... just one.

If you look at the edit history of the OP Yar's answer, you will see that Yar first used a clone --mirror, ... with the edit:

Using this with Dropbox is a total mess.
You will have sync errors, and you CANNOT ROLL A DIRECTORY BACK IN DROPBOX.
Use git bundle if you want to back up to your dropbox.

Yar's current solution uses a git bundle.

I rest my case.

Diazotize answered 24/1, 2010 at 23:0 Comment(13)
I just checked this and it's actually great. I'll have to try some bundling and unbundling and list-heads to be convinced... but I like it quite a bit. Thanks again, especially for the notes on the --all switch.Macey
Somewhat related, is there anything wrong with just zipping my local repository? I need a single backup file, copying thousands of files on a external drive is incredibly slow. I'm just wondering if there is something more efficient because zip has to archive so many files in the .git folder.Bowing
@faB: the only difference is that you can easily do incremental backup with git bundle. It is not possible with a global zip of the all local repo.Diazotize
@Diazotize Git bundle supports incremental backup and restore? I thought it makes one big file... is this wrong?Macey
@yar: it can also make a small file with only latest changes from x (x being a tag, a date, n revision in a branch, ...). git bundle create mybundle --since=10.days master for instance only backup master branch for the last 10 days, allowing you to fetch from that bundle once you are on the destination workstation where a repo already exist.Diazotize
Thanks for that, I thought perhaps there was a way for git bundle to actually be aware of the last backup.Macey
Thanks @VonC, love the new answer on this one.Macey
But can you push incremental updates to a bundle file and add it as a remote? You can do this to a mirrored bare repo. Though a bundle may be more suitable for uploading to dropbox or giving somebody on a thumb drive, it may be less suitable for other purposes like a personal backup of a large project on a remote system where you don't want to push 100s of megs of data to save the days changes and don't want to mess with multiple incremental patch bundles.Carlisle
@ShadowCreeper I agree. Incremental bundle are possible but tricky to manage.Diazotize
Hey @Diazotize hope you don't mind, I've changed the accepted answer to promote new input on SO. Hope you're well!Macey
@Yar I don't mind at all. Great new input. Great choice. And happy new year :)Diazotize
Replying to an old comment, but another difference between bundle and zipping the dir is bundle is atomic, so it won't get messed up if somebody happens to update your repo in the middle of the operation.Aurlie
@Aurlie good point. I have included it in the answer for more visibility.Diazotize
M
68

The way I do this is to create a remote (bare) repository (on a separate drive, USB Key, backup server or even github) and then use push --mirror to make that remote repo look exactly like my local one (except the remote is a bare repository).

This will push all refs (branches and tags) including non-fast-forward updates. I use this for creating backups of my local repository.

The man page describes it like this:

Instead of naming each ref to push, specifies that all refs under $GIT_DIR/refs/ (which includes but is not limited to refs/heads/, refs/remotes/, and refs/tags/) be mirrored to the remote repository. Newly created local refs will be pushed to the remote end, locally updated refs will be force updated on the remote end, and deleted refs will be removed from the remote end. This is the default if the configuration option remote.<remote>.mirror is set.

I made an alias to do the push:

git config --add alias.bak "push --mirror github"

Then, I just run git bak whenever I want to do a backup.
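
A small sketch of the whole setup, assuming the backup lives in a local bare repository instead of GitHub. The remote name backup, the paths, and the tag are placeholders for illustration. The final git fsck follows the suggestion in the comments to check the mirrored objects from time to time.

```shell
#!/bin/sh
set -e

# Scratch project plus an empty bare repository acting as the backup target
work=$(mktemp -d)
git init -q "$work/project"
git init -q --bare "$work/backup.git"

cd "$work/project"
git config user.email "you@example.com"
git config user.name "Example User"
echo data > notes.txt
git add notes.txt
git commit -q -m "initial commit"
git tag v0.1

# Register the bare repository as a remote and define the alias
git remote add backup "$work/backup.git"
git config alias.bak "push --mirror backup"

# Mirror all refs (branches, tags, ...) to the backup
git bak

# Check the integrity of the backup repository
git -C "$work/backup.git" fsck
```

Because --mirror also deletes refs on the remote that no longer exist locally, the backup always tracks the current state of the local repository rather than keeping old branches around.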

Microtome answered 24/1, 2010 at 23:32 Comment(4)
+1. Agreed. git bundle is nice to move a backup around (one file). But with a drive you can plug anywhere, the bare repo is fine too.Diazotize
+1 awesme, I'll look into this. Thanks for the examples, too.Macey
@Pat Notz, in the end I decided to go with your way of doing it, and I put an answer below here (score permanently held at zero :)Macey
Note that --mirror doesn't actually run any kind of verification on the objects it gets. You should probably run git fsck at some point to prevent corruption.Monas
M
35

[Just leaving this here for my own reference.]

My bundle script called git-backup looks like this

#!/usr/bin/env ruby
if __FILE__ == $0
        bundle_name = ARGV[0] if (ARGV[0])
        bundle_name = `pwd`.split('/').last.chomp if bundle_name.nil? 
        bundle_name += ".git.bundle"
        puts "Backing up to bundle #{bundle_name}"
        `git bundle create /data/Dropbox/backup/git-repos/#{bundle_name} --all`
end

Sometimes I use git backup and sometimes I use git backup different-name which gives me most of the possibilities I need.

Macey answered 1/2, 2010 at 13:39 Comment(7)
+1 Because you didn't use the --global option this alias will only be seen in your project (it's definded in your .git/config file) -- that's probably what you want. Thanks for the more detailed and nicely formatted answer.Microtome
@yar: do you know how to accomplish these tasks without the commandline and instead only use tortoisegit (am searching for solution for my non-command-line-windoze users)?Vacillating
@pastacool, sorry I don't know about git without the command-line at all. Perhaps check out a relevant IDE like RubyMine?Macey
@intuited, you can roll back DIRECTORIES with spideroak, or just files (which the Dropbox does and they give you 3GB of space)?Macey
@Yar: not sure I understand.. do you mean that if I delete a Dropbox-backed directory, I lose all previous revisions of the files contained in it? More info on spideroak's versioning policies is here. TBH I haven't really used SpiderOak much, and am not totally sure of its limits. It does seem like they would have provided a solution for such problems though, they place a lot of emphasis on technical competence. Also: does Dropbox still have a 30-day limit on rollbacks for free accounts?Tyrelltyrian
@intuited, if you delete a dropbox-backed directory, you can restore its files one by one. It's been a while since I checked, though. Not sure what the limits are for packrat on free accounts, but it doesn't matter: the game is won and Dropbox is crushing everybody else. They're technical enough to make it.Macey
@Yar: I'm pretty sure that SpiderOak works the same way WRT restoring files. I considered Dropbox but decided against them based on the fact that their TOS states that they will delete accounts without warning for suspected copyright infringement. SpiderOak uses client-side encryption and so is incapable of any such transgression.Tyrelltyrian
A
9

Both answers to this question are correct, but I was still missing a complete, short solution to back up a GitHub repository into a local file. The gist is available here; feel free to fork it or adapt it to your needs.

backup.sh:

#!/bin/bash
# Back up the repositories indicated on the command line
# Example:
# bin/backup user1/repo1 user1/repo2
set -e
for i in "$@"; do
  FILENAME=$(echo "$i" | sed 's/\//-/g')
  echo "== Backing up $i to $FILENAME.bak"
  git clone "git@github.com:$i" "$FILENAME.git" --mirror
  cd "$FILENAME.git"
  git bundle create "../$FILENAME.bak" --all
  cd ..
  # Remove the temporary mirror clone (named after FILENAME, not $i)
  rm -rf "$FILENAME.git"
  echo "== Repository saved as $FILENAME.bak"
done

restore.sh:

#!/bin/bash
# Restore the repository indicated on the command line
# Example:
# bin/restore filename.bak
set -e

# Strip only a trailing ".bak" from the file name
FOLDER_NAME=$(echo "$1" | sed 's/\.bak$//')
git clone --bare "$1" "$FOLDER_NAME.git"
Actuate answered 17/7, 2015 at 16:39 Comment(2)
Interesting. More precise than my answer. +1Diazotize
Thanks, this is useful for Github. The accepted answer is to the current question.Macey
B
9

Found the simple official way after wading through the walls of text above that would make you think there is none.

Create a complete bundle with:

$ git bundle create <filename> --all

Restore it with:

$ git clone <filename> <folder>

This operation is atomic AFAIK. Check official docs for the gritty details.

Regarding "zip": git bundles are compressed and surprisingly small compared to the .git folder size.

Bloom answered 2/6, 2020 at 12:8 Comment(1)
This doesn’t answer the whole question about zip and also assumes we’ve read the other answers. Please fix it so it’s atomic and handles the whole question and I’m glad to make it accepted answer (10 years later). ThanksMacey
F
6

You can back up the git repo with git-copy. git-copy saves the project as a new bare repo, which keeps the storage cost to a minimum.

git copy /path/to/project /backup/project.backup

Then you can restore your project with git clone

git clone /backup/project.backup project
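
Note that git copy is not part of stock Git (see the comment below). A comparable result can probably be had with built-in commands only; the sketch below assumes that a plain git clone --bare is close enough to what git-copy does, which is an assumption rather than something checked against that tool. All paths and file names are placeholders.

```shell
#!/bin/sh
set -e

# Scratch project to back up (illustrative only)
work=$(mktemp -d)
git init -q "$work/project"
cd "$work/project"
git config user.email "you@example.com"
git config user.name "Example User"
echo "print('hi')" > app.txt
git add app.txt
git commit -q -m "initial commit"

# Back up: a bare clone has no working tree, only the repository data
git clone -q --bare "$work/project" "$work/project.backup"

# Restore: clone the bare backup into a normal working copy
git clone -q "$work/project.backup" "$work/project-restored"
ls "$work/project-restored"
```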
Foreland answered 3/6, 2015 at 3:45 Comment(1)
Argh! this answer made me believe "git copy" was an official git command.Bloom
J
0

I came to this question via Google.

Here is what I did, in the simplest way.

git checkout branch_to_clone

then create a new git branch from this branch

git checkout -b new_cloned_branch
Switched to branch 'new_cloned_branch'

Come back to the original branch and continue:

git checkout branch_to_clone

Assuming you screwed up and need to restore something from the backup branch:

git checkout new_cloned_branch -- <filepath>  #notice the space before and after "--"

Best part: if anything is screwed up, you can just delete the source branch and move back to the backup branch!

Jaimiejain answered 24/9, 2015 at 10:20 Comment(2)
I like this approach - but I'm unsure if it's best practice? I make 'backup' git branches quite often, and eventually I'll have many backup branches. I'm unsure if this is okay or not (having ~20 backup branches from different dates). I guess I could always delete the older backups eventually - but if I want to keep them all - is that okay? So far it's playing nicely - but would be nice to know if it's good or bad practice.Finback
It's not something that would be called best practice; I assume it's more a matter of one's individual habits. I generally code in one branch only until the job is done and keep another branch for ad hoc requests. Both have backups; once done, delete the main branch! :)Jaimiejain
