What's the fastest way to work in git over a slow network connection?

Here's the scenario: At work we have quite a few branches and we haven't kept the repo as tidy as we should, occasionally adding/removing large files or whatnot, and rarely removing dead branches.

So today is a snow day and I have to work from home. I have a slow VPN connection, and all I need is the fastest way to get to the one branch I care about and start working, with the ability to push commits back.

In SVN, I would have just updated the paths/files I needed and would have been working in no time. Like most git newbies, I only have a handful of trusted commands, and my fallbacks, git clone and git pull, are going to be too slow.

So it seems to be a two part question:

  1. How do I clone a repo to get working as quickly as possible, and
  2. How do I pull/push from this repo (edit, commit, pull, push)?

Working solution (per @g19fanatic's suggestions below):

> git clone -b <branchname> <repo_url> --depth=1
remote: Counting objects: 16679, done.
remote: Compressing objects: 100% (11926/11926), done.
remote: Total 16679 (delta 6936), reused 10919 (delta 3337)
Receiving objects: 100% (16679/16679), 628.12 MiB | 430 KiB/s, done.
Resolving deltas: 100% (6936/6936), done.
> git pull
Already up-to-date.

(make small change on other machine, commit/push)

> git pull
remote: Counting objects: 5, done.
remote: Compressing objects: 100% (5/5), done.
remote: Total 5 (delta 0), reused 0 (delta 0)

Excellent, this left me with a working repo.

The only issue is that it transferred twice as much data initially as the failed attempts below did, but it did leave the repo in a usable state. I'll consider this answered, but I think there could be improvement on the initial transfer size.
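A partial clone might shrink that initial transfer further. This is only a sketch, assuming both the Git client and the server are new enough to support --filter (roughly Git 2.19+, and not every host offers it):

# fetch commits and trees up front, but download file contents lazily
git clone -b <branchname> --depth=1 --filter=blob:none <repo_url>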

Monoploid answered 2/11, 2011 at 18:19
Still a pertinent issue today! But, indeed, still not a good answer: my repo is tidy, my connection is much worse, and I just need better async operations! Also, this could be greatly improved by following the site standards... just leaving comments and answers where they belong. – Gallium

Doing a little test here, I was able to do the following:

We have a shared --bare repo out on the network.

I have the remote set up with git remote add origin <pathToSharedRepo>, and then I do a git pull origin <branch> --depth=1 (note: a git pull, not a git fetch).

This successfully pulls only the "HEAD + depth" commits for this branch. I can make commits on top of this and then push them back out without issue (a plain git push works just fine).

To pull the new commits, and JUST the new commits, from the shared repo, I have to explicitly type git pull origin <branch>. Since this is the way I originally did the pull (explicitly), I have to do the same this time...

This should not pull down any more history than the depth you originally set.
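Putting the round trip together, here's a minimal sketch of the workflow above (the repo path and branch name are placeholders; the explicit HEAD:<branch> refspec on the push is just the unambiguous spelling, since a plain git push also worked in my test):

# one-time setup
mkdir newRepo && cd newRepo
git init
git remote add origin <pathToSharedRepo>

# shallow pull of just the branch you need (HEAD + 1 commit)
git pull origin <branch> --depth=1

# ...edit files, then commit and publish
git commit -am "describe the change"
git push origin HEAD:<branch>

# later: fetch only the new commits, not any extra history
git pull origin <branch>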


To be complete, you can also set a depth when you're cloning the repo:
git clone -b <branch> <pathToRepo> --depth=<numCommitsWanted>

Ribbing answered 2/11, 2011 at 18:48
My local repo is set up like the following: mkdir newRepo; cd newRepo, then git init, THEN the git remote add... – Ribbing
So I tried this, and it doesn't appear to be working in my case (see additional info in the question). You mention a --bare repo - does that have any bearing on this? Any ideas why it isn't pulling only the small set of deltas? – Monoploid
I just tried the whole process on an original repo that wasn't inited as a bare repo and I still got the same process to work without issue. I'm still only pulling down the initial branch + 1 commit of history; if I try to do a git pull origin <branch> RIGHT after the first pull with --depth=1, it states "Already up-to-date." When I do some commits on the original repo (pushed from somewhere else) and do a git pull origin <branch>, I only get the new commits and not any more history... – Ribbing
git clone -b <branch> file://<pathToRepo> --depth=1 managed to do the same thing as the above PLUS it automatically linked up the local repo with the remote one... give that one a try. – Ribbing
Ok, I'll definitely try this again when I get home tonight. Thanks again for your help! – Monoploid
git clone -b <branch> <repo_url> --depth=1 finally worked (see above). Not sure why the others didn't. Thanks! – Monoploid
As per my other comment, this is still far from enough, sadly. – Gallium

There are a few ways to reduce bandwidth:

  • clone only the branch you need (--single-branch, this one is only for cloning, not pulling; when pulling you can specify the branch you need)
  • clone only the most recent versions of files (--depth 1, just like you did). This one implies --single-branch by default.
  • clone only the files you need (a.k.a. sparse checkout; a sketch follows below)
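For the sparse checkout route, here is a minimal sketch, assuming a fairly recent Git (the git sparse-checkout command arrived in 2.25) and a server that supports partial clone; <dir> and <path/you/need> are placeholders:

# blobless partial clone: fetch commits and trees, defer file contents
git clone --filter=blob:none --no-checkout <repo_url> <dir>
cd <dir>

# restrict the working tree to the paths you actually need
git sparse-checkout init --cone
git sparse-checkout set <path/you/need>

# materialize only those files
git checkout <branch>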

Additionally, if pulling keeps failing, I use a bash script like this to keep retrying until it finishes successfully:

#!/bin/bash

# keep retrying until the pull exits with status 0
until git pull --depth=1 origin master; do    # <-- pull command goes here
    echo "Pulling repository failed; retrying..."
done

Of course, before pulling, you'll need to initialize the repo first:

git init <dir>
cd <dir>
git remote add origin <repo_url>
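
If you also want a plain git pull (no arguments) to work afterwards, you can set the upstream once the branch exists locally. A one-liner sketch, assuming the branch is named master:

# make plain `git pull` / `git push` use origin/master by default
git branch --set-upstream-to=origin/master master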
Juanajuanita answered 1/2, 2019 at 4:50
Thanks for the notes -- I'll try some of these next time I have the need! – Monoploid

This "answer" is merely a historical record of my own failed attempts to solve this problem (though the clone operations below transferred less data than the current working solution.)


First failed attempt:

Part 1 seems to be best solved by:

Instead of git cloning the entire repo with all its branches and history, create a new, empty repo and fetch the one branch I care about with a depth of 1 (no history):

mkdir <project_name>
cd <project_name>
git init
git fetch --depth=1 <repo_url> <branchname>:refs/remotes/origin/<branchname>
git checkout <branchname>

This was great, as it performed a much smaller network transfer than a full git clone or pull would have.

But now I'm having problems with part 2: pulling and pushing from this shallow repository. My coworkers are making small updates throughout the day, as am I, so it should be possible to quickly pull and push these little incremental changes. But when I try to set up the branch to track the remote, git pull attempts to pull the full history. Even running pull or fetch with --depth 1 seems to want to transfer entire snapshots again (instead of little incremental changes).

So what can I do in such a situation? (Aside from the obvious - clean up the repo, removing old history items and dead branches.)
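
(In hindsight, much of this pain predates modern shallow-clone support: on newer Git versions, an incremental fetch from a shallow clone transfers only the new commits, and history can be extended later without re-cloning. A sketch, assuming Git 2.11+ for --deepen:)

git fetch origin <branchname>    # on modern Git, transfers only the new commits
git merge FETCH_HEAD             # or equivalently: git pull origin <branchname>
git fetch --deepen=10            # later: extend the shallow history by 10 commits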

Second failed attempt (per @g19fanatic's suggestions below):

Going with @g19fanatic's suggestion, I created a repo using

> mkdir <project_name>
> cd <project_name>
> git init
> git remote add origin <repo_url>
> git pull origin <branchname> --depth=1
remote: Counting objects: 9403, done.
remote: Compressing objects: 100% (6675/6675), done.
remote: Total 9403 (delta 2806), reused 7217 (delta 2136)
Receiving objects: 100% (9404/9403), 325.63 MiB | 206 KiB/s, done.
Resolving deltas: 100% (2806/2806), done.
...

This created a tracking branch and properly pulled only the history of the one branch (~9,400 objects and 325 MB, whereas the full repo is ~46k objects). However, again, I can't seem to git pull without transferring more data than should be necessary. I think I should be able to pull my coworkers' commits in just a few objects and a few kilobytes. But here's what I see:

> git pull origin <branchname>
remote: Counting objects: 45028, done.
remote: Compressing objects: ... ^C

This was going to pull down all the objects in the whole repo, so I killed it. I tried the pull with the --depth=1 argument:

> git pull origin <branchname> --depth=1
remote: Counting objects: 9870, done.
remote: Compressing objects: 100% (7045/7045), done.
Receiving objects:   4% (430/9870), 4.20 MiB | 186 KiB/s ^C

9k+ objects was going to be similar to the initial pull, but I let it run for a bit because I thought maybe some of those objects would already exist locally. However, after it had transferred 4+ MB, I killed the command because it appeared to be making the entire transfer again. Remember, I expect small updates from my coworkers, and I don't have time to pull 300 MB every time.

Monoploid answered 19/6, 2020 at 19:03
