Why is git submodule not updated automatically on git checkout?

When switching branches with git checkout I would assume that most of the time you would want to update your submodules.

  • In what situation do you not want to update submodules after switching?
  • What would break if this was done automatically by git checkout?

Updated with example:

  • Branch A has submodule S at 3852f1
  • Branch B has submodule S at fd72d7

On branch A, running git checkout B results in a working copy of branch B with submodule S still at 3852f1 (so S shows up as modified). Running git submodule update then checks out S at fd72d7.
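
A minimal sketch that reproduces this in a throwaway directory. The repository names lib and super, the helper g, and the empty commits are illustrative assumptions and not part of the original setup. The -c protocol.file.allow=always override is only there because recent Git versions restrict local-path submodule clones by default.

```shell
#!/usr/bin/env bash
# Sketch: reproduce the stale-submodule state in a throwaway directory.
# All names here (lib, super, g) are illustrative assumptions.
set -euo pipefail

demo_root="$(mktemp -d)"
cd "$demo_root"

# Run git with a throwaway identity, no automatic recursion, and local-path
# submodule clones allowed (recent Git versions block them by default).
g() {
  git -c user.name=demo -c user.email=demo@example.com \
      -c submodule.recurse=false -c protocol.file.allow=always "$@"
}

# A library repository with two commits that the branches will pin.
git init -q lib
(cd lib && g commit -q --allow-empty -m "first" && g commit -q --allow-empty -m "second")
first="$(git -C lib rev-parse HEAD~1)"
second="$(git -C lib rev-parse HEAD)"

# Superproject: branch A pins S at $first, branch B pins S at $second.
git init -q super
cd super
g checkout -q -b A
g submodule --quiet add "$demo_root/lib" S
git -C S checkout -q "$first"
g add S
g commit -q -m "A: pin S at first"
g checkout -q -b B
git -C S checkout -q "$second"
g add S
g commit -q -m "B: pin S at second"

# Start from a consistent state on A.
g checkout -q A
g submodule --quiet update

# The behavior from the question: a plain checkout leaves S where it was.
g checkout -q B
after_checkout="$(git -C S rev-parse HEAD)"
g submodule --quiet update
after_update="$(git -C S rev-parse HEAD)"

[ "$after_checkout" = "$first" ] && echo "after git checkout B: S is still at A's commit"
[ "$after_update" = "$second" ] && echo "after git submodule update: S is at B's commit"
```

Running git status between the checkout and the update should list S as modified, which is the inconsistent state described above.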

Warfeld answered 14/12, 2009 at 9:5 Comment(1)
Related: stackoverflow.com/questions/4611512/…Devlin
K
53

git checkout --recurse-submodules was added in Git 2.13

This is mentioned in the release notes at: https://github.com/git/git/commit/e1104a5ee539408b81566066aaa6963cb87d5cd6#diff-c24776ff22455a30fbb78e378b7df0b0R139

The submodule.recurse option was added in Git 2.14

Set as:

git config --global submodule.recurse true

man git-config says:

Specifies if commands recurse into submodules by default. This applies to all commands that have a --recurse-submodules option. Defaults to false.

I feel that not updating submodules by default is a bad Git default that goes against most users' expectations and limits the adoption of submodules. I really wish the devs would change it.

submodule.recurse makes git fetch fetch all submodules every time

This makes the option basically unusably slow, because the fetch happens even when the submodules are up to date: Git still contacts each submodule's remote to look for any branch updates.

I do a lot of fetching to see my co-workers' commits, so I have to fetch often. And submodules tend to move much more slowly than the top-level repo, so fetching them all the time is not necessary.
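
One possible mitigation, offered as a sketch and not something I have verified on every Git version: the git-config documentation says fetch.recurseSubmodules defaults to on-demand, or to the value of submodule.recurse if that is set. That suggests setting fetch.recurseSubmodules explicitly should keep git fetch from recursing unconditionally while submodule.recurse still applies to checkout. The function name below is hypothetical, and the assumption that the explicit setting takes precedence is worth checking with your own Git.

```shell
#!/usr/bin/env bash
# Sketch (hypothetical helper name): recurse on checkout, but keep fetch at
# its documented default of only fetching submodules when the superproject
# records a new submodule commit.
# Assumption: an explicit fetch.recurseSubmodules overrides submodule.recurse
# for git fetch, as the git-config documentation suggests.
enable_recursion_except_fetch() {
  local scope="${1:---global}"   # pass --local to limit this to one repository
  git config "$scope" submodule.recurse true
  git config "$scope" fetch.recurseSubmodules on-demand
}
```

Call enable_recursion_except_fetch with no argument for the global config, or run enable_recursion_except_fetch --local inside a single repository to try it out first.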

So for now I stick to the Bash function workaround:

gco() { git checkout --recurse-submodules "$@"; }

This workaround is not perfect, as other commands can also change the checked-out commit, e.g. git rebase, so you end up having to define multiple aliases. I don't have a better solution for this right now:

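# Sync submodule URLs from .gitmodules, then check out the recorded commits.
gsuu() {
  git submodule sync --recursive &>/dev/null
  git submodule update --init --recursive --progress "$@"
}
# Rebase, then bring the submodules in line with the new HEAD.
grb() {
  git rebase "$@"
  gsuu
}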
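
A commonly suggested alternative to per-command wrappers is to let Git hooks run the update. The sketch below is an assumption-laden illustration, not something from the setup above: the function name is hypothetical, it writes into the repository's own hooks directory (ignoring any core.hooksPath setting), and whether post-checkout, post-merge and post-rewrite together cover every command that moves HEAD is something to verify for your workflow.

```shell
#!/usr/bin/env bash
# Sketch (hypothetical helper name): install hooks that update submodules
# after checkout, merge and rebase/amend in the given repository.
# Assumption: hooks live in <git-dir>/hooks, i.e. core.hooksPath is not set.
install_submodule_update_hooks() {
  local repo="${1:-.}" git_dir hook target
  git_dir="$(git -C "$repo" rev-parse --absolute-git-dir)" || return 1
  mkdir -p "$git_dir/hooks"
  for hook in post-checkout post-merge post-rewrite; do
    target="$git_dir/hooks/$hook"
    # Never overwrite a hook that is already there.
    if [ -e "$target" ]; then
      echo "skipping existing hook: $target" >&2
      continue
    fi
    cat > "$target" <<'EOF'
#!/bin/sh
# Bring submodules in line with the commit that is now checked out.
git submodule sync --recursive >/dev/null 2>&1
git submodule update --init --recursive
EOF
    chmod +x "$target"
  done
}
```

Hooks are local to each clone and are not shared through the repository, so every collaborator would have to run install_submodule_update_hooks in their own checkout.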
Krahmer answered 8/5, 2017 at 18:13 Comment(4)
submodule.recurse seems to work starting with git 2.14.Mantua
I have submodule.recurse set to true, but I find there are still times (jumping back and forth across the addition of a submodule, I think?) where I have to do git submodule update --init --recursive after my git checkout. Is there a way to make that happen automatically?Senator
Since there's a configuration setting, they should enable it by default and let anyone who (somehow) preferred manual updating opt-out instead.Syrup
@Syrup I agree, and I suggest that if you feel strongly enough about this, you could voice your opinion to the Git development community. There is also a recent patch that suggests setting submodule.recurse when git clone --recurse-submodules is used. You could voice your opinion on that suggestion on the mailing list also :)Cetinje
V
28

I believe that the submodules not updating automatically is in line with the development goals of Git. Git is meant to work in a distributed mode and doesn't presume that you are even able to connect to a non-local repository unless you explicitly tell it to. Git not auto-refreshing a submodule would be the expected behavior when thought of that way.

With that being said, if you know that you always want those sub-modules to be pulled in and you know that you would never branch off of those submodules to another local repository, then it shouldn't break anything if you automatically refreshed them after a checkout.

Venusian answered 14/12, 2009 at 9:13 Comment(6)
I think git checkout should just complain if the commit for a submodule was unavailable instead of leaving the working directory in an inconsistent state by default. Then you could do git submodule update to fetch the referenced commit. Again, normally the commit will be available and the checkout can be done without any network access. Accepting your answer since it sounds reasonable (but I don't like it ;)Warfeld
I second the notion that git should try to do a submodule init and update on the initial checkout and complain and show in status if a submodule exists which hasn't been pulled locally the first time. After you have it once the notion of needing to explicitly update makes sense because the repos are distinct and the submodule references a specific commit. But even in a distributed world where it might have been unavailable you will most likely want it at some point and git should let you know it was never pulled.Book
git fetch has an option to automatically fetch submodules, so likewise I think that checkout should have a similar option to automatically update/checkout submodules.Salo
"Git not auto-refreshing a submodule would be the expected behavior when thought of that way." - I don't see why. Indeed, I don't see how any of the points in your first paragraph have any relevance to the matter of whether Git should automatically attempt to update submodules when pulling.Topflight
Git doesn't presume that you have access to the repository you checked out from at all times when you want to perform work on the repository. Instead, once cloned, your repository can act completely on its own. If the starting assumption is that you only connect to foreign repositories when told to, then it follows that Git shouldn't auto-refresh a submodule, which is what I was trying to convey.Venusian
Right, but I would expect git fetch to make sure all contained submodules have fetched all of their commits that are referenced by all of the commits in the containing repository. If so, then any subsequent top-level git checkout would be able to check out the correct submodule commits without requiring network access.Senator
A
4

With Git 2.27 (Q2 2020), the "--recurse-submodules" option is better documented.

See commit acbfae3, commit 4da9e99, commit d09bc51, commit b3cec57, commit dd0cb7d (06 Apr 2020) by Damien Robert (damiens-robert).
(Merged by Junio C Hamano -- gitster -- in commit cc908db, 28 Apr 2020)

doc: --recurse-submodules mostly applies to active submodules

Signed-off-by: Damien Robert
Helped-by: Philippe Blain

The documentation refers to "initialized" or "populated" submodules, to explain which submodules are affected by '--recurse-submodules', but the real terminology here is 'active' submodules. Update the documentation accordingly.

Some terminology:

  • Active is defined in gitsubmodules(7), it only involves the configuration variables 'submodule.active', 'submodule.<name>.active' and 'submodule.<name>.url'.
    The function submodule.c::is_submodule_active checks that a submodule is active.
  • Populated means that the submodule's working tree is present (and the gitfile correctly points to the submodule repository), i.e. either the superproject was cloned with --recurse-submodules, or the user ran git submodule update --init, or git submodule init [<path>] and git submodule update [<path>] separately which populated the submodule working tree.
    This does not involve the 3 configuration variables above.
  • Initialized (at least in the context of the man pages involved in this patch) means both "populated" and "active" as defined above, i.e. what [git submodule update --init](https://git-scm.com/docs/git-submodule) does.

The --recurse-submodules option mostly affects active submodules.

An exception is git fetch where the option affects populated submodules.
As a consequence, in git pull --recurse-submodules the fetch affects populated submodules, but the resulting working tree update only affects active submodules.

In the documentation of git-pull, let's distinguish between the fetching part which affects populated submodules, and the updating of worktrees, which only affects active submodules.
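
To see which state a given submodule is in, git submodule status prefixes each line with - when the submodule is not initialized, + when the checked-out commit differs from the one recorded in the superproject, and U when there are merge conflicts. The sketch below turns those prefixes into labels. The function name is hypothetical, and note that the prefix reports initialization, which is not quite the same thing as "active" in the terminology above.

```shell
#!/usr/bin/env bash
# Sketch (hypothetical helper name): label each line of
# `git submodule status` by its leading character.
describe_submodule_status() {
  local line state sha path
  while IFS= read -r line; do
    case "$line" in
      -*) state="not initialized" ;;
      +*) state="checked-out commit differs from the recorded one" ;;
      U*) state="merge conflicts" ;;
      *)  state="matches the recorded commit" ;;
    esac
    # Drop the one-character prefix, then split "<sha> <path> ..." on spaces
    # (paths containing spaces are not handled in this sketch).
    read -r sha path _ <<< "${line#?}"
    printf '%s: %s\n' "$path" "$state"
  done
}

# Usage: git submodule status --recursive | describe_submodule_status
```

The activity settings themselves can be listed with git config --get-regexp '^submodule\.', which prints the submodule.<name>.active and submodule.<name>.url entries mentioned above when they are set.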


With Git 2.33 (Q3 2021), the documentation for submodule.recurse is clearer:

See commit 878b399 (16 Jul 2021) by Philippe Blain (phil-blain).
(Merged by Junio C Hamano -- gitster -- in commit c018818, 02 Aug 2021)

doc: clarify description of 'submodule.recurse'

Signed-off-by: Philippe Blain

The doc for 'submodule.recurse' starts with "Specifies if commands recurse into submodules by default".
This is not exactly true of all commands that have a '--recurse-submodules' option.
For example, 'git pull --recurse-submodules' does not run 'git pull' in each submodule, but rather runs 'git submodule update --recursive' so that the submodule working trees after the pull matches the commits recorded in the superproject.

Clarify that by just saying that it enables '--recurse-submodules'.

Note that the way this setting interacts with 'fetch.recurseSubmodules' and 'push.recurseSubmodules', which can have other values than true or false, is already documented since 4da9e99 ("doc: be more precise on (fetch|push).recurseSubmodules", 2020-04-06, Git v2.27.0-rc0 -- merge listed in batch #4).

git config now includes in its man page:

A boolean indicating if commands should enable the --recurse-submodules option by default. Applies to all commands that support this option.

Azide answered 1/5, 2020 at 21:33 Comment(0)
