What are the drawbacks to setting git's submodule.recurse config option to true?

Asked 28/11, 2018 at 7:49 Answered 14/12, 2022 at 21:6

The question "Is there a way to make git pull automatically update submodules?" has an accepted answer that configures git like so:

git config --global submodule.recurse true

Like one of the comments to that answer, I'm wondering why this isn't the default behavior of git; more precisely, what are the drawbacks of setting this configuration option?

Wellordered answered 28/11, 2018 at 7:49 Comment(5)

IMO, you should not always want submodules to update automatically, as they may break your application while you are developing. I'd prefer to update them once I'm sure about my code. If this were the default behavior, it would probably give me a hard time debugging. – Avertin 4/12, 2018 at 23:19

@Avertin Could you elaborate how they could "break your application while developing"? Automatically updating submodules doesn't mean updating them to whatever the latest version is; it means updating them to the commit hashes specified by the parent repository. The submodules should be stable while you're developing. In contrast, not updating them could break things if the submodules become out-of-sync from what the parent repository expects. – Mediacy 5/6, 2021 at 22:3

@Mediacy What I meant was that if you have local changes you care about in your submodules, this kind of config could harm your local development because you would not know what a pull has done, whereas using an explicit flag when pulling makes sure you know what you are doing. This is IMO important because not all developers working on your repo will notice this kind of setting. – Avertin 6/6, 2021 at 23:39

I don't see how that's different for a submodule. Doing a pull on any repository risks breaking uncommitted local changes, doesn't it? – Mediacy 7/6, 2021 at 1:29

@Brewal: "not knowing what a pull have done" - I generally fetch instead of pull so I can see what would be pulled in. After the fetch I run a git log ..@{upstream} (I have a git unpulled alias that simplifies the command and does a custom one-line output) which shows the unpulled commits. After I understand what's about to happen, I'll git pull (actually merge to avoid a needless fetch) to complete the pull. – Wellordered 8/3, 2023 at 6:41
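
The fetch-then-inspect routine from the last comment can be sketched roughly as follows. This is a self-contained demo in throwaway directories; the repository names, the demo identity, and the definition of the unpulled alias are assumptions for illustration (the commenter's actual alias and output format are not shown).

```shell
# Throwaway sandbox: an "upstream" repository and a clone of it.
work=$(mktemp -d)
cd "$work"
git init -q upstream
git -C upstream -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "first"
git clone -q upstream clone

# Someone else adds a commit upstream after we cloned.
git -C upstream -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "second"

cd clone
# Hypothetical alias; the original one-line format is not known.
git config alias.unpulled 'log --oneline ..@{upstream}'

git fetch -q        # download new commits without touching the work tree
git unpulled        # list the commits a pull would bring in
before=$(git rev-list --count '..@{upstream}')

# Finish the "pull" with a merge, which avoids a second fetch.
git merge -q --ff-only '@{upstream}'
after=$(git rev-list --count '..@{upstream}')

echo "unpulled before merge: $before, after merge: $after"
```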

This option was introduced in commit 046b482, initially for commands that manipulate the working tree (read-tree/checkout/reset).

git grep/fetch/pull/push soon followed.
However, as the documentation mentions, unlike the other commands above, clone still needs its own recurse flag: git clone --recurse-submodules <URL> <directory>.
See this recent discussion:

This was a design decision once it was introduced, as the git clone might be too large.
Maybe we need to revisit that decision and just clone the submodules if submodule.recurse is set.

As the number and size of the submodules involved can be large, the default behavior is, for now, not to include them recursively.

The main drawback is the possible time overhead of recursing into each submodule (and their own nested submodules).
If you have many of them and don't need them all, it is best to leave that option off and specify --recursive when you need it.
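
As a hedged illustration of keeping the option scoped rather than global, the sketch below sets submodule.recurse for a single throwaway repository and overrides it for one invocation with git -c. The temporary directory and the variable names are assumptions made for the demo.

```shell
# Throwaway repository so that no global configuration is modified.
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"

# Enable recursion for this repository only (note: no --global).
git config submodule.recurse true
stored=$(git config --get submodule.recurse)

# Override the stored value for a single command with -c.
# "config --get" stands in here for checkout, pull, and so on.
one_off=$(git -c submodule.recurse=false config --get submodule.recurse)

echo "stored=$stored one_off=$one_off"
```

With the option left off, recursion can still be requested per command, for example with git pull --recurse-submodules or git submodule update --init --recursive.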

Yet, one advantage is to avoid seeing "Untracked files" when switching branches, as seen in this discussion.

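Warning: starting with Git 2.34 (Q4 2021), a git clone --recurse-submodules can make a plain git pull recurse into submodules even when git config --global submodule.recurse is not set: if submodule.stickyRecursiveClone is enabled, the clone sets submodule.recurse=true in the new repository.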

See "Is there a way to make git pull automatically update submodules?".

Symmetrize answered 4/12, 2018 at 22:53 Comment(3)

I think it's --recurse-submodules and not --recursive – Camellia 22/5, 2019 at 16:6

@Camellia I think I was referencing git-scm.com/docs/git-submodule#Documentation/… – Symmetrize 22/5, 2019 at 16:13

@Camellia But you are correct for git clone: It is --recurse-submodules indeed: git-scm.com/docs/git-clone#Documentation/… – Symmetrize 22/5, 2019 at 16:27

I have two reasons why I recently went back to recurse=false:

1. I have submodules (third-party libraries from GitHub) that my project needs, but they have submodules of their own that I don't need, namely several instances of gtest. I don't build tests for third-party libraries, so fetching them is a waste of time and space.
2. I had problems when a submodule's remote changed in the repo. This usually happens when I decide that a library needs some modification and pushing to upstream is infeasible. I didn't really investigate, as it is always easier to clone from scratch than to mess with submodules. recurse=false seems to have mitigated it, but again: I didn't investigate the problem fully.
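
For the second point, if the failure comes from a submodule URL that changed in .gitmodules (an assumption; the cause is not identified above), git submodule sync copies the new URL into the local configuration. The sketch below reproduces that in throwaway directories. The repository names, the fork path, and the demo identity are made up, and protocol.file.allow=always is passed only because newer Git versions restrict local-path submodules by default.

```shell
work=$(mktemp -d)
cd "$work"

# A stand-in for a third-party library.
git init -q upstream-lib
git -C upstream-lib -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "lib: initial commit"

# The project that vendors the library as a submodule at ./lib.
git init -q app
cd app
git -c protocol.file.allow=always submodule --quiet add "$work/upstream-lib" lib

# Later, .gitmodules is edited to point at a (hypothetical) fork.
git config -f .gitmodules submodule.lib.url "$work/lib-fork"

# The local configuration still holds the old URL until it is synced.
stale=$(git config --get submodule.lib.url)
git submodule --quiet sync
synced=$(git config --get submodule.lib.url)

echo "before sync: $stale"
echo "after sync:  $synced"
```

For the first point, git submodule update --init without --recursive populates only the first level of submodules, which may be enough to skip nested test dependencies.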

Idiocy answered 20/11, 2020 at 7:52 Comment(0)

Here's the thing: people will tend to take the defaults unless their workflow fails to operate correctly, and Git's complexity is imho entirely due to the workflow variety it supports.

Your workflow's going to vary wildly depending on whether you're carrying patches on a vendor base (and this also varies on whether that's in your toplevel repo or one or more submodules), what you're trying to do with your code (develop new features? test an upgrade? just plain fetch-and-build?), how the project's set up (are all submodules always required to build every config? splitting optional features out into separate histories can pay off bigtime), everything.

So calling defaults that do or don't match a workflow you're using a "downside" seems to me to miss the forest for the trees. No matter how the factory defaults are set, they're likely to be suboptimal for your workflow in at least some of your repos, precisely because of the vast variety of workflows Git serves. Make the defaults suit you, and somebody else is going to show up asking why they default to fetching everything recursively.

The only thing I can unambiguously call a downside of setting auto-recurse as a default is this: given that it's anybody's guess whether that setting will match the workflow in any particular repo, and the factory default has to be a guess, it's worse to guess the more expensive option.

It's easy to turn the auto-recurse config on for all your work, but if recursion were the default and you never had to think about it, you might waste an awful lot of time on clones you had no idea weren't needed at all. Local frontends or reference depots for large shared vendor histories are easy.

Disinterest answered 6/12, 2018 at 16:59 Comment(0)

When cloning or pulling a repository containing submodules the submodules will not be checked out by default; You can instruct clone to recurse into submodules. The init and update subcommands of git submodule will maintain submodules checked out and at an appropriate revision in your working tree. Alternatively you can set submodule.recurse to have checkout recursing into submodules.

Source: git-scm.com

The drawback of this option is the performance cost when cloning or pulling a repository that contains multiple submodules.

Gravelblind answered 11/12, 2018 at 10:19 Comment(0)

Well, it seems to connect to and fetch submodules even if the commit hashes already match, unnecessarily. So that kinda sucks.

Other than that it should be fine, because everything is pinned by commit.
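
One knob that may be relevant here is fetch.recurseSubmodules. According to the git-config documentation it defaults to on-demand, or to the value of submodule.recurse when that is set, so setting it explicitly should leave recursion on for work-tree commands while fetch goes back to on-demand behavior. The sketch below only records the two settings in a throwaway repository (the directory and variable names are assumptions); whether this removes the redundant connections described above was not verified.

```shell
# Throwaway repository so that no global configuration is modified.
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"

# Keep recursion for work-tree commands such as checkout...
git config submodule.recurse true
# ...but ask fetch to enter a submodule only when the superproject
# retrieves a commit that updates that submodule's reference.
git config fetch.recurseSubmodules on-demand

recurse=$(git config --get submodule.recurse)
fetch_mode=$(git config --get fetch.recurseSubmodules)
echo "submodule.recurse=$recurse fetch.recurseSubmodules=$fetch_mode"
```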

Hogwash answered 14/12, 2022 at 21:6 Comment(0)
