git submodule update is slow. How can I debug why it's slow?
git

4

27

I'm using git submodules, and the git submodule update --init --recursive command is slow (8 seconds) even though there seem to be no changes to update.

I want to debug why it's slow, but there seems to be no --verbose switch. Any idea how to see what it's doing?

I'm running Ubuntu 14.04 and Git 1.9.

Encyclopedic answered 25/12, 2015 at 7:13 Comment(2)
git submodule update essentially checks out the submodule. One possibility for why it takes so long is that the submodule has a lot of binary files. Can you give us more information about what the submodule contains? – Shanon
Thank you for the comment. The issue is happening on my client environment, so I'll confirm this. – Encyclopedic
7

The way I debug the process of updating submodules:

GIT_TRACE=1 GIT_CURL_VERBOSE=1 git submodule update
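
If the trace is too noisy on the terminal, one option is to write it to a file and add timing data. The following is a minimal sketch, assuming a POSIX shell; the function name trace_submodule_update and the default log path are made up for illustration. GIT_TRACE accepts an absolute file path as well as 1, and GIT_TRACE_PERFORMANCE adds timing lines on Git versions that support it:

```shell
# Hypothetical helper: run the update with tracing appended to a log file.
# $1 is an absolute path for the log (the /tmp default is only illustrative).
trace_submodule_update() {
    log="${1:-/tmp/git-submodule-trace.log}"
    GIT_TRACE="$log" GIT_TRACE_PERFORMANCE="$log" \
        git submodule update --init --recursive
}
```

Run it from the superproject, for example trace_submodule_update "$PWD/trace.log", then read the log to see which git subcommands ran and where the time went.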
Emmet answered 21/3, 2022 at 13:10 Comment(0)
7

--progress option

Another thing worth trying is the --progress option which shows the usual percentage progress like git clone would:

git submodule update --progress

Related: How to show progress for submodule fetching?

htop, iotop and nethogs

You could also try these three commands, which monitor CPU, disk and network usage respectively, to try to determine the bottleneck.
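
Alongside those monitors, it can help to measure how long the whole update takes and then repeat the measurement for one submodule at a time. A minimal sketch, assuming a POSIX shell; the timed function is a hypothetical helper, not part of Git:

```shell
# Hypothetical helper: run any command and report wall-clock seconds on stderr.
timed() {
    start=$(date +%s)
    "$@"
    rc=$?
    end=$(date +%s)
    echo "took $((end - start))s: $*" >&2
    # Pass the wrapped command's exit status through unchanged.
    return "$rc"
}
```

For example, timed git submodule update --init --recursive measures the full run, and timed git submodule update --init -- path/to/submodule (the path is a placeholder) measures a single submodule.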

Jimjimdandy answered 22/9, 2022 at 13:31 Comment(0)
5

Since there is no change to actually check out and copy, that leaves two main root causes:

Caudate answered 26/12, 2015 at 7:43 Comment(0)
3

With Git 2.20 (Q4 2018), git submodule will be notably faster because "git submodule update" is being rewritten piece by piece in C.

See commit ee69b2a, commit 74d4731 (13 Aug 2018), and commit c94d9dc, commit f1d1571, commit 90efe59, commit 9eca701, commit ff03d93 (03 Aug 2018) by Stefan Beller (stefanbeller).
(Merged by Junio C Hamano -- gitster -- in commit 4d6d6ef, 17 Sep 2018)


Git 2.21 actually fixes a regression: "git submodule update" ought to use a single fetch job unless asked otherwise, but by mistake it used multiple jobs.

See commit e3a9d1a (13 Dec 2018) by Junio C Hamano (gitster).
(Merged by Junio C Hamano -- gitster -- in commit 4744d03, 18 Jan 2019)

submodule update: run at most one fetch job unless otherwise set

In a028a19 (fetching submodules: respect submodule.fetchJobs config option, 2016-02-29, Git v2.9.0-rc0), we made sure to keep the default behavior of fetching at most one submodule at once when not setting the newly introduced submodule.fetchJobs config.

This regressed in 90efe59 (builtin/submodule--helper: factor out submodule updating, 2018-08-03, Git v2.20.0-rc0). Fix it.
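
Since the default is a single fetch job, parallel fetching has to be requested explicitly, on a Git version recent enough to know the submodule.fetchJobs setting. A minimal sketch, assuming a POSIX shell; set_fetch_jobs is a hypothetical helper name and 4 is an arbitrary value:

```shell
# Hypothetical helper: store a parallel-fetch default in the current
# repository's config. $1 is the number of jobs.
set_fetch_jobs() {
    git config submodule.fetchJobs "$1"
}
```

After set_fetch_jobs 4, later submodule fetches in that repository may run up to four at a time. For a single invocation, git submodule update --init --recursive --jobs 4 requests the same thing without touching the config.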


And Git 2.21 fixes the core.worktree setting in a submodule repository, which should not be pointing at a directory when the submodule loses its working tree (e.g. getting deinit'ed), but the code did not properly maintain this invariant.

See commit 8eda5ef, commit 820a647, commit 898c2e6, commit 98bf667 (14 Dec 2018) by Stefan Beller (stefanbeller).
(Merged by Junio C Hamano -- gitster -- in commit 3942920, 18 Jan 2019)

submodule deinit: unset core.worktree

When a submodule is deinit'd, the working tree is gone, so the setting of core.worktree is bogus.
Unset it.
As we covered the only other case in which a submodule loses its working tree in the earlier step (i.e. switching branches of top-level project to move to a commit that did not have the submodule), this makes the code always maintain core.worktree correctly unset when there is no working tree for a submodule.

This re-introduces 984cd77 (submodule deinit: unset core.worktree, 2018-06-18, Git v2.19.0-rc0), which was reverted as part of f178c13 (Revert "Merge branch 'sb/submodule-core-worktree'", 2018-09-07, Git v2.19.0)

The whole series was reverted as the offending commit e983175 (submodule: ensure core.worktree is set after update, 2018-06-18, Git v2.19.0-rc0) was relied on by other commits such as 984cd77.

Keep the offending commit reverted, but its functionality came back via 4d6d6ef (Merge branch 'sb/submodule-update-in-c', 2018-09-17), such that we can reintroduce 984cd77 now.


Git 2.21 also includes "git submodule update" learning to abort early when core.worktree for the submodule is not set correctly to prevent spreading damage.

See commit 5d124f4 (18 Jan 2019) by Stefan Beller (stefanbeller).
(Merged by Junio C Hamano -- gitster -- in commit e524e44, 07 Feb 2019)

git-submodule: abort if core.worktree could not be set correctly

74d4731 (submodule--helper: replace connect-gitdir-workingtree by ensure-core-worktree, 2018-08-13, Git 2.20) forgot to exit the submodule operation if the helper could not ensure that core.worktree is set correctly.


Warning: CVE-2019-1387: Recursive clones are currently affected by a vulnerability that is caused by too-lax validation of submodule names, allowing very targeted attacks via remote code execution in recursive clones.

Fixed in Git 2.24.1, 2.23.1, 2.22.2, 2.21.1, 2.20.2, 2.19.3, 2.18.2, 2.17.3, 2.16.6, 2.15.4 and 2.14.6.

In conjunction with a vulnerability that was fixed in v2.20.2, .gitmodules is no longer allowed to contain entries of the form submodule.<name>.update=!command.

submodule: reject submodule.update = !command in .gitmodules

Reported-by: Joern Schneeweisz
Signed-off-by: Jonathan Nieder
Signed-off-by: Johannes Schindelin

Since ac1fbbda2013 ("submodule: do not copy unknown update mode from .gitmodules", 2013-12-02, Git v1.8.5.1 -- merge), Git has been careful to avoid copying:

[submodule "foo"]
    update = !run an arbitrary scary command

from .gitmodules to a repository's local config, copying in the setting 'update = none' instead.

The gitmodules(5) manpage documents the intention:

The !command form is intentionally ignored here for security reasons

Unfortunately, starting with v2.20.0-rc0 (which integrated ee69b2a9 (submodule--helper: introduce new update-module-mode helper, 2018-08-13, first released in v2.20.0-rc0)), there are scenarios where we don't ignore it: if the config store contains no submodule.foo.update setting, the submodule-config API falls back to reading .gitmodules and the repository-supplied !command gets run after all.

This was part of a general change over time in submodule support to read more directly from .gitmodules, since unlike .git/config it allows a project to change values between branches and over time (while still allowing .git/config to override things).

But it was never intended to apply to this kind of dangerous configuration.

The behavior change was not advertised in ee69b2a9's commit message and was missed in review.

Let's take the opportunity to make the protection more robust, even in Git versions that are technically not affected: instead of quietly converting 'update = !command' to 'update = none', noisily treat it as an error.

Allowing the setting but treating it as meaning something else was just confusing; users are better served by seeing the error sooner.

Forbidding the construct makes the semantics simpler and means we can check for it in fsck (in a separate patch, see below).

As a result, the submodule-config API cannot read this value from .gitmodules under any circumstance, and we can declare with confidence

For security reasons, the '!command' form is not accepted here.

And:

fsck: reject submodule.update = !command in .gitmodules

Reported-by: Joern Schneeweisz
Signed-off-by: Jonathan Nieder
Signed-off-by: Johannes Schindelin

This allows hosting providers to detect whether they are being used to attack users using malicious 'update = !command' settings in .gitmodules.

Since ac1fbbda2013 ("submodule: do not copy unknown update mode from .gitmodules", 2013-12-02, Git v1.8.5.1 -- merge), in normal cases such settings have been treated as 'update = none', so forbidding them should not produce any collateral damage to legitimate uses.

A quick search does not reveal any repositories making use of this construct, either.

And:

submodule: defend against submodule.update = !command in .gitmodules

Signed-off-by: Jonathan Nieder
Signed-off-by: Johannes Schindelin

In v2.15.4, we started to reject submodule.update settings in .gitmodules. Let's raise a BUG if it somehow still made it through from anywhere but the Git config.
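
To check whether a .gitmodules file carries such an entry before updating, one can list its update settings and keep those whose value starts with "!". A minimal sketch, assuming a POSIX shell with awk available; find_command_updates is a hypothetical helper name:

```shell
# Hypothetical helper: print submodule.<name>.update entries whose value
# starts with "!". $1 is the file to inspect (defaults to ./.gitmodules).
find_command_updates() {
    file="${1:-.gitmodules}"
    # --get-regexp prints "key value" pairs; awk keeps values beginning with "!".
    git config -f "$file" --get-regexp '^submodule\..*\.update$' |
        awk '$2 ~ /^!/'
}
```

An empty result means no such entry was found in that file. This only inspects the file itself and is not a substitute for running a patched Git version.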


The "--recurse-submodules" option of various subcommands did not work well when run in an alternate worktree, which has been corrected with Git 2.25.2 (March 2020).

See commit a9472af, commit 129510a, commit 4eaadc8, commit 773c60a (21 Jan 2020) by Philippe Blain (phil-blain).
(Merged by Junio C Hamano -- gitster -- in commit ff5134b, 05 Feb 2020)

submodule.c: use get_git_dir() instead of get_git_common_dir()

Signed-off-by: Philippe Blain

Ever since df56607dff (git-common-dir: make "modules/" per-working-directory directory, 2014-11-30, Git v2.5.0-rc0), submodules in linked worktrees are cloned to $GIT_DIR/modules, i.e. $GIT_COMMON_DIR/worktrees/<name>/modules.

However, this convention was not followed when the worktree updater commands checkout, reset and read-tree learned to recurse into submodules.
Specifically, submodule.c::submodule_move_head, introduced in 6e3c1595c6 (update submodules: add submodule_move_head, 2017-03-14, Git v2.13.0-rc0) and submodule.c::submodule_unset_core_worktree, (re)introduced in 898c2e65b7 ("submodule: unset core.worktree if no working tree is present", 2018-12-14, Git v2.21.0-rc0 -- merge listed in batch #3) use get_git_common_dir() instead of get_git_dir() to get the path of the submodule repository.

This means that, for example, 'git checkout --recurse-submodules <branch>' in a linked worktree will correctly checkout <branch>, detach the submodule's HEAD at the commit recorded in <branch> and update the submodule working tree, but the submodule HEAD that will be moved is the one in $GIT_COMMON_DIR/modules/<name>/, i.e. the submodule repository of the main superproject working tree.
It will also rewrite the gitfile in the submodule working tree of the linked worktree to point to $GIT_COMMON_DIR/modules/<name>/.
This leads to an incorrect (and confusing!) state in the submodule working tree of the main superproject worktree.

Additionally, if switching to a commit where the submodule is not present, submodule_unset_core_worktree will be called and will incorrectly remove 'core.worktree' from the config file of the submodule in the main superproject worktree, $GIT_COMMON_DIR/modules/<name>/config.

Fix this by constructing the path to the submodule repository using get_git_dir() in both submodule_move_head and submodule_unset_core_worktree.
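
The difference between the two directories can be seen from the command line. A minimal sketch, assuming a POSIX shell and a Git version that supports linked worktrees; show_git_dirs is a hypothetical helper name:

```shell
# Hypothetical helper: print the per-worktree git dir and the shared common dir
# for whichever working tree the shell is currently in.
show_git_dirs() {
    echo "git-dir=$(git rev-parse --git-dir)"
    echo "git-common-dir=$(git rev-parse --git-common-dir)"
}
```

In the main working tree both lines name the same directory. In a linked worktree the first line points under worktrees/<name> while the second still names the shared repository, which is why using the wrong one affected the main worktree's submodule.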


Before Git 2.27 (Q2 2020), the "git submodule" command did not initialize a few variables it internally uses and was affected by variable settings leaked from the environment.

See commit 65d100c (02 Apr 2020) by Li Xuejiang (xuejiangLi).
(Merged by Junio C Hamano -- gitster -- in commit 27dd34b, 28 Apr 2020)

git-submodule.sh: setup uninitialized variables

Helped-by: Jiang Xin
Signed-off-by: Li Xuejiang

We have an environment variable jobs=16 defined in our CI system, and this environment makes our build job failed with the following message:

error: pathspec '16' did not match any file(s) known to git

The pathspec '16' for Git command is from the environment variable "jobs".

This is because "git submodule" command is implemented in shell script, and environment variables may change its behavior.

Set values for uninitialized variables, such as "jobs" and "recommend_shallow" will fix this issue.
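
The mechanism is easy to reproduce outside Git. The following is a simplified stand-in, not the actual git-submodule.sh code: two tiny scripts that read a variable named jobs, one without initializing it and one that resets it first, both started with jobs=16 in the environment. The variable names leaky_script and fixed_script are made up for illustration:

```shell
# Stand-in for a script that uses $jobs without initializing it.
leaky_script='echo "jobs=[${jobs}]"'
# Same script, but with the variable reset at the top.
fixed_script='jobs=
echo "jobs=[${jobs}]"'

# Run both with jobs=16 leaked in from the environment.
leaky_out=$(jobs=16 sh -c "$leaky_script")
fixed_out=$(jobs=16 sh -c "$fixed_script")

echo "without initialization: $leaky_out"
echo "with initialization:    $fixed_out"
```

The first script picks up the value from the environment, while the second one ignores it because the variable is reset before use.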

Caudate answered 22/9, 2018 at 0:57 Comment(4)
This doesn't really read like an answer, and it's still slow today :D – Herniotomy
@David天宇Wong What version of Git are you using? On which OS? I suspect that if the update remains slow, the cause could be the submodule's remote repository being slow to answer fetch requests. – Caudate
Latest version of Git on macOS. The remote is GitHub in my case. – Herniotomy
@David天宇Wong OK. Can you activate GIT_TRACE_PERFORMANCE to pinpoint what is slow in that command? – Caudate
