How to guarantee git submodule is checking out a specific tag?

Here's my situation:

We're trying to convert a project we're working on to use submodules. While we were still testing, the submodules simply tracked the develop branch. Now I would like to switch every module from the develop branch to a specific release tag on the master branch. One of the modules also needed its remote URL changed, so I followed these instructions to modify the .gitmodules file and update the remote URL and branch; I also followed the comment that said I need to run git submodule update --init --recursive --remote from the root-level directory. Then I followed these instructions, going into each submodule and checking out the respective release tag.

This seemed to work fine, but when I ran git status, it complained about a detached HEAD (I suppose this is expected, since I checked out the tag and not the branch itself). I then git added all the submodules and committed to my local repo, but I have not pushed yet because I'm not sure the submodules are actually pointing to the desired tags.

So, how can I verify that when a coworker clones this project, it will pull the submodules at the state referred to by the release tags I have hopefully set? I looked in the .gitmodules file hoping to see a "tag" value, but found nothing. I also checked the individual .git/modules/sub_module_path/config files and found nothing there either.

How can I be sure the submodules will be cloned at the desired tag?

Duda answered 1/6, 2021 at 20:16 Comment(0)

Submodules never "point to tags" (that's not possible in general, as you've discovered: checking out a tag name just results in a detached HEAD). Note that a tag name is just a human-readable name for one specific commit. Any time you use a tag, you could just resolve the tag to the right commit hash ID, and then use the hash ID directly.

The point of submodules, as far as Git is concerned, is to be in detached HEAD mode. It's the superproject Git that says which commit to use. The superproject commit—the one actually checked out right now, in the superproject—lists the raw commit hash ID for each submodule. The superproject Git then does:

git -C path/to/submodule checkout <hash>

using the hash provided by the superproject. So that's "just as good" as a tag: we've simply stored the hash ID in a commit in the superproject, rather than storing it in a tag name in the submodule.
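To see where that hash actually lives, here is a minimal sketch that builds a throwaway superproject and inspects it. The directory names (lib, super) and the demo identity are made up for illustration; only sub/mod echoes the path used later in this answer. The -c protocol.file.allow=always setting is there only because the demo clones from a local path.

```shell
#!/bin/sh
# Sketch: build a throwaway superproject with one submodule, then show
# where the submodule's commit hash is recorded. All names are hypothetical.
set -e
work=$(mktemp -d)
cd "$work"

# Identity for the demo commits, so nothing depends on global config.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# A tiny repository that plays the role of the submodule's upstream.
git init -q lib
git -C lib commit -q --allow-empty -m "first library commit"

# The superproject, with lib added as a submodule at sub/mod.
git init -q super
cd super
git -c protocol.file.allow=always submodule --quiet add "$work/lib" sub/mod
git commit -q -m "add submodule"

# The superproject's tree holds a "gitlink" entry (mode 160000, type commit)
# whose value is the submodule commit hash. No tag or branch name is stored.
git ls-tree HEAD sub/mod

recorded=$(git rev-parse HEAD:sub/mod)         # hash stored in the superproject commit
checked_out=$(git -C sub/mod rev-parse HEAD)   # commit checked out in the submodule
echo "recorded:    $recorded"
echo "checked out: $checked_out"
```

In a real project, the last few commands (git ls-tree and the two git rev-parse calls) are the part worth running; the rest only sets up something to look at.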

This means we can use a submodule that we don't control. We can't create new tag names in that submodule, but we can git checkout any commit we like, and then create a new commit in the superproject that says "check out commit <hash> in path/to/submodule". And then we're done.

The only remaining question, really, is this one:

  • OK, so we have a superproject, and some commit a123456 that, when that commit is checked out and we run git submodule update --init, checks out the tagged commit we want in each submodule.

  • But now we're going to make some new and improved commit(s) in the superproject.

  • In at least one of these new-and-improved commits—say, the next release—we want one particular submodule, sub/mod, to be at v3.1415926 (which is commit feedc0ffee), not the old and lousy v1.4142136 (commit badcab1e). So how do we make sure these new commits in our superproject use hash ID feedc0ffee?

The answer is: simply check out the desired commit in the submodule, e.g.:

git -C sub/mod checkout v3.1415926

and then run git add in the superproject:

git add sub/mod
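To answer the "how can I be sure" part directly, one option is to compare the hash recorded in the superproject commit with the commit the tag resolves to, and then clone the result the way a coworker would. The sketch below does this in a throwaway setup; the repository names (lib, super, coworker) and the demo identity are assumptions, while the tag names and sub/mod are borrowed from the example above.

```shell
#!/bin/sh
# Sketch: pin a submodule to a tagged commit, then verify the pin.
set -e
work=$(mktemp -d)
cd "$work"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Hypothetical upstream library with two tagged releases.
git init -q lib
git -C lib commit -q --allow-empty -m "old release"
git -C lib tag v1.4142136
git -C lib commit -q --allow-empty -m "new release"
git -C lib tag -a v3.1415926 -m "release v3.1415926"

# Superproject that initially pins sub/mod to the old release.
git init -q super
cd super
git -c protocol.file.allow=always submodule --quiet add "$work/lib" sub/mod
git -C sub/mod checkout -q v1.4142136
git add sub/mod
git commit -q -m "add sub/mod at v1.4142136"

# The two steps from the answer, followed by a commit to record them.
git -C sub/mod checkout -q v3.1415926
git add sub/mod
git commit -q -m "sub/mod: move to v3.1415926"

# Check 1: the hash recorded in the superproject equals the tagged commit.
# ^{commit} peels an annotated tag down to the commit it names.
expected=$(git -C sub/mod rev-parse 'v3.1415926^{commit}')
recorded=$(git rev-parse HEAD:sub/mod)
if [ "$recorded" = "$expected" ]; then
    echo "sub/mod is pinned to v3.1415926"
else
    echo "sub/mod is NOT at v3.1415926 (recorded $recorded)"
fi

# Check 2: do what a coworker would do and look at the result.
git -c protocol.file.allow=always clone -q --recurse-submodules \
    "$work/super" "$work/coworker"
cloned=$(git -C "$work/coworker/sub/mod" rev-parse HEAD)
# Prints the tag name if HEAD sits exactly on a tag, and fails otherwise.
git -C "$work/coworker/sub/mod" describe --tags --exact-match
```

Both checks work on hashes, so they hold even though nothing in .gitmodules or the submodule's config mentions the tag.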

So what's all this then?

git submodule update --init --recursive --remote

The --remote argument here means: I have some name(s) stored somewhere(s). For each submodule, do a git -C path/to/submodule fetch (that's the first part of --remote) and then git -C path/to/submodule checkout name with the stored name (that's the second part); more precisely, Git checks out the commit that the submodule's remote-tracking branch for that name resolves to after the fetch. Do this recursively, i.e., if the submodule itself is a superproject to more submodules, do it to those submodules too. That's the --recursive part.

This is a pretty powerful thing, with multiple moving parts:

  • Where is the name stored? How will you know which name is used per submodule?
  • Who's controlling how name resolves to a hash ID? We run git fetch in the submodule, so it's the submodule's remote!
  • What hash IDs will we get? That depends on what names, if any, git fetch in that submodule updates.

These questions all do have answers, but only the "where is the name stored" one should (or can) be answered here. The name comes from:

  • the superproject's .git/config, if it's set there; or
  • the superproject's .gitmodules, if it's set there; or
  • master (recent Git versions fall back to the branch that the submodule remote's HEAD points to instead).

Except for the hardcoded master, these all have further control knobs: you can use git config or git submodule to update the .git/config and/or .gitmodules.
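As a rough illustration of those knobs, the sketch below stores one name in .gitmodules and a different one in .git/config, then runs git submodule update --remote after each change to see which commit gets checked out. The repository layout and the branch names release and develop are made up for the demo; git submodule set-branch assumes Git 2.22 or later.

```shell
#!/bin/sh
# Sketch: where the --remote name is stored, and which setting wins.
set -e
work=$(mktemp -d)
cd "$work"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Hypothetical upstream with a default branch plus "release" and "develop".
git init -q lib
git -C lib commit -q --allow-empty -m "base"
default_branch=$(git -C lib symbolic-ref --short HEAD)
git -C lib checkout -q -b release
git -C lib commit -q --allow-empty -m "release work"
git -C lib checkout -q -b develop
git -C lib commit -q --allow-empty -m "develop work"
git -C lib checkout -q "$default_branch"

git init -q super
cd super
git -c protocol.file.allow=always submodule --quiet add "$work/lib" sub/mod
git commit -q -m "add sub/mod"

# Knob 1: .gitmodules, which is versioned and shared with everyone who clones.
git submodule set-branch --branch release -- sub/mod
echo ".gitmodules says:  $(git config -f .gitmodules submodule.sub/mod.branch)"
git -c protocol.file.allow=always submodule --quiet update --remote sub/mod
at_release=$(git -C sub/mod rev-parse HEAD)

# Knob 2: .git/config, which is private to this clone and takes precedence.
git config submodule.sub/mod.branch develop
echo ".git/config says:  $(git config submodule.sub/mod.branch)"
git -c protocol.file.allow=always submodule --quiet update --remote sub/mod
at_develop=$(git -C sub/mod rev-parse HEAD)

echo "with only .gitmodules set:   $at_release"
echo "with .git/config overriding: $at_develop"

# --remote only moved the submodule's work-tree; the superproject still
# needs git add sub/mod and a commit to record the new hash.
git status --short
```

Because the second setting never leaves the local clone, a coworker running the same update --remote command would land on release rather than develop, which is the kind of mismatch described next.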

It's all pretty complicated, and fairly delicate as well since it's easy for someone to set it up in their own .git/config and forget to update .gitmodules, for instance. Then you, if you're not this someone, will get the wrong name! For this reason, I generally recommend just doing it all manually, if you're the one who will choose which submodule goes on which commit anyway.

Hedva answered 1/6, 2021 at 22:18 Comment(2)
Very in-depth answer! My problem is similar to OP's. Specifically, I have a repo that is mirrored, but in such a way that tags don't point to the same hashes (yes, it's not what I want, but the mirroring causes the commit messages to change, and that changes the hash). The existing work uses submodules, and it's necessary that the mirror have the same reference from commits on the superproject to commits on the submodules. I think my options are to reconnect the wormhole or sidestep submodules and release the components and get them as normal dependencies. - Jacintajacinth
Wonderful explanation of benefits and consequences of using git submodules. - Chamfron
