How to remove a submodule going forward, but keep its history (as linked from parent history)?

Say I have a project that has a dependency implemented using git submodule. Now I'm making a change where this dependency is no longer needed. I want to commit a change that works as follows:

  • If anyone checks out this commit or any descendants, the submodule doesn't exist.
  • But if anyone checks out an older commit, or a commit on another branch not merged with this one, the submodule reappears, just as a deleted file would.
  • The submodule's own git database (.git/modules/path/to/submodule) must be preserved as it may contain commits not pushed to a remote.

In other words, I do NOT want to obliterate the submodule as directed by the answers to How do I remove a submodule?. In fact I wrote this question as a counterpoint to clarify that one.[1]

When I get some time I will try some experiments. It may be as simple as running git submodule deinit and/or removing its entry from .gitmodules. I searched Stack Overflow and found no questions or answers addressing this case specifically. Even the superbly written Mastering Git submodules is not clear about this.
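
One such experiment could be sketched as follows. This is only a throwaway demo under stated assumptions: the repository names (lib, super) and the submodule path deps/lib are made up, and it relies on git rm accepting a submodule path, which recent Git versions do when the submodule's git directory lives under .git/modules. It removes the submodule going forward, then checks out an older commit and repopulates the submodule from the preserved .git/modules directory.

```shell
#!/bin/sh
# Sketch: drop a submodule going forward while keeping .git/modules/<name>.
# All names here (lib, super, deps/lib) are hypothetical demo values.
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid

work=$(mktemp -d)

# A stand-in "upstream" repository and a superproject that embeds it.
git init -q "$work/lib"
git -C "$work/lib" commit -q --allow-empty -m "lib: initial commit"
git init -q "$work/super"
cd "$work/super"
# Recent Git versions refuse local-path submodule clones unless allowed.
git -c protocol.file.allow=always submodule --quiet add "$work/lib" deps/lib
git commit -q -m "Add deps/lib as a submodule"

# A commit made inside the submodule that is never pushed anywhere.
git -C deps/lib commit -q --allow-empty -m "local-only change"
local_sha=$(git -C deps/lib rev-parse HEAD)
git add deps/lib
git commit -q -m "Point deps/lib at the local-only commit"
with_sub=$(git rev-parse HEAD)

# Remove the submodule going forward: this drops the gitlink, the work tree
# and the .gitmodules entry, but does not delete .git/modules/deps/lib.
git rm -q deps/lib
git commit -q -m "Drop deps/lib"

# Go back in history and repopulate the submodule from the preserved git dir.
git checkout -q "$with_sub"
git -c protocol.file.allow=always submodule --quiet update --init deps/lib
restored_sha=$(git -C deps/lib rev-parse HEAD)

echo "local-only commit: $local_sha"
echo "restored checkout: $restored_sha"
```

If the preserved directory is reused the way I expect, the two printed hashes should match even though the local-only commit was never pushed. Note that nothing in this sketch runs git submodule deinit or rm -rf .git/modules/deps/lib.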


[1]: The many steps required in those answers tell me that such obliteration is not "normal"; otherwise git would include a porcelain command that does it all for you. Instead, git submodule deinit is provided, with very narrow behavior. I think that is very intentional.

Stillness answered 22/5, 2020 at 20:28 Comment(3)
Your third bullet-point requirement is the really tricky part. It cannot be guaranteed in any current design, because a submodule repository is not part of its superproject. – Riyadh
@Riyadh is it tricky for good reason -- i.e. it's an unreasonable requirement in the first place, use subtree or something else? Or for bad reason: a gap in the design for something that should be handled? – Stillness
It's tricky because the original design, as VonC says, assumes that your submodule clone has no value: that it can be thrown away at any time because you can just re-clone it at any time, without losing anything of value. That assumption still lingers. – Riyadh

The git submodule deinit that I documented in 2013 and its associated rm -rf .git/modules/a/submodule both assume the removed submodule was already pushed.

Submodules were initially introduced to be used as read-only, in order to get other repository content into your repository, without necessarily the intent of modifying them.
This differs from subtree, where modifications are more naturally expected.

That being said, yes: if you remove a submodule without having committed and pushed your local changes to it, the end result won't be satisfactory.

A possible patch idea would be to block/fail the git submodule deinit command when it detects that the submodule's current HEAD does not match its own internal remote-tracking branch (its own origin/master, for instance).
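
Until something like that exists in Git itself, the same guard can be approximated with a small wrapper. A minimal sketch, assuming a POSIX shell: has_unpushed_work and safe_deinit are hypothetical helper names, not Git commands, and the check is a bit broader than comparing HEAD with origin/master, since it looks for uncommitted changes and for commits not reachable from any remote-tracking ref.

```shell
#!/bin/sh
# Hypothetical helper (not part of Git): exits 0 when the repository at
# path $1 holds work that exists nowhere else.
has_unpushed_work() {
    # Uncommitted or untracked changes in the submodule work tree.
    [ -n "$(git -C "$1" status --porcelain)" ] && return 0
    # Commits reachable from HEAD but from no remote-tracking ref.
    [ "$(git -C "$1" rev-list --count HEAD --not --remotes)" -gt 0 ]
}

# Hypothetical wrapper: refuse to deinit when work would be stranded.
safe_deinit() {
    if has_unpushed_work "$1"; then
        echo "refusing to deinit $1: unpushed or uncommitted work" >&2
        return 1
    fi
    git submodule deinit -- "$1"
}
```

It would be called from the superproject as safe_deinit a/submodule. It only protects the deinit step; it does nothing to stop someone from running rm -rf .git/modules/a/submodule by hand.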

Python answered 22/5, 2020 at 21:3 Comment(11)
Thanks. I ran into this issue where I previously added a third-party library into my project as a submodule so I could make local modifications as needed by the parent project. I didn't use subtree because I saw it as a solution for one's own modules, not for modifications of third-party modules (I may be wrong on that). I've since realized that node's package.json supports local unpublished modules, so I now do that instead (see https://mcmap.net/q/22440/-package-manager-vs-git-submodule-subtree). But I still want to be able to go back to old commits that use the abandoned solution. – Stillness
@Stillness I know Christophe's old 2015 article well. But a lot has been done on submodules since then. – Python
@Stillness The rm step is necessary when doing a deinit. Hence my patch idea, for making Git more robust for your use case. – Python
It's necessary because of the possibility of adding a different submodule (different repo) at the same path? – Stillness
FYI, I'm happy to delete this new question if it makes sense. But I think the original Q and answers need updates/clarifications. – Stillness
This new question is important, and should be debated on the Git mailing list: please leave it open here. – Python
@Stillness yes, the rm step is necessary to ensure a coherent state for the local Git repo (one where the module is no longer referenced). – Python
I think we may be talking about two different rm steps: rm -rf .git/modules/a/submodule vs git rm -f a/submodule? I don't believe skipping the former results in any incoherency. Won't a checkout of past commits make use of it to restore the worktree of the submodule? – Stillness
@Stillness that would need to be tested: not sure how Git would react if the internal submodule repo is still there. – Python
Hey @vonc, it's been a year. I've not used submodules since, so I haven't thought about this again. Did the debate you mentioned ever happen? I'm inclined to accept your answer at this point. – Stillness
@Stillness no debate that I know of for now. – Python
