What is the workflow to clean git submodules on clones?
There are useful answers on how to remove submodules "locally", for example: How do I remove a submodule?

However, I have clones of my repo on several machines, two of which I only work on once or twice a month. When I update the branches on those, the submodules are no longer tracked, but their files are still in the working tree as well as in .git/modules. On more than one occasion I accidentally checked some in. In other cases builds failed because of these unwanted files/directories.

I suppose I could keep a list of things to delete, but that doesn't seem right. What if another person does the removal and I have no information outside git about what to remove?

So what is the suggested way to clean up clones?

update

For me git --version returns git version 2.18.0, which seems to be current (it is 2018-09-01).

I have added a reproducible example.

Setup

mkdir parent
cd parent 
git init
git submodule add https://github.com/jakesgordon/javascript-tetris.git
git commit -am add

cd ..
git clone parent clone
cd clone
git submodule update --init

Now both directories contain the submodule with checked out files in javascript-tetris/. .gitmodules contains the submodule.

When I do

cd ../parent
git rm javascript-tetris
git commit -am delete

In parent, the directory javascript-tetris is gone and the entry in .gitmodules is removed, but a populated .git/modules/javascript-tetris directory remains.

On the clone side:

git pull

gives warning: unable to rmdir 'javascript-tetris': Directory not empty. The directory remains, and .git/modules/javascript-tetris is still there as well.

Harday answered 22/8, 2018 at 9:30 Comment(7)
What versions of git are you using for your primary and clones? I did a little testing with the latest Windows build, and submodules seem to be managed much better now. - Photosynthesis
git --version returns git version 2.18.0 - Harday
Weird, maybe I'm missing something about your clones. My toy example seemed to work. (Don't they always!) - Photosynthesis
Could it be you missed the git submodule update --init? That's what I missed in my first tries reproducing the problem. - Harday
I found this: medium.com/@porteneuve/mastering-git-submodules-34c65e940407 - Photosynthesis
Yes, your updated steps reproduce the problem for me. You could post-process the git clean output. Updated my answer to suggest that. Not a great path though. - Photosynthesis
Wow, this really is a mess. I spent the better part of the afternoon investigating whether git subtree would be a good alternative, and no, I don't like it either... - Harday

Is this helpful to you?

git clean -xfd
git submodule foreach --recursive git clean -xfd
git reset --hard
git submodule foreach --recursive git reset --hard
git submodule update --init --recursive
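
Because the commands above delete untracked and ignored files without asking, it may be worth previewing them first. The following is only a sketch: preview_clean is a hypothetical helper name, and it relies on nothing beyond the -n (dry run) flag of git clean.

```shell
#!/bin/sh
# preview_clean: hypothetical helper name, not a git command.
# Prints what `git clean -xfd` would delete in the superproject and in every
# initialized submodule, without deleting anything (-n is git clean's dry run).
preview_clean() {
    git clean -n -xfd
    git submodule foreach --recursive 'git clean -n -xfd'
}
```

Calling preview_clean from the top of the clone lists "Would remove ..." lines; once the list looks right, the real commands can be run.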

UPDATE

To remove a submodule you need to:

  • Delete the relevant section from the .gitmodules file.
  • Stage the .gitmodules changes: git add .gitmodules
  • Delete the relevant section from .git/config.
  • Run git rm --cached path_to_submodule (no trailing slash).
  • Run rm -rf .git/modules/path_to_submodule (no trailing slash).
  • Commit: git commit -m "Removed submodule "
  • Delete the now untracked submodule files: rm -rf path_to_submodule

    Source
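
The steps above can probably be condensed on a reasonably recent Git. The sketch below is not the procedure from the quoted source: it swaps in git submodule deinit for the manual .git/config edit and a plain git rm for the manual .gitmodules edit plus git rm --cached. The function name remove_submodule is hypothetical, and the sketch assumes that the submodule's name equals its path (the default when it was added with git submodule add) and that it is run from the top of the superproject.

```shell
#!/bin/sh
# remove_submodule: hypothetical helper; run from the top of the superproject.
# Assumes the submodule's name equals its path (the default for `git submodule add`).
remove_submodule() {
    path=${1%/}    # drop a trailing slash, if any
    [ -n "$path" ] || { echo "usage: remove_submodule <path>" >&2; return 1; }
    # Drop the submodule's section from .git/config and empty its work tree.
    git submodule deinit -f -- "$path" || return 1
    # Drop the gitlink from the index and the section from .gitmodules.
    git rm -f -- "$path" || return 1
    # Drop the stored submodule repository under <git-dir>/modules.
    rm -rf -- "$(git rev-parse --git-dir)/modules/$path"
    git commit -m "Remove submodule $path"
}
```

This only helps on the machine where the removal is committed; other clones that pull the commit still keep their own copy under .git/modules.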

Pourboire answered 3/9, 2018 at 7:33 Comment(2)
Yes and no. git clean was suggested already, but I like how you adapted it for nested repos, and it seems to thoroughly reset the clone to the state of the main repo. My main worry is that I wouldn't like to run it on a development machine without checking what it deletes. A secondary issue is that it doesn't clean out the .git/modules directory. - Harday
No, you misunderstand. This doesn't help, as I need to figure out which (several) path_to_submodule my coworker deleted over the last months. This isn't documented, as we assumed git submodule would take care of this. - Harday

Along the lines of this gist, you can start your working session on those machines with a git clean:

git clean -xfdf

If you want to dry-run first, you can use the -n flag:

git clean -n -xfdf

Use force twice to clean directories that contain .git subdirectories: git clean -xfdf.
I had some dangling submodules that would not get deleted with just git clean -xfd.

Rossierossing answered 27/8, 2018 at 18:21 Comment(3)
That command is useful, but I don't think it is a solution, as it does not clean .git/modules. Also, since it removes files indiscriminately, it feels dangerous to run without carefully checking the repo first. - Harday
@Harday Did you try it, for testing? - Rossierossing
Yes. I didn't know all the switches (the second f especially confused me). It definitely solves some of my issues. Still hoping there is a complete solution. - Harday

git rm should handle all the dirty work, but doesn't.

The docs indicate that you'll want to manually clean up the .git/modules directory. The best option I've found so far builds upon VonC's answer by post-processing the output with sed and xargs:

git clean -xfdf | sed 's/Removing /\.git\/modules\//' |xargs rm -rf

I don't like this option much, as it's quite brittle and depends upon cleanup not having been executed yet. You'd probably be better off verifying that there are no local commits in the clone and re-cloning from scratch on the other workstations.
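
A possibly less brittle variant, which does not parse git clean output, is to compare what is stored under .git/modules with what the checked-out .gitmodules still declares. This is only a sketch under stated assumptions: list_orphaned_modules and scan_modules are hypothetical helper names, a stored repository is recognised heuristically by its HEAD file and objects directory, and the stored directory is assumed to be named after the submodule. It only lists candidates and deletes nothing.

```shell
#!/bin/sh
# list_orphaned_modules: hypothetical helper name; run from the top of the clone.
# Prints every repository stored under <git-dir>/modules whose name is no longer
# declared in the checked-out .gitmodules. It only lists; nothing is deleted.
list_orphaned_modules() {
    modules_dir=$(git rev-parse --git-dir)/modules
    [ -d "$modules_dir" ] || return 0
    # Names still declared, one per line: submodule.<name>.path -> <name>
    wanted=$(git config -f .gitmodules --name-only --get-regexp '^submodule\..*\.path$' 2>/dev/null |
        sed -e 's/^submodule\.//' -e 's/\.path$//')
    scan_modules "$modules_dir"
}

# Walk the tree, treating a directory with HEAD and objects/ as a stored
# repository (a heuristic) and not descending into it.
scan_modules() {
    for entry in "$1"/*/; do
        entry=${entry%/}
        [ -d "$entry" ] || continue
        if [ -f "$entry/HEAD" ] && [ -d "$entry/objects" ]; then
            name=${entry#"$modules_dir"/}
            printf '%s\n' "$wanted" | grep -Fxq -- "$name" || printf '%s\n' "$entry"
        else
            scan_modules "$entry"
        fi
    done
}
```

After reviewing the list, each printed directory can be removed with rm -rf; leftover working-tree directories still need git clean -ffd (or manual deletion). Note that the comparison only looks at the currently checked-out .gitmodules, so a submodule that another branch or an older commit still uses would also be listed.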

Photosynthesis answered 1/9, 2018 at 7:44 Comment(2)
Weird, I'm also on 2.18, but I'm left with garbage. Since it got longer, I added this as a reproducible example to the question. Can you confirm we ran the same commands? - Harday
I'll take a look, sure. - Photosynthesis

Unfortunately, a direct single command to entirely remove a Git Submodule doesn't seem to exist.

See below an extract from the Git Submodules Documentation:

Deleted submodule: A submodule can be deleted by running git rm && git commit. This can be undone using git revert.

The deletion removes the superproject’s tracking data, which are both the gitlink entry and the section in the .gitmodules file. The submodule’s working directory is removed from the file system, but the Git directory is kept around as it to make it possible to checkout past commits without requiring fetching from another repository.

To completely remove a submodule, manually delete $GIT_DIR/modules/{name}/.

However, you can easily create your own git command to execute such tasks as discussed in this answer.

For example, you can create a file with the following bash script:

#!/bin/bash
# Usage: git rm-module <submodule-path>, run from the top of the superproject.
# Assumes the submodule's name equals its path (the default for git submodule add).
echo "Running git rm ${1}"
git rm "$1" || exit 1   # stop here so an empty or wrong path never reaches rm -rf
echo "Running rm -rf .git/modules/${1}"
rm -rf ".git/modules/$1"
exit 0

Then you'll have to place it in a directory visible in your PATH (for example, in my case, it could be C:\Program Files\Git\cmd) and name it git-{my_command_name} (for example, git-rm-module).

This way, you can use it like git rm-module {my-submodule-name}. The script's own messages would then be:

$ git rm-module my-submodule-name
Running git rm my-submodule-name
Running rm -rf .git/modules/my-submodule-name

For more complex options, you can get some inspiration from this git activity command.

Opposable answered 1/9, 2018 at 19:13 Comment(1)
Unfortunately, my problem is slightly different. Basically, the issue is that one or more submodules are deleted in the repo on the server and I need to clean up locally (but don't have an explicit list of what to delete). - Harday
