Convert .gitmodules into a parsable format for iteration using Bash

Background

I would like to write a shell function that reads .gitmodules and iterates over each submodule, running certain commands based on that submodule's properties (e.g. <PATH>, <URL>, or <BRANCH>).

➡️ The default format of .gitmodules:

[submodule "PATH"]
    path = <PATH>
    url = <URL>
[submodule "PATH"]
    path = <PATH>
    url = <URL>
    branch = <BRANCH>

➡️ Pseudocode:

def install_modules() {
    modules = new list

    fill each index of the modules list with each submodule & its properties

    iterate over modules
       if module @ 'path' contains a specified 'branch':
          git submodule add -b 'branch' 'url' 'path'
       else:
          git submodule add 'url' 'path'
}

⚠️ Current install_modules()

# currently works for grabbing the first line of the file
# doesn't work for each line after.
install_modules() {
    declare -A regex

    regex["module"]='\[submodule "(.*)"\]'
    regex["url"]='url = "(.*)"'
    regex["branch"]='branch = "(.*)"'

    # - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

    cat < ".gitmodules" | while read -r LINE; do
        if [[ $LINE =~ ${regex[module]} ]]; then
            PATH=${BASH_REMATCH[1]}

            echo "$PATH"
        fi
    done
}
Uzzi answered 22/12, 2018 at 15:25 Comment(1)
This is not a duplicate but looks relevant: How do I grab an INI value within a shell script? - Mellophone

With a little help from @phd and Restore git submodules from .gitmodules (which @phd pointed me towards), I was able to construct the function that I needed.

install_submodules()

⚠️ Note: Assume $REPO_PATH is declared & initialized.

⚠️ My answer is an adaptation from https://mcmap.net/q/210345/-restore-git-submodules-from-gitmodules.

install_submodules() {
    git -C "${REPO_PATH}" config -f .gitmodules --get-regexp '^submodule\..*\.path$' |
        while read -r KEY MODULE_PATH
        do
            # If the module's path already exists inside the repo, remove it.
            # At this point it is not a valid git repo, so adding the
            # submodule would fail. The path is relative to REPO_PATH.
            [ -n "${MODULE_PATH}" ] && [ -d "${REPO_PATH}/${MODULE_PATH}" ] && sudo rm -rf "${REPO_PATH}/${MODULE_PATH}"

            NAME="$(echo "${KEY}" | sed 's/^submodule\.\(.*\)\.path$/\1/')"

            url_key="$(echo "${KEY}" | sed 's/\.path$/.url/')"
            branch_key="$(echo "${KEY}" | sed 's/\.path$/.branch/')"

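            # Read from REPO_PATH's .gitmodules, not the caller's working directory.
            URL="$(git -C "${REPO_PATH}" config -f .gitmodules --get "${url_key}")"
            BRANCH="$(git -C "${REPO_PATH}" config -f .gitmodules --get "${branch_key}" || echo "master")"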

            git -C "${REPO_PATH}" submodule add --force -b "${BRANCH}" --name "${NAME}" "${URL}" "${MODULE_PATH}" || continue
        done

    git -C "${REPO_PATH}" submodule update --init --recursive
}
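
The pseudocode in the question passes -b only when a branch is configured, whereas the function above falls back to master. A minimal sketch of the conditional form follows. It assumes a hypothetical helper name, submodule_add_args, which only builds the argument list for git submodule add so the list can be inspected before it is handed to git:

```bash
#!/usr/bin/env bash
# submodule_add_args is a hypothetical helper, not a git command.
# It prints one argument per line for `git submodule add`, including
# -b only when a non-empty branch was given as the third parameter.
submodule_add_args() {
    local url=$1 path=$2 branch=${3:-}
    local -a args=(submodule add)
    if [ -n "$branch" ]; then
        args+=(-b "$branch")
    fi
    # "--" ends option parsing, so a URL or path starting with "-" stays an operand.
    args+=(-- "$url" "$path")
    printf '%s\n' "${args[@]}"
}
```

Inside the loop this could be used as mapfile -t args < <(submodule_add_args "$URL" "$MODULE_PATH" "$BRANCH") followed by git -C "$REPO_PATH" "${args[@]}", with BRANCH left empty when the branch key is absent. mapfile needs Bash 4 or later, and the one-argument-per-line output assumes no value contains a newline.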
Uzzi answered 22/12, 2018 at 21:33 Comment(2)
Recommended minor tweak: in the sed expressions, add an anchor (path$ rather than just path) so that an Evil Submodule whose path is, say, x.path won't be translated into NAME=x but rather into NAME=x.path. - Jory
Nice script. Thank you! For me, the "-C "${REPO_PATH}"" part was missing when setting the URL and BRANCH variables. Those commands return nothing when the working directory is not inside the REPO_PATH. - Ximenes

.gitmodules is a .gitconfig-like file, so you can use git config to read it. For example, the following lists every entry in .gitmodules, splits each line at = (key=value), and splits each key at .:

git config -f .gitmodules -l | awk '{split($0, a, /=/); split(a[1], b, /\./); print b[1], b[2], b[3], a[2]}'

git config -f .gitmodules -l prints something like

submodule.native/inotify_simple.path=native/inotify_simple
submodule.native/inotify_simple.url=https://github.com/chrisjbillington/inotify_simple

and awk output would be

submodule native/inotify_simple path native/inotify_simple
submodule native/inotify_simple url https://github.com/chrisjbillington/inotify_simple
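
One caveat with fixed field positions: a URL may contain =, and a submodule name may contain dots, in which case the awk fields above would be misplaced. A hedged sketch that splits at the first = and the last . using Bash parameter expansion is shown here. The function name parse_gitmodules_line is hypothetical, and the sketch assumes every input line comes from the submodule section:

```bash
#!/usr/bin/env bash
# parse_gitmodules_line is a hypothetical helper name, not part of git.
# It splits one line of `git config -f .gitmodules -l` output into
# submodule name, property and value, separated by tabs.
parse_gitmodules_line() {
    local line=$1
    local key=${line%%=*}         # text before the first '='
    local value=${line#*=}        # text after the first '=' (may contain '=')
    local rest=${key#submodule.}  # drop the leading section name
    local prop=${rest##*.}        # property is the text after the last '.'
    local name=${rest%.*}         # name is the remainder and may contain dots
    printf '%s\t%s\t%s\n' "$name" "$prop" "$value"
}
```

It can be fed from git config -f .gitmodules -l | while IFS= read -r line; do parse_gitmodules_line "$line"; done.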
Scrannel answered 22/12, 2018 at 16:32 Comment(7)
Thank you for your response! How would I modify this to read from the file itself without using git config? - Uzzi
I don't know. I don't see a reason to avoid git config, and awk is not my preferred language. If I wanted to parse .gitmodules without git config, I'd use a different language (Perl or Python) and a library that parses .gitconfig-like files: https://mcmap.net/q/1326536/-git-config-style-configuration-system/7976758 - Scrannel
Okay. My use case is when I add .gitmodules to a locally initialized git repo that doesn't actually contain a .gitmodules file. So I would like to read the file and run git submodule add for each submodule listed in it. I know it's odd, but it's for a reason. - Uzzi
I don't see a reason to avoid git config in this case. You have git anyway and you're going to run git submodule, so why not git config? You probably need to name the file differently so that git submodule add doesn't clobber it. - Scrannel
The reason is that my locally initialized git repo doesn't have git submodules, so 'git config -f .gitmodules -l' will return nothing. - Uzzi
git config -f can read any file. You said you're going to use an existing .gitmodules. - Scrannel
Also worth mentioning: git config can read a config file by its blob hash ID or other suitable revision string, directly from a Git repository. Git grew this ability long ago precisely because Git needs to read submodule information from .gitmodules files that are not necessarily extracted to the file system. Hence, you can do GIT_DIR=$path_to_repo/.git git config --blob <whatever> --get .... The documentation suggests using, e.g., --blob master:.gitmodules. - Jory
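
For the case raised in the comments, reading the file directly without git config, a minimal pure-Bash sketch is possible. It is only an approximation: it assumes the simple layout shown in the question and does not handle quoting, escapes, comments at the end of a line, or continuation lines, all of which git config does handle. The function name parse_gitmodules is hypothetical:

```bash
#!/usr/bin/env bash
# parse_gitmodules is a hypothetical helper. It prints one
# "name<TAB>property<TAB>value" line per setting found in the given file.
parse_gitmodules() {
    local file=$1 line name=""
    local section_re='^\[submodule "(.*)"]$'
    local pair_re='^([A-Za-z][A-Za-z0-9-]*)[[:space:]]*=[[:space:]]*(.*)$'
    # "|| [ -n "$line" ]" keeps a final line that lacks a trailing newline.
    while IFS= read -r line || [ -n "$line" ]; do
        # Trim leading, then trailing, whitespace.
        line="${line#"${line%%[![:space:]]*}"}"
        line="${line%"${line##*[![:space:]]}"}"
        if [[ $line =~ $section_re ]]; then
            name=${BASH_REMATCH[1]}   # remember the current submodule
        elif [[ -n $name && $line =~ $pair_re ]]; then
            printf '%s\t%s\t%s\n' "$name" "${BASH_REMATCH[1]}" "${BASH_REMATCH[2]}"
        fi
    done < "$file"
}
```

The output can be consumed with while IFS=$'\t' read -r name prop value. Note that the variables here avoid the name PATH, since assigning to PATH inside a shell function changes where the shell looks for commands.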