What is the difference in Git between being in a directory (cd) and using the `work-tree` parameter?

I am running Git through PHP's exec() to gather information about some Git projects for a set of server dashboards. I have encountered some strange output, which makes me wonder if I misunderstand what the "working tree" is.

If I use this command, replacing the %s sprintf parameter with the path to the Git project:

git --work-tree=%s status

Then I get this output:

On branch server-dashboard
Your branch is ahead of 'origin/server-dashboard' by 1 commit.
  (use "git push" to publish your local commits)

Changes not staged for commit:
  (use "git add/rm <file>..." to update what will be committed)
  (use "git checkout -- <file>..." to discard changes in working directory)

    modified:   .gitignore
    deleted:    README.md
    deleted:    common.php
    deleted:    conf/dev/settings.ini
    deleted:    conf/prod/settings.ini
    deleted:    conf/settings.ini.example
    deleted:    lib/HealthSettings.php
    deleted:    lib/Settings.php
    deleted:    public/assets/main.css
    deleted:    public/assets/main.js
    deleted:    public/assets/refresh.gif
    deleted:    public/assets/spinner.gif
    deleted:    public/curl-test.php
    deleted:    public/dashboard.php
    deleted:    public/iframes.php
    modified:   public/index.php
    deleted:    public/info.php
    deleted:    public/no-servers.php
    deleted:    public/sections/apache-mods.php
    deleted:    public/sections/curl-headers.php
    deleted:    public/sections/curl-self.php
    deleted:    public/sections/database.php
    deleted:    public/sections/env.php
    deleted:    public/sections/git-table.php
    deleted:    public/sections/git.php
    deleted:    public/sections/php-exts.php
    deleted:    public/sections/php-proxy.php
    deleted:    public/sections/tableau-proxy-table.php
    deleted:    public/sections/tableau-proxy.php
    deleted:    public/sections/user.php
    deleted:    public/tabs.php

That is not what I expect, since it is not what I get if I run git status on the console.

Now, if I run this command in PHP (again swapping the string param for a path):

cd %s && git status

then I get the correct output:

On branch master
Your branch is ahead of 'origin/master' by 17 commits.
  (use "git push" to publish your local commits)

nothing to commit, working tree clean

I am guessing that since this project is 17 commits behind master, the "wrong" output in fact is an expression of what has changed over those commits. However, I would have thought the two commands would be equivalent. Am I wrong, and if so, can I run a Git command programmatically without having to cd first?

Panaggio answered 24/7, 2018 at 14:10 Comment(0)

There are a number of substantive differences between cd path; git command and git --work-tree=path command. Some or all of these differences can be made to vanish depending on additional parameters and/or environment variables.

It's important to realize that Git has three (not just two) key items that it must work with at almost all times. These are:

  • The repository itself: the databases of name-to-hash-ID pairs (such as master naming commit e3331758f12da22f4103eec7efe1b5304a9be5e9 or whatever other hash ID) plus the objects that those hash IDs identify. The repository typically lives in a directory named .git at the top level of the work-tree. This is the git directory ($GIT_DIR).

  • The index, which indexes and caches (hence its two names index or sometimes cache) the work-tree, and acts as a storage location (hence its third name, staging area) for updated files (really, pathname to blob hash ID translations) when you intend to build a new commit. The index is mostly a file: .git/index. As you can see from this path name, by default, the index file lives within the repository. However, it has its own separate control variable, $GIT_INDEX_FILE. It simply defaults to $GIT_DIR/index.

  • The work-tree holds files in their uncompressed format. Files make their way into the work-tree by being extracted from a commit into the index, and then from the index (where they're still compressed and in Git-only format) into the work-tree. The work-tree may also hold additional files that are not found in the index. Such files are untracked. An untracked file may or may not be ignored (a tracked file, i.e., one whose pathname appears in the index, is by definition never ignored).

The work-tree is normally just the current working directory, or derived from the current working directory by walking upwards (.., then ../.., and so on) to find the first place that contains a .git repository directory. This means that cd path; git ... searches for the work-tree starting from wherever you have landed.

If there are no overrides, having found the work-tree and hence the .git directory, Git now knows where $GIT_DIR is and where to find the index file. But if you provide an override, using git --work-tree=path or by setting the environment variable $GIT_WORK_TREE, Git will look there for the work-tree, and look in the current directory (or .. and then ../.. and so on) for the repository directory.

If you provide a --git-dir=path override, or set the environment variable $GIT_DIR, Git will look there for the repository directory, regardless of any setting or lack of setting for the work-tree.
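To make the two overrides concrete, here is a minimal sketch. It assumes a reasonably recent Git (2.13 or later, for rev-parse --absolute-git-dir). The directory names dashboard and project are made up for illustration.

```shell
#!/bin/sh
set -eu

# Scratch area; pwd -P resolves symlinks so paths compare cleanly with Git's output.
base=$(cd "$(mktemp -d)" && pwd -P)
trap 'rm -rf "$base"' EXIT

# Two unrelated repositories (hypothetical names).
git init -q "$base/dashboard"
git init -q "$base/project"

# Pretend the calling script lives in the dashboard repository.
cd "$base/dashboard"

# --work-tree moves only the work-tree; the repository is still found from the cwd.
found_with_work_tree=$(git --work-tree="$base/project" rev-parse --absolute-git-dir)

# --git-dir names the repository directly, whatever the cwd is.
found_with_git_dir=$(git --git-dir="$base/project/.git" rev-parse --absolute-git-dir)

echo "with --work-tree: $found_with_work_tree"
echo "with --git-dir:   $found_with_git_dir"
```

The first command should report the dashboard repository, not the project one, which is the mismatch described in the question.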

(Note: --git-dir and --work-tree are actually implemented by having the git front end set the environment variables. Hence if you set both, the flag argument overrides the environment setting for the duration of the Git command, including any subprocesses that Git itself runs.)

If you provide a $GIT_INDEX_FILE override via the environment, Git will look there for the index file, regardless of any setting or lack of setting for $GIT_DIR.

Any of these settings can be an absolute path—starting with / on Unix-like systems, or using a drive letter on sillier systems—or a relative path. An absolute path overrides the current working directory, while a relative path starts from the current working directory.

Hence the exact contents of any of these arguments or environment variables matter a great deal. For instance, running:

cd $HOME/foo; GIT_INDEX_FILE=$HOME/index git --git-dir=sub/.git --work-tree=/tmp ...

will cause Git to look for the repository in $HOME/foo/sub/.git, the index file in $HOME/index, and the work-tree in /tmp.

Besides all of this, the front-end git command allows a -C argument, or multiple -C arguments. Each of these makes Git execute a cd to the supplied path. Hence the above is largely equivalent to:

GIT_INDEX_FILE=$HOME/index git -C $HOME/foo --git-dir=sub/.git --work-tree=/tmp ...

except that once the above command terminates, your shell / command-interpreter remains in whatever working directory it had before you ran the command. (These details vary slightly on Windows, I believe, since Windows makes some very strange assumptions about "current directory" vs "current drive letter", or something along those lines.)
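A small sketch of the -C behavior, assuming a Unix-like shell. The foo and sub directory names are illustrative only. It checks that -C changes directory for Git alone, and that a second, relative -C is resolved against the first.

```shell
#!/bin/sh
set -eu

base=$(cd "$(mktemp -d)" && pwd -P)
trap 'rm -rf "$base"' EXIT

# A repository with one subdirectory (hypothetical layout).
git init -q "$base/foo"
mkdir "$base/foo/sub"

start=$(pwd -P)

# -C makes Git change directory before doing anything else.
top=$(git -C "$base/foo" rev-parse --show-toplevel)

# A relative -C is interpreted relative to the preceding -C.
inner=$(git -C "$base/foo" -C sub rev-parse --show-prefix)

# The shell itself has not moved.
after=$(pwd -P)

echo "top-level: $top"
echo "prefix:    $inner"
echo "shell cwd unchanged: $start -> $after"
```

For the PHP case in the question, this suggests git -C with the project path as a drop-in replacement for the cd-then-git form.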


In your specific case—running git status—be aware that git status does two separate comparisons:

  • First, it compares (a la git diff) the current commit (found via $GIT_DIR/HEAD) to the contents of $GIT_INDEX_FILE. Whatever is different here is staged for commit.
  • Then it compares (again a la git diff) the contents of the index file to the contents of $GIT_WORK_TREE. Whatever is different here is not staged for commit.
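The second comparison is where the question's odd output comes from. The sketch below is an illustration only: the dashboard and project repositories, their file names, and the commit helper are invented. It runs git status from one repository while pointing --work-tree at another, so one repository's index is compared against the other's files.

```shell
#!/bin/sh
set -eu

base=$(cd "$(mktemp -d)" && pwd -P)
trap 'rm -rf "$base"' EXIT

# Helper (hypothetical): commit without relying on any global Git identity.
commit() {
    git -c user.name=Demo -c user.email=demo@example.com \
        -c commit.gpgsign=false commit -q -m "$1"
}

# Repository the script happens to be running in.
git init -q "$base/dashboard"
cd "$base/dashboard"
echo a > a.txt
git add a.txt
commit "dashboard: initial"

# Repository we actually want to inspect.
git init -q "$base/project"
cd "$base/project"
echo b > b.txt
git add b.txt
commit "project: initial"

cd "$base/dashboard"

# Wrong: dashboard's HEAD and index, compared against project's files.
mixed=$(git --work-tree="$base/project" status --porcelain)

# Right: repository and work-tree both belong to project.
clean=$(git -C "$base/project" status --porcelain)
explicit=$(git --git-dir="$base/project/.git" --work-tree="$base/project" status --porcelain)

printf 'mixed:\n%s\n' "$mixed"
printf 'clean: [%s]\nexplicit: [%s]\n' "$clean" "$explicit"
```

In the mixed case, a.txt shows up as deleted and b.txt as untracked, which mirrors the long list of "deleted" files in the question. The other two invocations report nothing to commit.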

The ahead and/or behind counts come from an earlier step, where git status uses the current branch (again from $GIT_DIR/HEAD). Normally, this HEAD file is a symbolic reference, containing the name of the current branch. Git can then find that branch's upstream setting (git rev-parse --symbolic-full-name $branch@{upstream}, more or less, though --abbrev-ref is more suitable for humans): master typically has refs/remotes/origin/master as its upstream. Git then reads through the commit graph, extracted from $GIT_DIR, to discover how many commits ahead and/or behind master is vs origin/master.


The above is not the whole story. Consult the git front end command documentation to find a more complete list of environment variables that can be used to override specific items. For instance, GIT_ALTERNATE_OBJECT_DIRECTORIES can be used to make Git look outside the repository itself for additional object storage locations, while GIT_CEILING_DIRECTORIES can be used to limit the amount of path-walking Git does when looking through .., ../.., and so on.
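As one illustration of these extra variables, the following sketch shows GIT_CEILING_DIRECTORIES stopping the upward search. The repo/sub/deeper layout is hypothetical, and the variable is set to an absolute path as the documentation requires.

```shell
#!/bin/sh
set -eu

base=$(cd "$(mktemp -d)" && pwd -P)
trap 'rm -rf "$base"' EXIT

# A repository with a nested subdirectory two levels down.
git init -q "$base/repo"
mkdir -p "$base/repo/sub/deeper"
cd "$base/repo/sub/deeper"

# Without a ceiling, Git walks up through sub/ and finds repo/.git.
if git rev-parse --git-dir >/dev/null 2>&1; then
    unrestricted=found
else
    unrestricted=missing
fi

# With the ceiling at sub/, Git may not climb into sub/ or above it.
if GIT_CEILING_DIRECTORIES="$base/repo/sub" git rev-parse --git-dir >/dev/null 2>&1; then
    limited=found
else
    limited=missing
fi

echo "no ceiling:   $unrestricted"
echo "with ceiling: $limited"
```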

Wernick answered 24/7, 2018 at 17:26 Comment(4)
This is a marvellous answer, for which many thanks. I've been meaning to peek at some Git internals just to improve my understanding of data structures, and this is a great start. - Panaggio
However, my one remaining confusion is: since I am using the Git frontend, it strikes me that the index and the Git dir could be derived from the work tree, since the .git folder's default position is in the root of the work tree. Why does the front end not derive these automatically? I wonder if it is using its current cd path to (wrongly) fill in the gaps. Since I am running this command in LAMP, I'd guess the cwd is the location of the script I am running. - Panaggio
One thing to look out for is whether $GIT_DIR is already set, which it is in (most? all?) hook scripts. If it's not set, then yes, Git does the search-for-.git based on current-working-directory. - Wernick
Right. So if my cwd is set to something essentially random, then specifying the work tree via a parameter is not enough. Thanks - I am relieved that Git is not broken, even on Windows ;-). - Panaggio

The other answer complicates things with a discussion of environment variables that will never be set in practice when running a PHP server script, and it doesn't clearly say which option solves the problem. Let's try:

$ cd
$ ls -ad .git
ls: cannot access '.git': No such file or directory

$ env | grep -i git | wc -l
0

$ git --work-tree=$HOME/myrepo describe --always
fatal: not a git repository (or any parent up to mount point /)
Stopping at filesystem boundary (GIT_DISCOVERY_ACROSS_FILESYSTEM not set).

$ git --work-tree=$HOME/myrepo --git-dir=$HOME/myrepo/.git describe --always
d5cd316

$ git --git-dir=$HOME/myrepo/.git describe --always
d5cd316

$ git --git-dir=$HOME/myrepo describe --always
fatal: not a git repository: '/home/redacted/myrepo'

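--> --git-dir alone is enough for this command, but it must point at the .git directory itself, not at the top of the work-tree. Commands that read the work-tree, such as git status, also need --work-tree, because with --git-dir alone Git treats the current directory as the work-tree.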
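One caveat is worth checking for commands that look at files rather than just commits. The sketch below uses made-up names (myrepo, elsewhere, file.txt) and runs git status from an unrelated empty directory. It compares --git-dir on its own with --git-dir plus --work-tree.

```shell
#!/bin/sh
set -eu

base=$(cd "$(mktemp -d)" && pwd -P)
trap 'rm -rf "$base"' EXIT

# A repository with one committed file.
git init -q "$base/myrepo"
cd "$base/myrepo"
echo hello > file.txt
git add file.txt
git -c user.name=Demo -c user.email=demo@example.com \
    -c commit.gpgsign=false commit -q -m initial

# An unrelated, empty directory standing in for the script's cwd.
mkdir "$base/elsewhere"
cd "$base/elsewhere"

# --git-dir alone: the cwd is taken as the work-tree, so file.txt looks deleted.
git_dir_only=$(git --git-dir="$base/myrepo/.git" status --porcelain)

# --git-dir plus --work-tree: both halves point at myrepo, so status is clean.
both=$(git --git-dir="$base/myrepo/.git" --work-tree="$base/myrepo" status --porcelain)

echo "git-dir only: [$git_dir_only]"
echo "both flags:   [$both]"
```

For a status-style dashboard, passing both flags (or using git -C with the project path) therefore looks like the safer choice.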

Pythagoras answered 27/9, 2020 at 15:21 Comment(1)
Thanks for this, it's nice to have several answers that tackle a problem in different ways. I am technically still working on the project for which I needed this, on and off, so maybe I will have cause to use this useful advice! - Panaggio
