Checking out all files in a subdirectory

Asked 2/7, 2016 at 22:26 Answered 7/8, 2016 at 20:25

I'm trying to check out all files and subdirectories in a specific directory, "code/app", but Git gives me the code/app directory plus its contents. I just want its contents. I would like to use this in a post-receive Git hook.

git checkout -f master -- code/app

I also tried the following to no avail

git checkout -f master -- code/app/*

git checkout -f master -- code/app/.

How can I get the expected behavior stated above?

Update:

post-receive

#!/usr/bin/env ruby
# post-receive

# Read STDIN
from, to, branch = ARGF.read.split " "

if (branch =~ /master$/)
        puts "Received branch #{branch}, deploying."

        # Copy files to deploy directory
        deploy_to_dir = File.expand_path('../development')
        dir_to_checkout = '06\ -\ Code/app/'
        `GIT_WORK_TREE="#{deploy_to_dir}" git checkout -f master -- #{dir_to_checkout}`
        puts "DEPLOY: master(#{to}) copied to '#{deploy_to_dir}'"

        exit
end

# Only deploy if pre-production branch was pushed
if (branch =~ /pre-production$/)
        puts "Received branch #{branch}, deploying."

        # Copy files to deploy directory
        deploy_to_dir = File.expand_path('../staging')
        dir_to_checkout = '06\ -\ Code/app/'
        `GIT_WORK_TREE="#{deploy_to_dir}" git checkout -f pre-production -- #{dir_to_checkout}`
        puts "DEPLOY: pre-production(#{to}) copied to '#{deploy_to_dir}'"

        exit
end

if (branch =~ /production$/)
        puts "Received branch #{branch}, deploying."

        # Copy files to deploy directory
        deploy_to_dir = File.expand_path('../')
        dir_to_checkout = '06\ -\ Code/app/'
        `GIT_WORK_TREE="#{deploy_to_dir}" git checkout -f production -- #{dir_to_checkout}`
        puts "DEPLOY: production(#{to}) copied to '#{deploy_to_dir}'"

        exit
end

Folks answered 2/7, 2016 at 22:26 Comment(6)

Are you sure checkout is the correct command for this? checkout is "to switch branches or restore working tree files" – Parsonage 2/7, 2016 at 22:29

When you check out a file, you won't ever get back the lost changes. Are you sure you want to do that? – Tovatovar 2/7, 2016 at 22:32

@EliSadoff I'm working a deployment strategy. So the idea is to deploy files in different directories depending on the branch pushed to the server. – Folks 2/7, 2016 at 22:33

@YehiaAwad that should be fine, because everything is tested on a local machine and then pushed; the desired behavior is needed on the server only. – Folks 2/7, 2016 at 22:34

@EliSadoff see updated question. – Folks 2/7, 2016 at 22:37

@YehiaAwad see updated question – Folks 2/7, 2016 at 22:38

@torek, @jthill I've learnt quite a bit from your posts. There are great ideas in there. I, however, chose to go with PM2, which does this fairly well from a client machine. I've also found shipit as another alternative for the problem illustrated in my question.

Folks answered 7/8, 2016 at 20:25 Comment(0)

I'm not a Ruby expert, but fortunately that does not seem to matter here: I can read what your code is supposed to do (though not say whether it has any Ruby-specific issues).

If I may rephrase your question a bit, you are saying:

The git checkout command mostly does what I want:
git checkout -f <branch> -- <dir-path>
e.g.
git checkout -f master -- Code/app/
extracts all the files that are named Code/app/somedir/somefile. The part I don't like is that it puts them in files named Code/app/somedir/somefile when I want them to just be in somedir/somefile.

There are ways to rename the files, but none of them are simple and easy, at least not compared to just doing the checkout and then renaming the files. So that—doing the checkout, then renaming the files—is probably what you should do.
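The "check out, then rename" step could be sketched in Ruby roughly as follows. This is only a sketch under assumptions: the helper name `promote_subdir` is made up for illustration, and it assumes the checkout has already populated `deploy_root/subdir`. It replaces any same-named entries left in the deployment root by an earlier deployment.

```ruby
require 'fileutils'
require 'tmpdir'

# Hypothetical helper: move the contents of deploy_root/subdir up into
# deploy_root, then remove the leading directories once they are empty.
def promote_subdir(deploy_root, subdir)
  source = File.join(deploy_root, subdir)
  Dir.mktmpdir('promote', deploy_root) do |staging|
    # Park the contents first, so that an entry named like one of the
    # leading directories cannot collide with them.
    Dir.children(source).each do |entry|
      FileUtils.mv(File.join(source, entry), staging)
    end

    # Remove the emptied leading directories, innermost first.
    dir = source
    while dir != deploy_root && Dir.empty?(dir)
      Dir.rmdir(dir)
      dir = File.dirname(dir)
    end

    # Move everything into place, replacing the previous deployment's copies.
    Dir.children(staging).each do |entry|
      target = File.join(deploy_root, entry)
      FileUtils.rm_rf(target)
      FileUtils.mv(File.join(staging, entry), target)
    end
  end
end
```

In the hook this would run right after the `git checkout` line, for example `promote_subdir(deploy_to_dir, '06 - Code/app')` (with the real, unescaped directory name).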

There are more subtle problems here though.

First, this particular form of git checkout simply adds or replaces files in the work-tree. Suppose that in one of your deployment branches (master, pre-production, and production), you remove some unwanted file(s). The deployment script will add or update any added or updated files, but will not remove files, so removed files will not be removed from your deployment area.

(You may want this behavior for some files. Making it happen for "just the right files" could be tricky.)
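One way to at least see which files have gone stale is to compare what the branch tracks with what is on disk. The sketch below is an assumption-laden illustration, not part of the original script: the helper names are invented, it assumes the deployment directory holds only the contents of the checked-out subdirectory, and it will also flag files that were never tracked (uploads, local config), which is exactly the "just the right files" difficulty.

```ruby
require 'open3'

# Paths tracked under a tree-ish such as "master:code/app",
# relative to that tree (uses `git ls-tree -r --name-only -z`).
def tracked_paths(treeish)
  out, status = Open3.capture2('git', 'ls-tree', '-r', '--name-only', '-z', treeish)
  raise "git ls-tree failed for #{treeish}" unless status.success?
  out.split("\0")
end

# Regular files currently present in the deployment directory,
# relative to it, including dotfiles.
def deployed_paths(deploy_dir)
  Dir.glob('**/*', File::FNM_DOTMATCH, base: deploy_dir)
     .select { |path| File.file?(File.join(deploy_dir, path)) }
end

# Files on disk that the branch no longer contains.
def stale_paths(tracked, deployed)
  deployed - tracked
end
```

A hook could print `stale_paths(tracked_paths('master:code/app'), deployed_paths(deploy_to_dir))` and leave the decision about deleting them to a human.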

Second, your scripts appear to deploy any branch whose name ends with these three names. That might not be a problem in practice: who is going to name their branch fred/master? But if someone does, you might do the wrong thing, unless you would like to deploy such a branch. It may be better to check that the reference name is literally refs/heads/master, refs/heads/pre-production, and/or refs/heads/production.
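A minimal sketch of the exact-match check, assuming the three target directories from the question's script. The constant and helper names are made up for illustration. It also reads every line of the hook's input, since one push can update several refs.

```ruby
# Full ref name => deployment directory (directories taken from the question).
DEPLOY_DIRS = {
  'refs/heads/master'         => '../development',
  'refs/heads/pre-production' => '../staging',
  'refs/heads/production'     => '../'
}.freeze

# post-receive gets one "<old-sha> <new-sha> <refname>" line per updated ref.
def parse_updates(input)
  input.each_line.map(&:split).select { |fields| fields.size == 3 }
end

# Returns nil for any ref that is not literally one of the three names,
# so refs/heads/fred/master is ignored.
def deploy_dir_for(ref)
  DEPLOY_DIRS[ref]
end
```

In the hook, `parse_updates(ARGF.read).each { |from, to, ref| ... }` would then deploy only when `deploy_dir_for(ref)` is not nil.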

Last ... well, this one is complicated. Remember that Git has this thing called the index or staging area (or sometimes cache): when you git add files, you are copying them into the index, and when you git commit, Git takes a snapshot of the index contents, which becomes the new commit. One of the implications here, which I think not enough Git documentation emphasizes, is that the index always contains every file that will be in the next commit. (And after making a commit, the index is not "empty", as several commands and bits of documentation imply: instead, it is full of everything that you just committed.)

The git checkout command makes use of this fact, and it may cause you grief.

You are setting the variable GIT_WORK_TREE before these different checkout commands, but you are not setting GIT_INDEX_FILE. Git will therefore use the index file—the one single standard file, the same file each time—for each of these different checkout commands.

Why the index file matters for checkout

Git tries hard to optimize the checkout. Some of this is just for speed, and some is because a "switch branches" checkout deliberately does not clobber work-tree changes, letting you carry them across to the new branch in case that was what you intended. (See Git - checkout another branch when there are uncommitted changes on the current branch for details.) Either way, Git uses this trick:

If the index entry says the correct file is already in place, and the file in the work-tree is not newer than the index time-stamp, don't touch the work-tree file.

That is, suppose we're pretending to be Git, and we are doing git checkout $branch (with or without -- Code/app or whatever: with and without behave somewhat differently, but for this next case we can use one test). We turn the branch name $branch into a commit ID and turn that into a tree ID and start looking through the tree and its sub-trees. We find a file named Code/app/xyz.rb. We look into the index/cache, and see that it has an entry for Code/app/xyz.rb.

The index's entry says "blob a123456... was installed as Code/app/xyz.rb with time stamp 1467505093 (Sat Jul 2 17:18:13 PDT 2016)". We check work-tree file Code/app/xyz.rb. Its time stamp, obtained via an lstat call, is 1467500000 (about 3:53 PM on the same day, i.e., roughly an hour and a half earlier). Well, that's odd, it went back in time, but clearly it hasn't been changed since we last checked it out, so let's just leave it in place!

But hang on, how did we get into this situation in the first place? Well, about two hours ago we were run to deploy pre-production. We (remember, "we" here is us-being-Git-checkout) actually did copy Git blob a123456... to work-tree Code/app/xyz.rb at that time. But $GIT_WORK_TREE then was "/some/path/to/preprod/area", and now it's "/some/path/to/master/area".

We must have deployed master at some point even earlier than these two, so that the file exists in the current work-tree. But when we did that, the tree for master said to use Git blob 01234567 rather than a123456. Then, two hours ago, we were asked to deploy pre-production and that particular Code/app/xyz.rb was updated to version a123456. Now we're being asked to re-deploy master and its Code/app/xyz.rb should also be a123456, but we see we already deployed version a123456—it's in the index!—and the work-tree version has not been edited since then, so it must be fine.

(It's not fine, that was a different work tree. But we don't know that: the index does not record the path of the work tree; we're assuming that $GIT_WORK_TREE now is the same as $GIT_WORK_TREE then.)
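The scenario can be replayed with a toy model. This is not Git's implementation: it models only the skip rule as described above (same blob in the index, work-tree file not newer than the index time stamp), whereas real Git records more stat data per entry. All names here are invented for the illustration.

```ruby
# Toy index entry: which blob was installed at a path, and when.
IndexEntry = Struct.new(:blob, :mtime)

# index:     { path => IndexEntry }
# work_tree: { path => { blob:, mtime: } }
# tree:      { path => blob } for the commit being deployed
# Returns the paths that were actually written.
def toy_checkout(index, work_tree, tree, now)
  written = []
  tree.each do |path, blob|
    entry = index[path]
    file  = work_tree[path]
    # The skip rule: index says this blob is in place and the file is not newer.
    next if entry && file && entry.blob == blob && file[:mtime] <= entry.mtime

    work_tree[path] = { blob: blob, mtime: now }
    index[path] = IndexEntry.new(blob, now)
    written << path
  end
  written
end

path = 'Code/app/xyz.rb'
shared_index = {}
master_area  = {}
preprod_area = {}

toy_checkout(shared_index, master_area,  { path => '01234567' }, 1467500000)
toy_checkout(shared_index, preprod_area, { path => 'a123456' },  1467505093)
# master now also wants a123456, but the shared index claims it is already installed.
toy_checkout(shared_index, master_area,  { path => 'a123456' },  1467512293)

puts master_area[path][:blob]   # prints 01234567: the master area is stale
```

Giving each area its own index hash makes the third call see blob 01234567 in the index, notice the mismatch, and rewrite the file.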

There are several approaches that can work here:

1. Remove the work tree, so that Git must re-deploy every file.

   This method has the merit of being simple. It has the drawback of making Git copy out every file. Of course, if you're going to rename a sub-directory of the work-tree, this "just works" anyway and is probably a good way to go.

2. Remove the index (cache), so that Git must re-deploy every file.

   This is almost the same as before. Git will rebuild the index.

3. Use one index per work-tree.

   This is the most complicated method, but should be the most efficient. Moreover, it's what the new (since Git version 2.5) git worktree add command does, and git worktree add will hide all the complexity. You can replace this somewhat complicated deployment script with the new worktree feature, provided your Git is new enough.

Synopsis answered 3/7, 2016 at 0:39 Comment(0)

Just do the underlying operation and don't bother updating (or even checking) HEAD:

git read-tree -um `git write-tree` master:code/app

This bypasses the convenience commands and does what you want directly. You're not checking out a development branch, and there's no commit for what you're checking out. git write-tree spits out the tree for what's listed in the index now, master:code/app names that tree in the master commit, and the two-tree git read-tree -um does the transition in the index and worktree.

If you're maintaining multiple worktrees, maintain an index for each of them by exporting GIT_INDEX_FILE as something besides "$GIT_DIR/index".
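In the question's Ruby hook, a per-work-tree index might be wired in roughly like this. It is a sketch under assumptions: the `index.<name>` file naming and both helper names are invented here, and it keeps the question's `git checkout` rather than switching to `git read-tree`. Passing the environment as a hash and the command as separate arguments also avoids shell-escaping the directory name.

```ruby
# Environment for one deployment target. The index file name is an
# arbitrary choice for this sketch; anything other than "$GIT_DIR/index"
# that is unique per work tree would do.
def deploy_env(git_dir, work_tree, name)
  {
    'GIT_WORK_TREE'  => work_tree,
    'GIT_INDEX_FILE' => File.expand_path(File.join(git_dir, "index.#{name}"))
  }
end

# Runs the checkout with that environment. Arguments are passed as a list,
# so a path such as "06 - Code/app/" needs no backslash escaping.
def deploy(git_dir, work_tree, name, branch, path)
  system(deploy_env(git_dir, work_tree, name),
         'git', 'checkout', '-f', branch, '--', path)
end
```

A call might look like `deploy(ENV.fetch('GIT_DIR', '.'), deploy_to_dir, 'staging', 'pre-production', '06 - Code/app/')`.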

Karisakarissa answered 3/7, 2016 at 4:58 Comment(0)
