What's the equivalent of Subversion's "use-commit-times" for Git?

I need the timestamps of files on my local system and on my server to be in sync. With Subversion this is accomplished by setting use-commit-times=true in the configuration, so that the last-modified time of each file is the time it was committed.

Each time I clone my repository, I want the timestamps of files to reflect when they were last changed in the remote repository, not when I cloned the repository.

Is there a way to do this with Git?

Wirth answered 26/12, 2009 at 21:53 Comment(3)

As part of my deploy process, I upload assets (images, javascript files, and css files) to a CDN. The last-modified timestamp is appended to each filename. It's important I don't expire all my assets each time I deploy. (Another side-effect of use-commit-times is that I can do this process on my local machine and know my server will refer to the same files, but that's not as important.) If instead of doing a git clone, I did a git fetch followed by a git reset --hard from my remote repo, that would work for a single server, but not for multiple servers, since the timestamps on each would be different. – Wirth 26/12, 2009 at 22:46

@BenW: git annex might be useful to keep track of images – Recede 12/8, 2012 at 12:44

You can check what's changed by checking id's. You're trying to make filesystem timestamps mean the same thing as vcs timestamps. They don't mean the same thing. – Scotopia 3/7, 2020 at 19:50
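To illustrate the "checking id's" suggestion: a minimal sketch, assuming a repository that uses Git's default SHA-1 object format, of how a blob id can be computed from file content alone. The function name git_blob_id is made up for this example; the id depends only on the bytes of the file, so it is identical on every clone regardless of filesystem timestamps.

```python
import hashlib


def git_blob_id(data: bytes) -> str:
    """Return the id Git assigns to a blob with this content (SHA-1 repos).

    Git hashes the header "blob <size>\\0" followed by the raw bytes.
    """
    header = b"blob %d\0" % len(data)
    return hashlib.sha1(header + data).hexdigest()


# The empty blob has a well-known id.
print(git_blob_id(b""))  # e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
```

For a file already on disk, the same value should be what git hash-object prints for it, provided no content filters (such as line-ending conversion) apply.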

I am not sure this would be appropriate for a DVCS (as in "Distributed" VCS)

A huge discussion already took place in 2007 (see this thread)

And some of Linus's answers were not too keen on the idea. Here is one sample:

I'm sorry. If you don't see how it's WRONG to set a datestamp back to something that will make a simple "make" miscompile your source tree, I don't know what definition of "wrong" you are talking about.
It's WRONG.
It's STUPID.
And it's totally INFEASIBLE to implement.

(Note: small improvement: after a checkout, timestamps of up-to-date files are no longer modified (Git 2.2.2+, January 2015): "git checkout - how can I maintain timestamps when switching branches?".)

The long answer was:

I think you're much better off just using multiple repositories instead, if this is something common.

Messing with timestamps is not going to work in general. It's just going to guarantee you that "make" gets confused in a really bad way, and does not recompile enough instead of recompiling too much.

Git does make it possible to do your "check the other branch out" thing very easily, in many different ways.

You could create some trivial script that does any of the following (ranging from the trivial to the more exotic):
just create a new repo:
  git clone old new
  cd new
  git checkout origin/<branch>
and there you are. The old timestamps are fine in your old repo, and you can work (and compile) in the new one, without affecting the old one at all.

Use the flags "-n -l -s" to "git clone" to basically make this instantaneous. For lots of files (eg big repos like the kernel), it's not going to be as fast as just switching branches, but having a second copy of the working tree can be quite powerful.
do the same thing with just a tar-ball instead, if you want to
  git archive --format=tar --prefix=new-tree/ <branchname> |
          (cd .. ; tar xvf -)
which is really quite fast, if you just want a snapshot.
get used to "git show", and just look at individual files.
This is actually really useful at times. You just do
  git show otherbranch:filename
in one xterm window, and look at the same file in your current branch in another window.

In particular, this should be trivial to do with scriptable editors (ie GNU emacs), where it should be possible to basically have a whole "dired mode" for other branches within the editor, using this.
For all I know, the emacs git mode already offers something like this (I'm not an emacs user)

and in the extreme example of that "virtual directory" thing, there was at least somebody working on a git plugin for FUSE, ie you could literally just have virtual directories showing all your branches.

and I'm sure any of the above are better alternatives than playing games with file timestamps.

Linus

Volgograd answered 26/12, 2009 at 21:54 Comment(11)

Agreed. You shouldn't be confusing a DVCS with a distribution system. git is a DVCS, for manipulating source code that will be built into your final product. If you want a distribution system, you know where to find rsync. – Hair 26/12, 2009 at 22:15

Hm, I'll have to trust his argument that it's infeasible. Whether it's wrong or stupid though is another matter. I version my files using a timestamp and upload them to a CDN, which is why it's important that the timestamps reflect when the file was actually modified, not when it was last pulled down from the repo. – Wirth 26/12, 2009 at 22:18

Does anyone know how else I might preserve last modified times of my files if they're in git repo? – Wirth 26/12, 2009 at 22:23

@Ben W: the "Linus's answer" is not here to say it is wrong in your particular situation. It is there only as a reminder that a DVCS is not well-suited for that kind of feature (timestamp preserving). – Volgograd 26/12, 2009 at 22:26

@VonC: Since other modern DVCS like Bazaar and Mercurial handle timestamps just fine, I'd rather say that "git is not well-suited for that kind of feature". Whether "a" DVCS should have that feature is debatable (and I strongly think it should). – Fanni 26/7, 2013 at 0:21

@Fanni my point precisely. Mercurial and Bazaar cannot handle timestamps any better, since they are distributed. Mercurial would require an extension or a hook like HG_TIMESTAMP_UPDATE to store and manage that information. As for Bazaar, see answers.launchpad.net/bzr/+question/94116#comment-4, which illustrates Linus' point about "Messing with timestamps is not going to work in general. It's just going to guarantee you that "make" gets confused in a really bad way, and does not recompile enough instead of recompiling too much." – Volgograd 26/7, 2013 at 5:52

This is not an answer to the question, but a philosophical discussion about the merits of doing this in a version control system. If the person would have liked that, they would have asked, "What is the reason git doesn't use the commit time for the modified time of files?" – Orfield 31/3, 2014 at 12:24

@thomasfuchs: when someone says "how can I do X" and X is logically infeasible, they usually like to know why and not just be told so, or have their question ignored. In other words, this is a perfectly legitimate answer. – Strawworm 22/6, 2019 at 1:2

Linus's approach of multiple checkouts works even better today, using git's worktrees feature: git-scm.com/docs/git-worktree. I used to use multiple checkouts, setting each as a remote on the others so I could easily push/pull changes. Worktrees make that workflow a lot smoother. Also, Linus is correct. Very few build systems use hashes for dependency-keyed caching of intermediate results. – Convolution 11/4, 2023 at 20:0

Also, one of the things that makes git fast is that it does NOT use file hashes to detect changes in the worktrees. It only hashes when it sees that a file has changed via the modification date being different than what it has recorded for it in the index. Hashes are not terribly expensive if you are reading the file anyway, but they are a horribly expensive way to look for potential changes in a large tree. AFAIK, the only time Git computes a file hash in the normal workflow is when you do git add. Detecting changes to files (not repos) is what modification dates are actually FOR! – Convolution 11/4, 2023 at 20:29
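The idea in the comment above can be sketched roughly as follows. This is a simplified illustration, not Git's actual index logic (Git compares more stat fields than this), and the names stat_signature and probably_changed are invented for the example. The point is that a cheap stat call decides whether the expensive read-and-hash step is needed at all.

```python
import os


def stat_signature(path):
    """Cheap fingerprint of a file: modification time (ns) and size."""
    st = os.stat(path)
    return (st.st_mtime_ns, st.st_size)


def probably_changed(path, recorded):
    """True if the file's stat data differs from what was recorded earlier.

    Only when this returns True would a tool need to read and hash the file.
    """
    return stat_signature(path) != recorded
```

This is also why rewriting timestamps after the fact can mislead tools that rely on them: the stat data is the only thing they look at before deciding a file is unchanged.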

@BobKerns True, I presented the concept of git worktree here in 2015. As for hashes, this is now more graph and graph island: "Updates to the Git Commit Graph Feature", Derrick Stolee, that I presented here starting 2018, and mentioned here in 2020, in relation with git log. – Volgograd 11/4, 2023 at 22:11

UPDATE: My solution is now packaged into Debian, Ubuntu, Linux Mint, Fedora, Gentoo Linux, Arch Linux and possibly other distributions:

https://github.com/MestreLion/git-tools#install

apt install git-restore-mtime  # Debian, Ubuntu, Linux Mint
yum install git-tools          # Fedora, Red Hat Enterprise Linux (RHEL), CentOS
emerge dev-vcs/git-tools       # Gentoo Linux
pacman -S git-tools-git        # Arch Linux

IMHO, not storing timestamps (and other metadata like permissions and ownership) is a big limitation of Git.

Linus' rationale of timestamps being harmful just because it "confuses make" is lame:

make clean is enough to fix any problems.
Applies only to projects that use make, mostly C/C++. It is completely moot for scripts like Python, Perl, or documentation in general.
There is only harm if you apply the timestamps. There would be no harm in storing them in repo. Applying them could be a simple --with-timestamps option for git checkout and friends (clone, pull, etc.), at the user's discretion.

Both Bazaar and Mercurial store metadata. Users can apply it or not when checking out. But in Git, since original timestamps are not even available in the repository, there is no such option.

So, for a very small gain (not having to re-compile everything) that is specific to a subset of projects, Git as a general DVCS was crippled, some information about files is lost, and, as Linus said, it's infeasible to do it now. Sad.

That said, may I offer two approaches?

1 - http://repo.or.cz/w/metastore.git , by David Härdeman. It tries to do what Git should have done in the first place: stores metadata (not only timestamps) in the repository when committing (via a pre-commit hook), and reapplies them when pulling (also via hooks).

2 - My humble version of a script I used before for generating release tarballs. As mentioned in other answers, the approach is a little different: to apply for each file the timestamp of the most recent commit where the file was modified.

git-restore-mtime, with lots of options, supports any repository layout, and runs on Python 3.

Below is a really bare-bones version of the script, as a proof-of-concept, on Python 2.7. For actual usage I strongly recommend the full version above:

#!/usr/bin/env python
# Bare-bones version. Current directory must be top-level of work tree.
# Usage: git-restore-mtime-bare [pathspecs...]
# By default update all files
# Example: to update only the README and files in ./doc:
# git-restore-mtime-bare README doc

import subprocess, shlex
import sys, os.path

filelist = set()
for path in (sys.argv[1:] or [os.path.curdir]):
    if os.path.isfile(path) or os.path.islink(path):
        filelist.add(os.path.relpath(path))
    elif os.path.isdir(path):
        for root, subdirs, files in os.walk(path):
            if '.git' in subdirs:
                subdirs.remove('.git')
            for file in files:
                filelist.add(os.path.relpath(os.path.join(root, file)))

mtime = 0
gitobj = subprocess.Popen(shlex.split('git whatchanged --pretty=%at'),
                          stdout=subprocess.PIPE)
for line in gitobj.stdout:
    line = line.strip()
    if not line: continue

    if line.startswith(':'):
        file = line.split('\t')[-1]
        if file in filelist:
            filelist.remove(file)
            #print mtime, file
            os.utime(file, (mtime, mtime))
    else:
        mtime = long(line)

    # All files done?
    if not filelist:
        break

Performance is pretty impressive, even for monster projects like Wine, Git, or even the Linux kernel:

Bash
# 0.27 seconds
# 5,750 log lines processed
# 62 commits evaluated
# 1,155 updated files

Git
# 3.71 seconds
# 96,702 log lines processed
# 24,217 commits evaluated
# 2,495 updated files

Wine
# 13.53 seconds
# 443,979 log lines processed
# 91,703 commits evaluated
# 6,005 updated files

Linux kernel
# 59.11 seconds
# 1,484,567 log lines processed
# 313,164 commits evaluated
# 40,902 updated files
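For readers on Python 3, the bare-bones approach above could be sketched along these lines. This is not the full git-restore-mtime tool, only an illustration under a few assumptions: the current directory is the top level of the work tree, git log --raw --pretty=%at is used in place of git whatchanged, and paths contain no characters that Git would quote. The function names parse_raw_log and restore_mtimes are made up for this sketch.

```python
import os
import subprocess


def parse_raw_log(lines, wanted):
    """Map each wanted path to the time of the newest commit that touched it.

    `lines` is output of `git log --raw --pretty=%at` (newest commit first):
    one bare epoch timestamp per commit, followed by ':'-prefixed raw diff
    lines whose last tab-separated field is the path.
    """
    remaining = set(wanted)
    mtimes = {}
    commit_time = 0
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(":"):
            path = line.split("\t")[-1]
            if path in remaining:
                remaining.remove(path)
                mtimes[path] = commit_time
        else:
            commit_time = int(line)
        if not remaining:
            break  # every requested file has been dated
    return mtimes


def restore_mtimes(paths):
    """Run git in the current work tree and apply the parsed commit times."""
    proc = subprocess.Popen(
        ["git", "log", "--raw", "--pretty=%at"],
        stdout=subprocess.PIPE,
        text=True,
    )
    try:
        mtimes = parse_raw_log(proc.stdout, paths)
    finally:
        proc.stdout.close()
        proc.wait()
    for path, when in mtimes.items():
        os.utime(path, (when, when))
    return mtimes
```

Calling restore_mtimes({"README", "doc/index.txt"}) from the top of a work tree would then set those two (hypothetical) files to the time of the last commit that modified each. Keeping the parsing separate from the git call makes the logic easy to check against canned log output.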

Fanni answered 8/11, 2012 at 7:13 Comment(5)

Works on Ubuntu 20. Nice hack. Thanks. – Govea 3/5, 2021 at 12:47

To use it, cd to the top level of the working directory and git restore-mtime. If you are in a subdirectory, it will only update what's below. – Facility 20/12, 2021 at 15:12

@Liam: this is by design, as most (if not all) recursive file operations in git also operate on the current directory and below. – Fanni 7/1, 2022 at 6:10

Note for Archers: this tool is packaged there and seems to do essentially the same thing: github.com/alerque/git-warp-time – Baskett 6/9, 2023 at 6:12

@Baskett thanks for pointing out that project! But git-restore-mtime is also available in Arch as git-tools-git – Fanni 6/9, 2023 at 7:52

If, however, you really want to use commit times for timestamps when checking out, then try this script: place it (as an executable) in the file $GIT_DIR/hooks/post-checkout:

#!/bin/sh -e

OS=${OS:-`uname`}
old_rev="$1"
new_rev="$2"

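get_file_rev() {
    # "--" separates the revision from the path, so a file name is never
    # mistaken for a revision or an option
    git rev-list -n 1 "$new_rev" -- "$1"
}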

if   [ "$OS" = 'Linux' ]
then
    update_file_timestamp() {
        file_time=`git show --pretty=format:%ai --abbrev-commit "$(get_file_rev "$1")" | head -n 1`
        touch -d "$file_time" "$1"
    }
elif [ "$OS" = 'FreeBSD' ]
then
    update_file_timestamp() {
        file_time=`date -r "$(git show --pretty=format:%at --abbrev-commit "$(get_file_rev "$1")" | head -n 1)" '+%Y%m%d%H%M.%S'`
        touch -h -t "$file_time" "$1"
    }
else
    echo "timestamp changing not implemented" >&2
    exit 1
fi

IFS=`printf '\t\n\t'`

git ls-files | while read -r file
do
    update_file_timestamp "$file"
done

Note, however, that this script will cause quite a large delay when checking out large repositories (where large means a large number of files, not large file sizes).
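The per-file logic of the hook can also be sketched in Python, which avoids the OS-specific touch and date invocations because os.utime behaves the same across platforms. This is only an illustration, not a drop-in replacement for the hook: the function names are invented here, the run parameter exists purely so the git calls can be swapped out, and the cost is still one git process per tracked file, so the delay on large repositories remains.

```python
import os
import subprocess


def tracked_files(run=subprocess.check_output):
    """List tracked paths; NUL separators keep unusual file names intact."""
    out = run(["git", "ls-files", "-z"])
    return [os.fsdecode(p) for p in out.split(b"\0") if p]


def last_commit_time(path, rev="HEAD", run=subprocess.check_output):
    """Epoch time of the newest commit reachable from rev that touched path."""
    out = run(["git", "log", "-1", "--format=%at", rev, "--", path]).strip()
    return int(out) if out else None


def touch_tracked(rev="HEAD", run=subprocess.check_output):
    """Set each tracked file's mtime to its last commit time."""
    for path in tracked_files(run):
        when = last_commit_time(path, rev, run)
        if when is not None and os.path.exists(path):
            os.utime(path, (when, when))
```

Running touch_tracked() from inside a work tree would update every tracked file below the current directory.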

Delocalize answered 10/1, 2010 at 21:46 Comment(8)

+1 for an actual answer, rather than just saying "Don't do that" – Raddy 10/11, 2010 at 11:23

Many thanks Giel, this is working brilliantly (I actually ported this into my site deployment script, see additional answer below) – Soapstone 3/4, 2011 at 19:7

| head -n 1 should be avoided as it spawns a new process, -n 1 for git rev-list and git log can be used instead. – Crowder 25/3, 2012 at 12:46

It's better NOT to read lines with `...` and for; see Why you don't read lines with "for". I'd go for git ls-files -z and while IFS= read -r -d ''. – Micelle 8/3, 2013 at 10:28

Is a Windows version possible? – Mannie 4/4, 2015 at 0:14

@Mannie with Cygwin this should already work. With Git and a bash shell it might work by reusing the current code for "$OS" = 'Linux'. Otherwise it will probably require a different approach for updating timestamps on Windows, which I don't know how to do (I rarely use Windows). – Delocalize 9/4, 2015 at 9:27

instead of git show --pretty=format:%ai --abbrev-commit "$(get_file_rev "$1")" | head -n 1 you can do git show --pretty=format:%ai -s "$(get_file_rev "$1")", it causes a lot less data to be generated by the show command and should reduce overhead. – Micropathology 27/10, 2016 at 14:33

@Mannie Windows version is among the answers now – Carycaryatid 24/5, 2019 at 16:28