UPDATE: My solution is now packaged into Debian, Ubuntu, Linux Mint, Fedora, Gentoo Linux, Arch Linux and possibly other distributions:
https://github.com/MestreLion/git-tools#install
apt install git-restore-mtime # Debian, Ubuntu, Linux Mint
yum install git-tools # Fedora, Red Hat Enterprise Linux (RHEL), CentOS
emerge dev-vcs/git-tools # Gentoo Linux
pacman -S git-tools-git # Arch Linux
IMHO, not storing timestamps (and other metadata like permissions and ownership) is a big limitation of Git.
Linus' rationale of timestamps being harmful just because it "confuses make" is lame:
- make clean is enough to fix any problems.
- Applies only to projects that use make, mostly C/C++. It is completely moot for scripts like Python, Perl, or documentation in general.
- There is only harm if you apply the timestamps. There would be no harm in storing them in repo. Applying them could be a simple --with-timestamps option for git checkout and friends (clone, pull, etc.), at the user's discretion.
Both Bazaar and Mercurial store metadata. Users can apply them or not when checking out. But in Git, since original timestamps are not even available in the repository, there is no such option.
So, for a very small gain (not having to re-compile everything) that is specific to a subset of projects, Git as a general DVCS was crippled, some information about files is lost, and, as Linus said, it's infeasible to do it now. Sad.
That said, may I offer two approaches?
1 - http://repo.or.cz/w/metastore.git , by David Härdeman. It tries to do what Git should have done in the first place: stores metadata (not only timestamps) in the repository when committing (via a pre-commit hook), and reapplies them when pulling (also via hooks).
2 - My humble version of a script I used before for generating release tarballs. As mentioned in other answers, the approach is a little different: to apply for each file the timestamp of the most recent commit where the file was modified.
- git-restore-mtime, with lots of options, supports any repository layout, and runs on Python 3.
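To make the first approach more concrete, here is a minimal Python 3 sketch of the "store metadata when committing, reapply it afterwards" idea. It is not metastore's code or file format: the function names `save_mtimes` and `restore_mtimes` and the JSON manifest are assumptions made up for illustration, and only modification times are handled.

```python
import json
import os


def save_mtimes(paths, manifest):
    """Record each file's mtime (in nanoseconds) in a JSON manifest.

    The manifest layout is hypothetical, not metastore's format.
    """
    data = {path: os.stat(path).st_mtime_ns for path in paths}
    with open(manifest, "w") as fh:
        json.dump(data, fh, indent=0, sort_keys=True)
    return data


def restore_mtimes(manifest):
    """Reapply recorded mtimes; files that no longer exist are skipped."""
    with open(manifest) as fh:
        data = json.load(fh)
    restored = []
    for path, mtime_ns in data.items():
        if os.path.exists(path):
            # Set both access and modification time to the stored value.
            os.utime(path, ns=(mtime_ns, mtime_ns))
            restored.append(path)
    return restored
```

In a setup like this, `save_mtimes` could be called from a pre-commit hook (with the manifest committed alongside the files) and `restore_mtimes` from a post-checkout or post-merge hook, so applying the timestamps stays at the user's discretion.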
Below is a really bare-bones version of the script, as a proof-of-concept, on Python 2.7. For actual usage I strongly recommend the full version above:
#!/usr/bin/env python
# Bare-bones version. Current directory must be top-level of work tree.
# Usage: git-restore-mtime-bare [pathspecs...]
# By default update all files
# Example: to update only the README and files in ./doc:
# git-restore-mtime-bare README doc
import subprocess, shlex
import sys, os.path
filelist = set()
for path in (sys.argv[1:] or [os.path.curdir]):
    if os.path.isfile(path) or os.path.islink(path):
        filelist.add(os.path.relpath(path))
    elif os.path.isdir(path):
        for root, subdirs, files in os.walk(path):
            if '.git' in subdirs:
                subdirs.remove('.git')
            for file in files:
                filelist.add(os.path.relpath(os.path.join(root, file)))

mtime = 0
gitobj = subprocess.Popen(shlex.split('git whatchanged --pretty=%at'),
                          stdout=subprocess.PIPE)
for line in gitobj.stdout:
    line = line.strip()
    if not line: continue

    if line.startswith(':'):
        file = line.split('\t')[-1]
        if file in filelist:
            filelist.remove(file)
            #print mtime, file
            os.utime(file, (mtime, mtime))
    else:
        mtime = long(line)

    # All files done?
    if not filelist:
        break
Performance is pretty impressive, even for monster projects wine, git or even the Linux kernel:
Bash
# 0.27 seconds
# 5,750 log lines processed
# 62 commits evaluated
# 1,155 updated files
Git
# 3.71 seconds
# 96,702 log lines processed
# 24,217 commits evaluated
# 2,495 updated files
Wine
# 13.53 seconds
# 443,979 log lines processed
# 91,703 commits evaluated
# 6,005 updated files
Linux kernel
# 59.11 seconds
# 1,484,567 log lines processed
# 313,164 commits evaluated
# 40,902 updated files
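The bare-bones script above targets Python 2.7. For readers on Python 3, its core loop might be sketched roughly as below. This is only an illustration under a few assumptions: the function names `latest_commit_times` and `restore` are made up here, `git log --raw --no-merges` is used in place of `git whatchanged` (the Git documentation describes the two as essentially equivalent), and paths that Git quotes because of special characters are not handled. The full git-restore-mtime remains the recommended tool.

```python
import os
import subprocess


def latest_commit_times(log_lines, wanted):
    """Map each path in `wanted` to the timestamp of the newest commit touching it.

    `log_lines` is an iterable of text lines in the format printed by
    `git log --raw --no-merges --pretty=%at`: a bare integer line starts
    a commit, and each line beginning with ':' describes one changed file.
    """
    remaining = set(wanted)
    result = {}
    mtime = 0
    for line in log_lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(':'):
            # The path is the last tab-separated field of a raw diff line.
            path = line.split('\t')[-1]
            if path in remaining:
                remaining.remove(path)
                result[path] = mtime
        else:
            mtime = int(line)
        # Stop reading the log as soon as every wanted file is resolved.
        if not remaining:
            break
    return result


def restore(paths):
    """Apply the computed timestamps; run from the top level of the work tree."""
    proc = subprocess.Popen(
        ['git', 'log', '--raw', '--no-merges', '--pretty=%at'],
        stdout=subprocess.PIPE, text=True)
    for path, mtime in latest_commit_times(proc.stdout, paths).items():
        os.utime(path, (mtime, mtime))
    proc.stdout.close()
    proc.wait()
```

Calling `restore({'README', 'doc/guide.txt'})` from the top of a work tree would then touch only those two files. Keeping the parsing in a separate function means it can be exercised on canned log lines without a repository.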
git annex might be useful to keep track of images – Recede