What's the equivalent of Subversion's "use-commit-times" for Git?
W

11

115

I need the timestamps of files on my local system and on my server to be in sync. This is accomplished with Subversion by setting use-commit-times=true in the configuration so that the last modified of each file is when it was committed.

Each time I clone my repository, I want the timestamps of files to reflect when they were last changed in the remote repository, not when I cloned the repository.

Is there a way to do this with Git?

Wirth answered 26/12, 2009 at 21:53 Comment(3)
As part of my deploy process, I upload assets (images, javascript files, and css files) to a CDN. Each filename is appended with the last modified timestamp. It's important I don't expire all my assets each time I deploy. (Another side-effect of use-commit-times is that I can do this process on my local and know my server will refer to the same files, but that's not as important.) If instead of doing a git clone, I did a git fetch followed by a git reset --hard from my remote repo, that would work for a single server, but not for multiple servers since the timestamps on each would be diff.Wirth
@BenW: git annex might be useful to keep track of imagesRecede
You can check what's changed by checking id's. You're trying to make filesystem timestamps mean the same thing as vcs timestamps. They don't mean the same thing.Scotopia
V
31

I am not sure this would be appropriate for a DVCS (as in "Distributed" VCS)

The huge discussion had already taken place in 2007 (see this thread)

And some of Linus's answers were not too keen on the idea. Here is one sample:

I'm sorry. If you don't see how it's WRONG to set a datestamp back to something that will make a simple "make" miscompile your source tree, I don't know what defintiion of "wrong" you are talking about.
It's WRONG.
It's STUPID.
And it's totally INFEASIBLE to implement.


(Note: small improvement: after a checkout, timestamps of up-to-date files are no longer modified (Git 2.2.2+, January 2015): "git checkout - how can I maintain timestamps when switching branches?".)


The long answer was:

I think you're much better off just using multiple repositories instead, if this is something common.

Messing with timestamps is not going to work in general. It's just going to guarantee you that "make" gets confused in a really bad way, and does not recompile enough instead of recompiling too much.

Git does make it possible to do your "check the other branch out" thing very easily, in many different ways.

You could create some trivial script that does any of the following (ranging from the trivial to the more exotic):

  • just create a new repo:

      git clone old new
      cd new
      git checkout origin/<branch>
    

and there you are. The old timestamps are fine in your old repo, and you can work (and compile) in the new one, without affecting the old one at all.

Use the flags "-n -l -s" to "git clone" to basically make this instantaneous. For lots of files (eg big repos like the kernel), it's not going to be as fast as just switching branches, but having a second copy of the working tree can be quite powerful.

  • do the same thing with just a tar-ball instead, if you want to

      git archive --format=tar --prefix=new-tree/ <branchname> |
              (cd .. ; tar xvf -)
    

which is really quite fast, if you just want a snapshot.

  • get used to "git show", and just look at individual files.
    This is actually really useful at times. You just do

      git show otherbranch:filename
    

in one xterm window, and look at the same file in your current branch in another window.

In particular, this should be trivial to do with scriptable editors (ie GNU emacs), where it should be possible to basically have a whole "dired mode" for other branches within the editor, using this.
For all I know, the emacs git mode already offers something like this (I'm not an emacs user)

  • and in the extreme example of that "virtual directory" thing, there was at least somebody working on a git plugin for FUSE, ie you could literally just have virtual directories showing all your branches.

and I'm sure any of the above are better alternatives than playing games with file timestamps.

Linus

Volgograd answered 26/12, 2009 at 21:54 Comment(11)
Agreed. You shouldn't be confusing a DVCS with a distribution system. git is a DVCS, for manipulating source code that will be built into your final product. If you want a distribution system, you know where to find rsync.Hair
Hm, I'll have to trust his argument that it's infeasible. Whether it's wrong or stupid though is another matter. I version my files using a timestamp and upload them to a CDN, which is why it's important that the timestamps reflect when the file was actually modified, not when it was last pulled down from the repo.Wirth
Does anyone know how else I might preserve last modified times of my files if they're in git repo?Wirth
@Ben W: the "Linus's answer" is not here to say it is wrong in your particular situation. It is there only as a reminder that a DVCS is not well-suited for that kind of feature (timestamp preserving).Volgograd
@VonC: Since other modern DVCS like Bazaar and Mercurial handle timestamps just fine, I'd rather say that "git is not well-suited for that kind of feature". If "a" DVCS should have that feature is debatable (and I strongly think they do).Fanni
@Fanni my point precisely. Mercurial and Bazaar cannot handle timestamps any better, since they are distributed. Mercurial would require an extension or a hook like HG_TIMESTAMP_UPDATE to store and manage that information. As for Bazaar, see answers.launchpad.net/bzr/+question/94116#comment-4, which illustrates Linus' point about "Messing with timestamps is not going to work in general. It's just going to guarantee you that "make" gets confused in a really bad way, and does not recompile enough instead of recompiling too much."Volgograd
This is not an answer to the question, but a philosophical discussion about the merits of doing this in a version control system. If the person would have liked that, they would have asked, "What is the reason git doesn't use the commit time for the modified time of files?"Orfield
@thomasfuchs: when someone says "how can I do X" and X is logically infeasible, they usually like to know why and not just be told so, or have their question ignored. In other words, this is a perfectly legitimate answer.Strawworm
Linus's approach of multiple checkouts works even better today, using git's worktrees feature: git-scm.com/docs/git-worktree. I used to use multiple checkouts, setting each as a remote on the others so I could easily push/pull changes. Worktrees make that workflow a lot smoother. Also, Linus is correct. Very few build systems use hashes for dependency-keyed caching of intermediate results.Convolution
Also, one of the things that makes git fast is that it does NOT use file hashes to detect changes in the worktrees. It only hashes when it sees that a file has changed via the modification date being different than what it has recorded for it in the index. Hashes are not terribly expensive if you are reading the file anyway, but they are a horribly expensive way to look for potential changes in a large tree. AFAIK, the only time Git computes a file hash in the normal workflow is when you do git add. Detecting changes to files (not repos) is what modification dates are actually FOR!Convolution
@BobKerns True, I presented the concept of git worktree here in 2015. As for hashes, this is now more graph and graph island: "Updates to the Git Commit Graph Feature", Derrick Stolee, that I presented here starting 2018, and mentioned here in 2020, in relation with git log.Volgograd
F
112

UPDATE: My solution is now packaged into Debian, Ubuntu, Linux Mint, Fedora, Gentoo Linux, Arch Linux and possibly other distributions:

https://github.com/MestreLion/git-tools#install

apt install git-restore-mtime  # Debian, Ubuntu, Linux Mint
yum install git-tools          # Fedora, Red Hat Enterprise Linux (RHEL), CentOS
emerge dev-vcs/git-tools       # Gentoo Linux
pacman -S git-tools-git        # Arch Linux

IMHO, not storing timestamps (and other metadata like permissions and ownership) is a big limitation of Git.

Linus' rationale of timestamps being harmful just because it "confuses make" is lame:

  • make clean is enough to fix any problems.

  • Applies only to projects that use make, mostly C/C++. It is completely moot for scripts like Python, Perl, or documentation in general.

  • There is only harm if you apply the timestamps. There would be no harm in storing them in repo. Applying them could be a simple --with-timestamps option for git checkout and friends (clone, pull, etc.), at the user's discretion.

Both Bazaar and Mercurial store metadata. Users can apply them or not when checking out. But in Git, since original timestamps are not even available in the repository, there is no such option.

So, for a very small gain (not having to re-compile everything) that is specific to a subset of projects, Git as a general DVCS was crippled, some information about files is lost, and, as Linus said, it's infeasible to do it now. Sad.

That said, may I offer two approaches?

1 - http://repo.or.cz/w/metastore.git , by David Härdeman. It tries to do what Git should have done in the first place: stores metadata (not only timestamps) in the repository when committing (via a pre-commit hook), and reapplies them when pulling (also via hooks).

2 - My humble version of a script I used before for generating release tarballs. As mentioned in other answers, the approach is a little different: to apply for each file the timestamp of the most recent commit where the file was modified.

  • git-restore-mtime, with lots of options, supports any repository layout, and runs on Python 3.

Below is a really bare-bones version of the script, as a proof-of-concept, on Python 2.7. For actual usage I strongly recommend the full version above:

#!/usr/bin/env python
# Bare-bones version. Current directory must be top-level of work tree.
# Usage: git-restore-mtime-bare [pathspecs...]
# By default update all files
# Example: to update only the README and files in ./doc:
# git-restore-mtime-bare README doc

import subprocess, shlex
import sys, os.path

filelist = set()
for path in (sys.argv[1:] or [os.path.curdir]):
    if os.path.isfile(path) or os.path.islink(path):
        filelist.add(os.path.relpath(path))
    elif os.path.isdir(path):
        for root, subdirs, files in os.walk(path):
            if '.git' in subdirs:
                subdirs.remove('.git')
            for file in files:
                filelist.add(os.path.relpath(os.path.join(root, file)))

mtime = 0
gitobj = subprocess.Popen(shlex.split('git whatchanged --pretty=%at'),
                          stdout=subprocess.PIPE)
for line in gitobj.stdout:
    line = line.strip()
    if not line: continue

    if line.startswith(':'):
        file = line.split('\t')[-1]
        if file in filelist:
            filelist.remove(file)
            #print mtime, file
            os.utime(file, (mtime, mtime))
    else:
        mtime = long(line)

    # All files done?
    if not filelist:
        break

Performance is pretty impressive, even for monster projects like Wine, Git or even the Linux kernel:

Bash
# 0.27 seconds
# 5,750 log lines processed
# 62 commits evaluated
# 1,155 updated files

Git
# 3.71 seconds
# 96,702 log lines processed
# 24,217 commits evaluated
# 2,495 updated files

Wine
# 13.53 seconds
# 443,979 log lines processed
# 91,703 commits evaluated
# 6,005 updated files

Linux kernel
# 59.11 seconds
# 1,484,567 log lines processed
# 313,164 commits evaluated
# 40,902 updated files
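
For readers on Python 3, the bare-bones script above can be sketched roughly as follows. This is only an illustration, not the packaged git-restore-mtime: the function names latest_commit_times and restore_mtimes are made up here, and it assumes that git log --raw --no-merges is an acceptable stand-in for git whatchanged and that paths contain no characters Git would quote in raw output.

```python
import os
import subprocess


def latest_commit_times(log_lines, wanted):
    """Map each path in `wanted` to the author time of the newest commit touching it.

    `log_lines` is newest-first output of `git log --raw --pretty=%at`:
    a line holding an epoch timestamp, followed by ':'-prefixed raw diff lines.
    """
    remaining = set(wanted)
    times = {}
    mtime = 0
    for line in log_lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(":"):
            # For renames the last tab-separated field is the new name
            path = line.split("\t")[-1]
            if path in remaining:
                remaining.remove(path)
                times[path] = mtime
                if not remaining:  # all files done, stop reading the log
                    break
        else:
            mtime = int(line)
    return times


def restore_mtimes():
    """Stamp every tracked file; run from the top level of the work tree."""
    listing = subprocess.check_output(["git", "ls-files", "-z"])
    tracked = {os.fsdecode(p) for p in listing.split(b"\0") if p}
    # Assumption: this command approximates `git whatchanged --pretty=%at`
    cmd = ["git", "-c", "core.quotePath=false",
           "log", "--raw", "--no-merges", "--pretty=%at"]
    with subprocess.Popen(cmd, stdout=subprocess.PIPE, text=True,
                          encoding="utf-8", errors="surrogateescape") as log:
        times = latest_commit_times(log.stdout, tracked)
    for path, mtime in times.items():
        if os.path.exists(path):  # skips deleted files and broken symlinks
            os.utime(path, (mtime, mtime))
```

Calling restore_mtimes() from the top of a work tree applies the times. The parsing is kept in its own function so it can be checked against canned log lines without a repository.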
Fanni answered 8/11, 2012 at 7:13 Comment(5)
Works on Ubuntu 20. Nice hack. Thanks.Govea
To use it, cd to the top level of the working directory and git restore-mtime. If you are in a subdirectory, it will only update what's below.Facility
@Liam: this is by design, as most (if not all) recursive file operations in git also operate on the current directory and below.Fanni
note for archers: there this tool is packaged which seems to do essentially the same thing: github.com/alerque/git-warp-timeBaskett
@Baskett thanks for pointing out that project! But git-restore-mtime is also available in Arch as git-tools-gitFanni
D
96

If, however, you really want to use commit times for timestamps when checking out, then try this script. Place it (as an executable) in the file $GIT_DIR/hooks/post-checkout (that is, .git/hooks/post-checkout in a normal clone):

#!/bin/sh -e

OS=${OS:-`uname`}
old_rev="$1"
new_rev="$2"

get_file_rev() {
    git rev-list -n 1 "$new_rev" "$1"
}

if   [ "$OS" = 'Linux' ]
then
    update_file_timestamp() {
        file_time=`git show --pretty=format:%ai --abbrev-commit "$(get_file_rev "$1")" | head -n 1`
        touch -d "$file_time" "$1"
    }
elif [ "$OS" = 'FreeBSD' ]
then
    update_file_timestamp() {
        file_time=`date -r "$(git show --pretty=format:%at --abbrev-commit "$(get_file_rev "$1")" | head -n 1)" '+%Y%m%d%H%M.%S'`
        touch -h -t "$file_time" "$1"
    }
else
    echo "timestamp changing not implemented" >&2
    exit 1
fi

IFS=`printf '\t\n\t'`

git ls-files | while read -r file
do
    update_file_timestamp "$file"
done

Note however, that this script will cause quite a large delay for checking out large repositories (where large means large amount of files, not large file sizes).
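
The same per-file idea can be sketched in Python, reading the file list NUL-delimited so that names containing spaces or newlines survive. This is a rough sketch under assumptions, not a drop-in replacement for the hook: the helper names split_nul, last_commit_time and restore_all are invented for illustration, it uses author time (%at), and it still spawns one git process per file, so it is just as slow on large repositories.

```python
import os
import subprocess


def split_nul(data: bytes) -> list:
    """Split the NUL-delimited output of `git ls-files -z` into path strings."""
    return [os.fsdecode(p) for p in data.split(b"\0") if p]


def last_commit_time(path, rev="HEAD"):
    """Return the author time (epoch seconds) of the newest commit touching `path`."""
    out = subprocess.check_output(
        ["git", "log", "-1", "--format=%at", rev, "--", path]
    ).strip()
    # Empty output means no commit reachable from `rev` touches the path
    return int(out) if out else None


def restore_all():
    """Stamp every tracked file; one git call per file, so slow on big trees."""
    for path in split_nul(subprocess.check_output(["git", "ls-files", "-z"])):
        if os.path.islink(path) or not os.path.exists(path):
            continue  # leave symlinks and missing files alone
        stamp = last_commit_time(path)
        if stamp is not None:
            os.utime(path, (stamp, stamp))
```

Running restore_all() from anywhere inside the work tree stamps the tracked files at or below the current directory, since git ls-files lists paths relative to it.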

Delocalize answered 10/1, 2010 at 21:46 Comment(8)
+1 for an actual answer, rather than just saying "Don't do that"Raddy
Many thanks Giel, this is working brilliantly (I actually ported this into my site deployment script, see additional answer below)Soapstone
| head -n 1 should be avoided as it spawns a new process, -n 1 for git rev-list and git log can be used instead.Crowder
It's better NOT to read lines with `...` and for; see Why you don't read lines with "for". I'd go for git ls-files -z and while IFS= read -r -d ''.Micelle
Is a Windows version possible?Mannie
@Mannie with Cygwin this should already work, with Git and a bash shell it might by accepting the current code for "$OS" = 'Linux', otherwise it will probably require a different approach for updating time stamps on Windows, of which I'm not aware how to do (I rarely use Windows).Delocalize
instead of git show --pretty=format:%ai --abbrev-commit "$(get_file_rev "$1")" | head -n 1 you can do git show --pretty=format:%ai -s "$(get_file_rev "$1")", it causes a lot less data to be generated by the show command and should reduce overhead.Micropathology
@Mannie Windows version is among the answers nowCarycaryatid
S
13

I took Giel's answer and, instead of using a post-checkout hook script, worked it into my custom deployment script.

Update: I've also removed one | head -n following @eregon's suggestion, and added support for files with spaces in them:

# Adapted to use HEAD rather than the new commit ref
get_file_rev() {
    git rev-list -n 1 HEAD "$1"
}

# Same as Giel's answer above
update_file_timestamp() {
    file_time=`git show --pretty=format:%ai --abbrev-commit "$(get_file_rev "$1")" | head -n 1`
    sudo touch -d "$file_time" "$1"
}

# Loop through and fix timestamps on all files in our CDN directory
old_ifs=$IFS
IFS=$'\n' # Support files with spaces in them
for file in $(git ls-files | grep "$cdn_dir")
do
    update_file_timestamp "${file}"
done
IFS=$old_ifs
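
Git's %ai placeholder prints an ISO 8601-like date, while %aI prints strict ISO 8601 and %at prints plain epoch seconds. If the date has to be parsed outside of touch, a minimal Python sketch could look like this, assuming Python 3.7+ and input produced by %aI; the helper names iso_to_epoch and apply_iso_time are hypothetical.

```python
import os
from datetime import datetime


def iso_to_epoch(stamp: str) -> int:
    """Convert a strict ISO 8601 date with UTC offset (as printed by %aI) to epoch seconds."""
    return int(datetime.fromisoformat(stamp).timestamp())


def apply_iso_time(path: str, stamp: str) -> None:
    """Set both atime and mtime of `path` to the given ISO 8601 date."""
    seconds = iso_to_epoch(stamp)
    os.utime(path, (seconds, seconds))
```

Asking Git for %at directly avoids the parsing step altogether; the conversion is only useful when an ISO date is all that is available.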
Soapstone answered 3/4, 2011 at 19:12 Comment(3)
Thanks Daniel, that's helpful to knowSoapstone
the --abbrev-commit is superfluous in git show command due --pretty=format:%ai being used (commit hash isn't part of output) and | head -n 1 could be replaced with using -s flag to git showSusan
@DanielS.Sterling: %ai is author date, ISO 8601 like format, for strict iso8601 use %aI: git-scm.com/docs/git-showSusan
L
6

We were forced to invent yet another solution, because we specifically needed modification times rather than commit times, and the solution also had to be portable (getting Python to work with Git installations on Windows really is not a simple task) and fast. It resembles David Härdeman's solution, which I decided not to use because of its lack of documentation (from the repository I could not work out what exactly his code does).

This solution stores mtimes in a file .mtimes in the Git repository, updates them on commit (selectively, only the mtimes of staged files) and applies them on checkout. It works even with the Cygwin/MinGW versions of Git (but you may need to copy some files from a standard Cygwin installation into Git's folder).

The solution consists of three files:

  1. mtimestore - the core script, providing three options: -a (save all, for initialization in an already existing repository; works on Git-tracked files), -s (save staged changes), and -r (restore them). It comes in two versions: a Bash one (portable, nice, easy to read and modify) and a C++ one (messy but fast, because MinGW Bash is horribly slow, which makes the Bash version impractical on big projects).
  2. pre-commit hook
  3. post-checkout hook

Pre-commit:

#!/bin/bash
mtimestore -s
git add .mtimes

Post-checkout

#!/bin/bash
mtimestore -r

mtimestore - Bash:

#!/bin/bash

function usage
{
  echo "Usage: mtimestore (-a|-s|-r)"
  echo "Option  Meaning"
  echo " -a save-all - saves state of all files in a git repository"
  echo " -s save - saves mtime of all staged files of git repository"
  echo " -r restore - touches all files saved in .mtimes file"
  exit 1
}

function echodate
{
  echo "$(stat -c %Y "$1")|$1" >> .mtimes
}

IFS=$'\n'

while getopts ":sar" optname
do
  case "$optname" in
    "s")
      echo "saving changes of staged files to file .mtimes"
      if [ -f .mtimes ]
      then
        mv .mtimes .mtimes_tmp
        pattern=".mtimes"
        for str in $(git diff --name-only --staged)
        do
          pattern="$pattern\|$str"
        done
        cat .mtimes_tmp | grep -vh "|\($pattern\)\b" >> .mtimes
      else
        echo "warning: file .mtimes does not exist - creating new"
      fi

      for str in $(git diff --name-only --staged)
      do
        echodate "$str"
      done
      rm .mtimes_tmp 2> /dev/null
      ;;
    "a")
      echo "saving mtimes of all files to file .mtimes"
      rm .mtimes 2> /dev/null
      for str in $(git ls-files)
      do
        echodate "$str"
      done
      ;;
    "r")
      echo "restoring dates from .mtimes"
      if [ -f .mtimes ]
      then
        cat .mtimes | while read line
        do
          timestamp=$(date -d "1970-01-01 ${line%|*} sec GMT" +%Y%m%d%H%M.%S)
          touch -t $timestamp "${line##*|}"
        done
      else
        echo "warning: .mtimes not found"
      fi
      ;;
    ":")
      usage
      ;;
    *)
      usage
      ;;
  esac
done

mtimestore - C++

#include <time.h>
#include <utime.h>
#include <sys/stat.h>
#include <iostream>
#include <cstdlib>
#include <fstream>
#include <string>
#include <cerrno>
#include <cstring>
#include <sys/types.h>
#include <ctime>
#include <map>
#include <cctype>   // isdigit
#include <cstdio>   // popen, pclose, fgets


void changedate(int time, const char* filename)
{
  try
  {
    struct utimbuf new_times;
    struct stat foo;
    stat(filename, &foo);

    new_times.actime = foo.st_atime;
    new_times.modtime = time;
    utime(filename, &new_times);
  }
  catch(...)
  {}
}

bool parsenum(int& num, char*& ptr)
{
  num = 0;
  if(!isdigit(*ptr))
    return false;
  while(isdigit(*ptr))
  {
    num = num*10 + (int)(*ptr) - 48;
    ptr++;
  }
  return true;
}

// Splits the line into a numeral and text part - returns the numeral into 'time' and set 'ptr' to the position where the filename starts
bool parseline(const char* line, int& time, char*& ptr)
{
  if(*line == '\n' || *line == '\r')
    return false;
  time = 0;
  ptr = (char*)line;
  if(parsenum(time, ptr))
  {
    ptr++;
    return true;
  }
  else
    return false;
}

// Replace \r and \n (otherwise is interpreted as part of filename)
void trim(char* string)
{
  char* ptr = string;
  while(*ptr != '\0')
  {
    if(*ptr == '\n' || *ptr == '\r')
      *ptr = '\0';
    ptr++;
  }
}


void help()
{
  std::cout << "version: 1.4" << std::endl;
  std::cout << "usage: mtimestore <switch>" << std::endl;
  std::cout << "options:" << std::endl;
  std::cout << "  -a  saves mtimes of all Git-tracked files into .mtimes file (meant to be done on initialization of mtime fixes)" << std::endl;
  std::cout << "  -s  saves mtimes of modified staged files into .mtimes file (meant to be put into pre-commit hook)" << std::endl;
  std::cout << "  -r  restores mtimes from .mtimes file (that is meant to be stored in repository server-side and to be called in post-checkout hook)" << std::endl;
  std::cout << "  -h  show this help" << std::endl;
}

void load_file(const char* file, std::map<std::string, int>& mapa)
{

  std::string line;
  std::ifstream myfile (file, std::ifstream::in);

  if(myfile.is_open())
  {
      while (myfile.good())
      {
        getline (myfile, line);
        int time;
        char* ptr;
        if(parseline(line.c_str(), time, ptr))
        {
          if(std::string(ptr) != std::string(".mtimes"))
            mapa[std::string(ptr)] = time;
        }
      }
    myfile.close();
  }

}

void update(std::map<std::string, int>& mapa, bool all)
{
  char path[2048];
  FILE *fp;
  if(all)
    fp = popen("git ls-files", "r");
  else
    fp = popen("git diff --name-only --staged", "r");

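  if(fp == NULL)
    return;

  while(fgets(path, 2048, fp) != NULL)
  {
    trim(path);
    struct stat foo;
    // Skip entries that cannot be stat'ed (e.g. staged deletions)
    if(stat(path, &foo) != 0)
      continue;
    if(std::string(path) != std::string(".mtimes"))
      mapa[std::string(path)] = foo.st_mtime;
  }
  pclose(fp);
}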

void write(const char * file, std::map<std::string, int>& mapa)
{
  std::ofstream outputfile;
  outputfile.open(".mtimes", std::ios::out);
  for(std::map<std::string, int>::iterator itr = mapa.begin(); itr != mapa.end(); ++itr)
  {
    if(*(itr->first.c_str()) != '\0')
    {
      outputfile << itr->second << "|" << itr->first << std::endl;
    }
  }
  outputfile.close();
}

int main(int argc, char *argv[])
{
  if(argc >= 2 && argv[1][0] == '-')
  {
    switch(argv[1][1])
    {
      case 'r':
        {
          std::cout << "restoring modification dates" << std::endl;
          std::string line;
          std::ifstream myfile(".mtimes");
          if (myfile.is_open())
          {
            while (myfile.good())
            {
              getline (myfile, line);
              int time;
              char* ptr;
              // Only touch the file when the line parsed as "<mtime>|<path>"
              if(parseline(line.c_str(), time, ptr))
                changedate(time, ptr);
            }
            myfile.close();
          }
        }
        break;

      case 'a':
      case 's':
        {
          std::cout << "saving modification times" << std::endl;

          std::map<std::string, int> mapa;
          load_file(".mtimes", mapa);
          update(mapa, argv[1][1] == 'a');
          write(".mtimes", mapa);
        }
        break;

      default:
        help();
        return 0;
    }
  }
  else
  {
    help();
    return 0;
  }

  return 0;
}
  • Note that the hooks can be placed into the template directory to automate their installation.

More information may be found on kareltucek / git-mtime-extension

//edit - C++ version updated:

  • Now the C++ version maintains alphabetical ordering → fewer merge conflicts.
  • Got rid of the ugly system() calls.
  • Deleted git update-index --refresh from the post-checkout hook. It caused some problems with revert under TortoiseGit, and does not seem to be very important anyway.
  • Our Windows (by now ages old and probably nonfunctional) package can be downloaded at http://ktweb.cz/blog/download/git-mtimestore-1.4.rar

//edit see GitHub for an up-to-date version
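
For illustration only, the core of the .mtimes idea (one "mtime|path" line per file, written on save and re-applied on restore) can be sketched in a few lines of Python. This is not the mtimestore code: the function names save_mtimes and restore_mtimes are invented here, and the sketch leaves out the Git integration (staged-file selection and hooks).

```python
import os


def save_mtimes(paths, store=".mtimes"):
    """Write one 'mtime|path' line per file, sorted by path for stable diffs."""
    with open(store, "w", encoding="utf-8") as out:
        for path in sorted(paths):
            out.write(f"{int(os.stat(path).st_mtime)}|{path}\n")


def restore_mtimes(store=".mtimes"):
    """Re-apply the saved modification times, leaving access times alone."""
    with open(store, encoding="utf-8") as src:
        for line in src:
            stamp, sep, path = line.rstrip("\n").partition("|")
            if not sep or not stamp.isdigit() or not os.path.exists(path):
                continue  # skip malformed lines and files that no longer exist
            atime = os.stat(path).st_atime
            os.utime(path, (atime, int(stamp)))
```

Sorting the entries mirrors the point about alphabetical ordering: a stable order keeps diffs of the store file small and should reduce merge conflicts.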

Lynnalynne answered 8/9, 2013 at 8:57 Comment(3)
Note that after a checkout, timestamps of up-to-date files are no longer modified (Git 2.2.2+, January 2015): https://mcmap.net/q/13029/-39-git-checkout-39-how-can-i-maintain-timestamps-when-switching-branchesVolgograd
The first www.ktweb.cz link is broken ("Hmm. We’re having trouble finding that site. We can’t connect to the server at www.ktweb.cz."). But strangely the second link does work. Trouble with "www."? Though dropping it and using http://ktweb.cz/blog/index.php?page=page&id=116 results in "403 Forbidden".Adulteress
Indeed, the web is no longer running. I have removed the link. The github is the relevant source now - feel free to fire tickets or PRs there.Lynnalynne
V
3

The following script incorporates the -n 1 and HEAD suggestions, works in most non-Linux environments (like Cygwin), and can be run on a checkout after the fact:

#!/bin/bash -e

OS=${OS:-`uname`}

get_file_rev() {
    git rev-list -n 1 HEAD "$1"
}    

if [ "$OS" = 'FreeBSD' ]
then
    update_file_timestamp() {
        file_time=`date -r "$(git show --pretty=format:%at --abbrev-commit "$(get_file_rev "$1")" | head -n 1)" '+%Y%m%d%H%M.%S'`
        touch -h -t "$file_time" "$1"
    }    
else    
    update_file_timestamp() {
        file_time=`git show --pretty=format:%ai --abbrev-commit "$(get_file_rev "$1")" | head -n 1`
        touch -d "$file_time" "$1"
    }    
fi    

OLD_IFS=$IFS
IFS=$'\n'

for file in `git ls-files`
do
    update_file_timestamp "$file"
done

IFS=$OLD_IFS

git update-index --refresh

Assuming you named the above script /path/to/templates/hooks/post-checkout and/or /path/to/templates/hooks/post-update, you can run it on an existing repository via:

git clone git://path/to/repository.git
cd repository
/path/to/templates/hooks/post-checkout
Vanvanadate answered 18/7, 2012 at 14:32 Comment(2)
It needs one more last line: git update-index --refresh // GUI tools might rely on the index and show a "dirty" status for all the files after such an operation. Namely, that happens in TortoiseGit for Windows: code.google.com/p/tortoisegit/issues/detail?id=861Malaguena
And thanks for the script. I wish such a script were part of the standard Git installer. Not that I need it personally, but team members see timestamp refreshing as a red "stop" banner in VCS adoption.Malaguena
C
3

I saw some requests for a Windows version, so here it is. Create the following two files:

C:\Program Files\Git\mingw64\share\git-core\templates\hooks\post-checkout

#!C:/Program\ Files/Git/usr/bin/sh.exe
exec powershell.exe -NoProfile -ExecutionPolicy Bypass -File "./$0.ps1"

C:\Program Files\Git\mingw64\share\git-core\templates\hooks\post-checkout.ps1

[string[]]$changes = &git whatchanged --pretty=%at
$mtime = [DateTime]::Now;
[string]$change = $null;
foreach($change in $changes)
{
    if($change.Length -eq 0) { continue; }
    if($change[0] -eq ":")
    {
        $parts = $change.Split("`t");
        $file = $parts[$parts.Length - 1];
        if([System.IO.File]::Exists($file))
        {
            [System.IO.File]::SetLastWriteTimeUtc($file, $mtime);
        }
    }
    else
    {
        #get timestamp
        $mtime = [DateTimeOffset]::FromUnixTimeSeconds([Int64]::Parse($change)).DateTime;
    }
}

This utilizes git whatchanged, so it runs through all the files in one pass instead of calling git for each file.

Carycaryatid answered 23/5, 2019 at 17:30 Comment(0)
P
2

This solution should run pretty quickly. It sets atimes to committer times and mtimes to author times. It uses no modules so should be reasonably portable.

#!/usr/bin/perl

# git-utimes: update file times to last commit on them
# Tom Christiansen <[email protected]>

use v5.10;      # for pipe open on a list
use strict;
use warnings;
use constant DEBUG => !!$ENV{DEBUG};

my @gitlog = ( 
    qw[git log --name-only], 
    qq[--format=format:"%s" %ct %at], 
    @ARGV,
);

open(GITLOG, "-|", @gitlog)             || die "$0: Cannot open pipe from `@gitlog`: $!\n";

our $Oops = 0;
our %Seen;
$/ = ""; 

while (<GITLOG>) {
    next if /^"Merge branch/;

    s/^"(.*)" //                        || die;
    my $msg = $1; 

    s/^(\d+) (\d+)\n//gm                || die;
    my @times = ($1, $2);               # last one, others are merges

    for my $file (split /\R/) {         # I'll kill you if you put vertical whitespace in our paths
        next if $Seen{$file}++;             
        next if !-f $file;              # no longer here

        printf "atime=%s mtime=%s %s -- %s\n", 
                (map { scalar localtime $_ } @times), 
                $file, $msg,
                                        if DEBUG;

        unless (utime @times, $file) {
            print STDERR "$0: Couldn't reset utimes on $file: $!\n";
            $Oops++;
        }   
    }   

}
exit $Oops;
Punic answered 28/8, 2016 at 15:55 Comment(0)
C
2

Here is a Go program:

package main

import (
   "bufio"
   "log"
   "os"
   "os/exec"
   "strconv"
   "time"
)

func check(e error) {
   if e != nil {
      log.Fatal(e)
   }
}

// popen starts a command and returns a line scanner over its standard output.
func popen(name string, arg ...string) (*bufio.Scanner, error) {
   cmd := exec.Command(name, arg...)
   pipe, e := cmd.StdoutPipe()
   if e != nil {
      return nil, e
   }
   return bufio.NewScanner(pipe), cmd.Start()
}

func main() {
   // Collect every tracked file that still needs a timestamp.
   gitLs, e := popen("git", "ls-files")
   check(e)
   files := map[string]bool{}
   for gitLs.Scan() {
      files[gitLs.Text()] = true
   }
   // Walk history newest first: each entry is a commit time followed by paths.
   gitLog, e := popen(
      "git", "log", "-m",
      "--name-only", "--relative", "--pretty=format:%ct", ".",
   )
   check(e)
   for len(files) > 0 {
      if !gitLog.Scan() {
         break // history exhausted: the remaining files were never committed
      }
      sec, e := strconv.ParseInt(gitLog.Text(), 10, 64)
      check(e)
      unix := time.Unix(sec, 0)
      for gitLog.Scan() {
         name := gitLog.Text()
         if name == "" {
            break
         }
         if !files[name] {
            continue
         }
         // Report a failure (e.g. file deleted locally) and keep going.
         if e := os.Chtimes(name, unix, unix); e != nil {
            log.Print(e)
         }
         delete(files, name)
      }
   }
}

It is similar to this answer. It builds up a file list like that answer, but it builds it from git ls-files instead of just looking in the working directory. This solves the problem of excluding .git, and it also solves the problem of untracked files. Also, that answer fails if the last commit of a file was a merge commit, which I solved with git log -m. Like the other answer, it stops once all files are found, so it doesn't have to read all the commits.

For example with git/git, as of this posting it only had to read 182 commits. Also it ignores old files from the history as needed, and it won't touch a file that has already been touched. Finally, it is faster than the other solution. Results with git/git repo:

PS C:\git> Measure-Command { ..\git-touch }
Milliseconds      : 470
Clutch answered 1/7, 2020 at 18:38 Comment(0)
F
1

I have found git utimes in git-extras.

Facility answered 7/1, 2022 at 15:31 Comment(0)
C
1

Old question, but still current. Setting every file to the date of the last commit is one interpretation.

A one-liner will do it, which you can put into a hook script.

find "$(git rev-parse --show-toplevel)" \
  -name .git -prune -o -print0 \
| xargs -0 -L 8 -P 0 \
  touch --date="$(git log -n 1 --format='%cI' HEAD)"

If your desired interpretation is the last time a non-null change to the file was committed, we'll need a helper script to look up the date for each file.

#toucher.sh

for f in "$@"; do
  # date of the last commit that touched this path (empty if never committed)
  date=$(git log -n 1 --format='%cI' -- "$f")
  # skip files that have never been checked in.
  if [ -n "$date" ]; then
    touch --date="$date" "$f"
  fi
done

And call it from the main script:

find "$(git rev-parse --show-toplevel)" -name .git -prune -o -print0 | xargs -0 -L 8 -P 0 bash toucher.sh

Your main script can write this out to a temp file if you want to package it as a single file.

This will, of course, be rather slow for large repositories, especially if they have many files last modified deep in history.

A more performant approach would be to capture this information on checkin, in a pre-commit hook, adding a .filedates file filled with alternating paths and dates. Then a post-checkout hook applies the dates to the files.

The post-checkout hook can verify that the commit includes the .filedates file. If not, the information will be out of date, and it can be disregarded and recreated. This can happen if someone does not set up their hook scripts (or bypasses them). You can minimize that by checking in a server hook for non-empty commits without .filedates in the tree.

But if you have any sort of build system that rebuilds when files change, I urge you not to do this.

Build systems (and git itself) rely on file modification dates changing when files change, and source files being older than build results.

Switching branches can change out new files for old, causing the build system to incorrectly think it is up-to-date. A build system would have to, at a minimum, record the modification dates of every file in the dependency tree, for each build artifact.

This isn't really good enough, because file dates generally only have 1 second resolution. Hashes could detect these, but you'd have to hash every artifact, and compare for every artifact with a matching modification time.

I don't know of any build system that does this. Your build system probably doesn't.

Instead, I recommend using the git-worktree facility, building each branch in its own worktree. This allows file modification dates to work as they were intended (to note when files are modified).

If you must still reset the dates by one of these schemes, at least arrange to clear out your build artifacts on any pull that changes files to an earlier date.

It is amazing how much time you can waste debugging a problem that was caused by your build system assuming it was up-to-date.

Convolution answered 11/4, 2023 at 23:25 Comment(0)
