Git clone changes file modification time
S

11

73

When I clone a Git repository with the "git clone ..." command, every file in my local repository gets the same modification time: the date and time at which the git clone command was issued.

Is there a way to clone a remote Git repository with the actual modification time for each file?

Segovia answered 12/2, 2014 at 17:39 Comment(6)
You can get the time of the last modification from git log -n1 -- file; that is what git is for. - Florri
I do not quite understand the statement "this is what git is for". Why is the modification time not saved, as it is in CVS? - Segovia
@Amadan: you only get the last commit time, not the last time the file was modified. - Bram
@turnt It's not an issue... programs can change the modification times of files they create, so it's the program's choice. - Langouste
Candidates for the canonical question: What's the equivalent of Subversion's "use-commit-times" for Git? (2009) and Checking out old files WITH original create/modified timestamps (2010). Mercurial has the Timestamp extension (though that does not help much). - Sizzler
Does this answer your question? Checking out old files WITH original create/modified timestamps - Illeetvilaine
K
45

For every file in a Git repository, you can retrieve the time of the last commit that changed it and treat that as its last modification date. See How to retrieve the last modification date of all files in a Git repository.

Then use the touch command to change the modification date:

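# Set each tracked file's mtime to the author date of the last commit that touched it.
# Requires GNU date (-d @epoch); file names containing newlines are not handled.
git ls-tree -r --name-only HEAD | while IFS= read -r filename; do
  unixtime=$(git log -1 --format="%at" -- "$filename")
  touchtime=$(date -d "@$unixtime" +'%Y%m%d%H%M.%S')
  touch -t "$touchtime" -- "$filename"
done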

Also see my gist here.
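
For readers who want to see the date conversion step in isolation, here is a minimal Python sketch of what the date call in the loop above does: it turns the epoch seconds printed by %at into the stamp that touch -t expects. The function name touch_stamp and the sample value are made up for illustration.

```python
from datetime import datetime, timezone
from typing import Optional


def touch_stamp(epoch: int, tz: Optional[timezone] = None) -> str:
    """Format epoch seconds as the CCYYMMDDhhmm.SS stamp accepted by `touch -t`.

    With tz=None the local time zone is used, which is also what touch assumes
    when it reads the stamp back.
    """
    return datetime.fromtimestamp(epoch, tz).strftime("%Y%m%d%H%M.%S")


# Illustrative value only; the local rendering varies from machine to machine.
print(touch_stamp(1000000000))
print(touch_stamp(1000000000, timezone.utc))
```

Because the stamp is read back in local time, the conversion and the touch call have to run in the same time zone.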

Kristine answered 10/4, 2019 at 10:10 Comment(7)
Brilliant! Works like a charm. For us, this was critical so we could speed up our Makefile-based builds. - Adamantine
This is the answer. One change: to handle filenames with spaces, you should add quotes around $filename. - Raggletaggle
@P.T. I see quotes have been added now. - Concrete
How is this the answer? You say "You can retrieve the last modification date of all files in a git repository. (last commit time)" <--- That's the date and time of the commit. That's not the date and time for the individual file. That's not a "last modification date" for the file. That (as far as I can tell) is the date/time of the commit that added the file. So if a bunch of files were all added in that commit, your script would give them all the same date/time even if they were written hours, days, months or years apart. - Concrete
So if somebody makes a first commit of a directory of little Ruby scripts written at various times, makes some changes, commits again, pushes to a repo, and then clones it on another computer, everything has the same date. If they then run your script, they only get one bunch of files stamped with one date/time and another bunch stamped with another date/time, and that's it. Those are just the commit dates, not the files' last modification dates/times at all. - Concrete
@Concrete git history is not always the history of actual edits, it can be modified by amending and rebase -i. It's a logical history the author chose to present, and in it a commit is a logically atomic, simultaneous change to a set of files. If you want to record "file1 changed before file2", then there must exist a point between those - these must be separate commits. - Superdominant
Great command! This demonstrates another lack of intuitiveness and simplicity in Git. This should be the default behavior when you clone a repository. - Broadspectrum
C
43

Git does not record a modification timestamp for each file, since it is a distributed VCS (meaning the time on your computer can be different from mine: there is no "central" notion of time and date).

The official argument for not recording that metadata is explained in this answer.

But you can find scripts which will attempt to restore a meaningful date, like this one (or a simpler version of the same idea).
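
To make the first point concrete: a tree entry, as printed by git ls-tree, carries only a mode, an object type, an object hash and a path. The following Python sketch splits one such line; the names TreeEntry and parse_ls_tree_line and the sample line are illustrative, not part of Git.

```python
from typing import NamedTuple


class TreeEntry(NamedTuple):
    mode: str   # e.g. "100644" for a regular file
    kind: str   # "blob", "tree" or "commit" (a submodule)
    sha: str    # object hash
    path: str   # path relative to the repository root


def parse_ls_tree_line(line: str) -> TreeEntry:
    """Split one line of default `git ls-tree` output: <mode> <type> <hash><TAB><path>."""
    meta, path = line.rstrip("\n").split("\t", 1)
    mode, kind, sha = meta.split(" ")
    return TreeEntry(mode, kind, sha, path)


# Illustrative line; the hash is the well-known hash of the empty blob.
entry = parse_ls_tree_line(
    "100644 blob e69de29bb2d1d6434b8b29ae775ad8c2e48c5391\tREADME.md"
)
print(entry._fields)  # none of the fields holds a per-file timestamp
```

Dates live on commits (author date and committer date), which is why the restore scripts have to go through git log.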

Corazoncorban answered 12/2, 2014 at 18:4 Comment(13)
But it could save the local time on the remote end. How do I solve the problem with builds when files are compiled based on their modification time? - Segovia
@Segovia "it could save the local time on the remote end": that is what the metastore.git approach does, as mentioned in https://mcmap.net/q/12716/-what-39-s-the-equivalent-of-subversion-39-s-quot-use-commit-times-quot-for-git: it stores metadata (not only timestamps) in the repo when committing (via a pre-commit hook), and re-applies it when pulling (also via hooks). - Corazoncorban
OK, thanks for the reply and suggested solutions. I saw the discussion, but the arguments for not saving modification times because git is a version control system do not look strong to me. I used CVS for years; it has this feature and it does no harm. Indeed, a simple ls -ltr command shows you the order of modified files checked out from a CVS repository. - Segovia
Actually, none of the solutions I saw before guarantee that the actual file is in the remote repository. Git log shows commits, but some commits could still be local. - Segovia
The fact that builds rely on the modification time of files is actually a reason to not store this as part of the metadata. If the mtime is updated to the time of commit you'd have to start a clean build after checking out an older commit, since files would be considered older than the corresponding derived files and won't cause them to be rebuilt. (Assuming a build system that relies on the mtime, obviously.) - Schinica
I am not talking particularly about builds. Look at my comment "... Indeed simple ls -ltr command shows you the order of modified files checked out from CVS repository". - Segovia
And why would you need to rebuild files that were not modified? - Segovia
It could save UTC time, couldn't it? - Leveloff
@Leveloff UTC or not, there is no guarantee that the time you are saving is the "right" one. - Corazoncorban
@VonC, you never have any guarantee that your time is right. That's not a reason to abandon using time, though. - Leveloff
@Leveloff I agree with you, and I have seen in the past some "workaround/hacks" to record it anyway: you can see various scripts in https://mcmap.net/q/12716/-what-39-s-the-equivalent-of-subversion-39-s-quot-use-commit-times-quot-for-git/6309. - Corazoncorban
This solution (as pointed out in the answer) works really well. Tested on Ubuntu 20.04 just now. :) - Yonkers
Git does not support the feature, but all UIs for Git have it. For example, GitHub has a column on the right with the commit time, like "last month", "3 days ago", etc. GitHub and GitLab are distributed, so how is this possible? And because all frontends show this, the feature is clearly important to people. The only valid argument is that if people are working in parallel, then updating the modification time is not safe for make-like tools. This argument is not valid at all when the whole folder is updated, as in clone and switch. In these cases commit times are safe for make. - Balsaminaceous
P
15

Another option for resetting the mtime is git-restore-mtime.

sudo apt install git-restore-mtime # Debian/Ubuntu example
git clone <myurl>
cd <mydir>
git restore-mtime
Patti answered 14/10, 2020 at 23:1 Comment(0)
F
9

This Linux one-liner fixes the problem for all files (files only, not folders), including files with spaces in their names:

git ls-files -z | xargs -0 -I{} -- git log -1 --format="%ai {}" -- {} | perl -ne 'chomp;next if(/'"'"'/);($d,$f)=(/(^\d\d\d\d-\d\d-\d\d \d\d:\d\d:\d\d(?: [+-]\d\d\d\d|)) (.*)/);print "d=$d f=$f\n"; `touch -d "$d" '"'"'$f'"'"'`;'
Fiddlefaddle answered 5/8, 2019 at 3:54 Comment(2)
Very, very nice. Cool!!!! As far as I can see, some files may unfortunately not be converted by this solution yet, e.g.: 1. files containing the character ' (39 0027 ' APOSTROPHE), 2. files in the root directory of the repository, 3. files containing a ( (or possibly a ) ). Maybe you could find the time to look at these specific cases, too? - Colorless
That's retrieving from the git repo, so it would only give commit dates, not the actual last modification date/time of the file. - Concrete
B
6

A shorter variant of user11882487's answer that I find easier to understand:

git ls-files | xargs -I{} git log -1 --date=format:%Y%m%d%H%M.%S --format='touch -t %ad "{}"' "{}" | $SHELL
Bicuspid answered 2/4, 2020 at 4:38 Comment(1)
Works without error, thank you! - Maceio
C
3

Adding to the list of one-liners ...

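# Requires bash (read -d '') and GNU touch; copes with spaces in file names.
git ls-files -z | while IFS= read -r -d '' f ; do touch -d "$(git log -1 --format='%aI' -- "$f")" -- "$f" ; done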
Complaisant answered 21/1, 2021 at 8:29 Comment(0)
S
1

This applies to solutions in multiple previous answers:

Use the %at format (Unix epoch seconds), and then touch -d "@$epochdelta" (GNU touch), to avoid date-time conversion issues.
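
A small Python sketch of why epoch seconds are the safer interchange format: %ai can print the same instant with different UTC offsets, but the instant has only one %at value, and os.utime takes that value directly. The helper names and the sample dates are made up for illustration.

```python
import os
from datetime import datetime


def ai_to_epoch(stamp: str) -> int:
    """Convert git's %ai output ("YYYY-MM-DD HH:MM:SS +ZZZZ") to Unix epoch seconds."""
    return int(datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S %z").timestamp())


def set_mtime(path: str, epoch: int) -> None:
    """Set atime and mtime from epoch seconds; no time zone or locale is involved."""
    os.utime(path, (epoch, epoch))


# Two renderings of one instant, as commits authored in different zones show it.
print(ai_to_epoch("2021-03-16 16:42:00 +0000"))
print(ai_to_epoch("2021-03-16 17:42:00 +0100"))
```

Both calls print the same number, so a script that passes %at straight to touch or os.utime never has to parse an offset at all.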

Swayne answered 16/3, 2021 at 16:42 Comment(0)
A
1

Running log -1 once per file irks me so I wrote this to do them all in one pass:

( # don't alter any modified-file stamps:
  git diff --name-status --no-find-copies --no-renames | awk '$1="D"' FS=$'\t' OFS=$'\t'
  git log --pretty=%cI --first-parent --name-status -m --no-find-copies --no-renames
) | awk ' NF==1 { date=$1 }
          NF<2 || seen[$2]++ { next }
          $1!="D" { print "touch -d",date,$2 }' FS=$'\t'

which does the linux history in like ten seconds (piping all the touch commands through a shell takes a minute).

This is a good way to ruin e.g. bisecting, and I'm in the camp of ~don't even start down the road of trying to overload filesystem timestamps, the people who insist on doing this are apparently going to have to learn the hard way~, but I can see that maybe there's workflows where this really won't hurt you.

Whatever. But, for sure, do not do this blindly.
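
For readers less fluent in awk, the one-pass idea can be sketched in Python as follows. This is only an approximation of the script above: it covers the second awk stage (first sighting of a path wins, deletions are skipped) and leaves out the step that protects locally modified files. The function name is made up, and it assumes paths that git prints unquoted (no tabs, newlines or other special characters).

```python
from typing import Dict, Optional, Set


def newest_commit_dates(log_text: str) -> Dict[str, str]:
    """Map each path to the date of the newest commit that touched it.

    Expects the text printed by
    `git log --pretty=%cI --name-status --no-renames` (newest commit first):
    one date line per commit, followed by <status><TAB><path> lines.
    """
    dates: Dict[str, str] = {}
    seen: Set[str] = set()
    current: Optional[str] = None
    for line in log_text.splitlines():
        if not line:
            continue                  # blank separator line
        if "\t" not in line:
            current = line            # a commit date line
            continue
        status, path = line.split("\t", 1)
        if path in seen:
            continue                  # a newer commit already decided this path
        seen.add(path)
        if status != "D" and current is not None:
            dates[path] = current     # deleted files have nothing to touch
    return dates


# Hand-written sample in the expected layout, not real repository output.
sample = (
    "2021-03-16T18:00:00+01:00\n\n"
    "M\tsrc/main.c\n"
    "D\told.txt\n"
    "2021-03-01T09:30:00+01:00\n\n"
    "A\tsrc/main.c\n"
    "A\told.txt\n"
    "A\tREADME\n"
)
for path, date in newest_commit_dates(sample).items():
    print(date, path)
```

The returned dates could then be applied with os.utime after parsing them, instead of piping touch commands through a shell.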

Amr answered 16/3, 2021 at 18:0 Comment(2)
Can you make your answer more self-contained? E.g., does it follow/operate on the output of git ls-files (and instead of xargs)? What answer(s) does "Running log -1 once per file" refer to (four answers have "log -1")? (Use a link to the answer as user names may change at any time.) - Sizzler
@PeterMortensen It prints touch commands as-is, it doesn't need anything added. Pipe them through a shell, which I think the mention of "piping all the touch commands through a shell" suggested explicitly. Any answer that runs log -1 necessarily runs it once per file, my objection is to the method. - Amr
V
1

To do this in Python is simpler than some of these other options, as os.utime accepts the Unix timestamp output by the git log command. This example uses GitPython but it'd also work with subprocess.run to call git log.

import git
from os import utime
from pathlib import Path

repo_path = "my_repo"
repo = git.Repo(repo_path)

for n in repo.tree().list_traverse():
    filepath = Path(repo.working_dir) / n.path
    unixtime = repo.git.log(
        "-1", "--format='%at'", "--", n.path
    ).strip("'")
    if not unixtime.isnumeric():
        raise ValueError(
            f"git log gave non-numeric timestamp {unixtime} for {n.path}"
        )
    utime(filepath, times=(int(unixtime), int(unixtime)))

This matches the results of the git restore-mtime command in this answer and the script in the highest rated answer.

If you're doing this immediately after cloning, then you can reuse the to_path parameter passed to git.Repo.clone_from instead of accessing the working_dir attribute on the Repo object.
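
The subprocess.run variant mentioned above could look roughly like this. It is a sketch rather than a drop-in replacement: the helper names are invented here, it assumes git is on PATH and that file names decode as UTF-8, and it skips files that are staged but not yet committed or that are missing from the working tree.

```python
import os
import subprocess
from pathlib import Path
from typing import List, Optional


def parse_epoch(output: str) -> Optional[int]:
    """Parse the output of `git log -1 --format=%at`; empty means no commit yet."""
    text = output.strip()
    return int(text) if text else None


def git(repo: Path, *args: str) -> str:
    """Run git inside repo without a shell, so file names need no quoting."""
    result = subprocess.run(
        ["git", "-C", str(repo), *args],
        check=True, capture_output=True, text=True,
    )
    return result.stdout


def tracked_files(repo: Path) -> List[str]:
    """Tracked paths, NUL-separated so names with spaces or newlines survive."""
    return [p for p in git(repo, "ls-files", "-z").split("\0") if p]


def restore_mtimes(repo: Path) -> None:
    """Set each tracked file's mtime to the author time of its last commit."""
    for rel_path in tracked_files(repo):
        epoch = parse_epoch(git(repo, "log", "-1", "--format=%at", "--", rel_path))
        target = repo / rel_path
        if epoch is not None and target.exists():
            os.utime(target, (epoch, epoch))
```

Calling restore_mtimes(Path("my_repo")) would process the same placeholder repository as the GitPython example.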

Veranda answered 2/11, 2021 at 14:47 Comment(0)
S
1

Most of the solutions given so far are unreliable, and some introduce arbitrary command injection vulnerabilities, because they call read without IFS= and/or without -r, assume file names don't contain newline characters, forget to quote parameter expansions or command substitutions, forget the -- option delimiter, don't check exit status, or embed the file names in shell code or --format arguments.

This is just a safer variant of the approach given in most answers. Assumes a GNU system:

git ls-tree -zr --name-only HEAD |
  xargs -n20 -r0P10 sh -xc '
    ret=0
    for file do
      d=$(git log -1 --format="@%at" -- "$file") &&
        touch -d "$d" -- "$file" || ret=$?
    done
    exit "$ret"' sh

This also runs several jobs in parallel, as the task is mostly CPU-bound.
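
To illustrate the "embed the file names in shell code" hazard from the first paragraph: when generated touch commands must go through a shell anyway, each name has to be quoted for that shell. A minimal Python sketch using the standard shlex.quote; the function name and the hostile file name are invented for the example.

```python
import shlex


def touch_command(epoch: int, path: str) -> str:
    """Build one `touch` command line that is safe to hand to a POSIX shell."""
    # shlex.quote wraps the name in single quotes and escapes embedded ones,
    # so the shell sees it as a single literal word.
    return f"touch -d @{int(epoch)} -- {shlex.quote(path)}"


# A legal file name that would run `echo pwned` if it were pasted between
# double quotes into generated shell code.
hostile = 'x"; echo pwned; ".txt'
print(touch_command(1703528160, hostile))
```

Passing the names as separate arguments, as the sh -c '...' sh construct above does, avoids the problem without any quoting step.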

To do it again after a git pull to only update the recently touched files (here based on ctime and assuming GNU find 4.9 or newer), you can insert:

find -files0-from - -prune -cmin -5 -print0

as a pipeline component between git ls-tree and xargs, so that only files whose ctime changed in the last 5 minutes are processed.
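
The same ctime filter could be sketched in Python for use with the Python-based approaches. The function name is hypothetical, the cutoff is only an approximation of find's -cmin rounding, and on Windows st_ctime means creation time, so this assumes a Unix-like system.

```python
import os
import time
from typing import Iterable, List


def changed_within(paths: Iterable[str], minutes: float) -> List[str]:
    """Keep the paths whose inode change time (ctime) is newer than `minutes` ago."""
    cutoff = time.time() - minutes * 60
    # lstat so that a symlink is judged by its own ctime, not its target's
    return [p for p in paths if os.lstat(p).st_ctime > cutoff]
```

For example, changed_within(file_list, 5) keeps the entries of a previously collected file_list that git pull rewrote in the last five minutes.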

Superlative answered 25/12, 2023 at 18:16 Comment(0)
D
0

To get the list of files with their modification dates on Windows, you can use the following command (works in PowerShell):

git ls-tree -r --name-only HEAD | ForEach-Object { "$(git log -1 --format="%ai" -- "$_")`t$_" } | sort
Discourteous answered 22/11, 2023 at 11:20 Comment(0)
