Get large list of files, sorted by file time in *milliseconds*

I know my file system is storing the file modification time in milliseconds but I don't know of a way to access that information via PHP. When I do an ls --full-time I see this:

-rw-r--r-- 1 nobody nobody 900 2012-06-29 14:08:37.047666435 -0700 file1
-rw-r--r-- 1 nobody nobody 900 2012-06-29 14:08:37.163667038 -0700 file2

I'm assuming that the numbers after the dot are the milliseconds.

So I realize I could just use ls and have it sort by modification time, like this:

$filelist = `ls -t`;

However, the directory sometimes has a massive number of files and I've noticed that ls can be pretty slow in those circumstances.

So instead, I've been using find but it doesn't have a switch for sorting results by modification time. Here's an example of what I'm doing now:

$filelist = `find $dir -type f -printf "%T@ %p\n" | sort -n | cut -d' ' -f2-`;

And, of course, this doesn't sort down to the milliseconds so files that were created in the same second are sometimes listed in the wrong order.

Holcombe answered 29/6, 2012 at 23:28 Comment(6)
Why does sort -n not sort to the milliseconds? It seems to me that two timestamps such as 1341013506.3000000000 and 1341013506.6000000000 would still be sorted numerically, no? I can't replicate this because my machine does not store ms file times :) - Cupo
@mattedgod: check your /sys/ directory; on my system, it supports nanosecond times. - Rorie
@Rorie Hmm, that is strange: the sys directory is in nanoseconds but most of the document directories are not. Interesting... - Cupo
@mattedgod /sys is a virtual filesystem of type sysfs, not a real file system. Execute the mount command to see what filesystems your real mounts are using; in many cases it will be ext3, which doesn't have ns timestamps. - Miscount
@mattedgod: it's all about your filesystem types: ext4 supports nanosecond resolution, ext2 and ext3 do not. /sys is type sysfs, which also supports nanosecond resolution. - Rorie
Ah, interesting. Well, when I run the find command from the OP in those sys folders it sorts properly, even by nanoseconds. - Cupo
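
The point raised in these comments can be checked without a nanosecond-capable filesystem. The following is only a sketch: the function name sort_by_mtime and the sample records are made up for illustration, and it assumes GNU sort. It feeds hand-written "timestamp<TAB>path" records, in the same shape that find's %T@ and %p directives produce, through sort -n. LC_ALL=C is set so that the dot is always treated as the decimal point, which is not the case in every locale.

```shell
# Read "timestamp<TAB>path" records on stdin, order them numerically by
# timestamp (fractional digits included), and print only the paths.
sort_by_mtime() {
    LC_ALL=C sort -n | cut -f2-
}

# Hypothetical sample records: two share the same second and differ only
# in the fractional part. One path contains a space on purpose.
printf '%s\t%s\n' \
    1341013506.6000000000 'file b' \
    1341013506.3000000000 'file a' \
    1341013505.9000000000 'file c' | sort_by_mtime
```

With GNU find, the same function could be fed with find "$dir" -type f -printf '%T@\t%p\n' | sort_by_mtime. Using a tab as the separator keeps file names with spaces intact, although names containing tabs or newlines would still break it.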

Only a few filesystems (like ext4) actually store these times with nanosecond precision. It's not guaranteed to be available: on other filesystems (like ext3) you'll notice that the fractional part is always .000000000.
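
One rough way to see which case applies to a given directory is to look at the fractional part of the timestamps that find reports. This is just a sketch under a few assumptions: GNU find (for -maxdepth and -printf), and two helper names, has_fraction and dir_has_subsecond_mtimes, invented here for illustration.

```shell
# Succeeds if a "%T@"-style timestamp (seconds.fraction) has a non-zero
# fractional part; fails for an all-zero fraction or no fraction at all.
has_fraction() {
    case "$1" in
        *.*[1-9]*) return 0 ;;
        *) return 1 ;;
    esac
}

# Succeeds if at least one regular file directly inside the given
# directory has a sub-second modification time.
dir_has_subsecond_mtimes() {
    find "$1" -maxdepth 1 -type f -printf '%T@\n' | while read -r ts; do
        if has_fraction "$ts"; then
            echo yes
            break
        fi
    done | grep -q yes
}
```

For example, dir_has_subsecond_mtimes /var/tmp && echo "sub-second mtimes present". A negative result is not proof that the filesystem lacks the feature, since files written by tools that set whole-second timestamps will also show a zero fraction.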

Now, if this feature is really important to you, you could write a specialized PHP extension. This would bypass the calls to external utilities and should be a great deal faster. The process of creating an extension is well explained in many places. A reasonable approach would be an alternative fstat implementation that exposes the high-precision fields now available in the stat structure defined in /usr/include/bits/stat.h.

As usual, nothing is free. This extension will have to be maintained, it probably can't be run in hosted environments, and so on. Plus, your PHP solution will only run on servers where the extension is deployed (although that can be circumvented by falling back on the ls-based technique if the extension is not detected).

Miscount answered 29/6, 2012 at 23:42 Comment(1)
I had a feeling there wasn't going to be a simple solution to this. Thanks for sharing your thoughts. :-) - Holcombe
