Getting the list of files sorted by modification date in Perl

I am trying to get the list of files sorted by modification date. I modified the sample program from "Sort Directory and list files based on date and time" and tried to run it.

sub get_sorted_files {
    my $path = shift;
    opendir my($dir), $path or die "can't opendir $path: $!";
    my %hash = map {$_ => (stat($_))[9]}
               map  { "$dir$_" }
               grep { m/.*/i }
               readdir $dir;
    closedir $dir;
    return %hash;
}

my %files = get_sorted_files(".");
foreach my $keys (sort{$files{$a} <=> $files{$b}} keys %files) {
    print "$keys\t", scalar localtime($files{$keys}), "\n";
}

I am running this on my Windows XP 32-bit machine using Strawberry Perl version 5.12.1.0.

The directory listing on Windows is:

[Screenshot: Windows directory listing]

The output is:

[Screenshot: program output]

The output doesn't make much sense to me. What is going wrong with this piece of code and how exactly is the foreach loop sorting the list of files?

Compelling answered 10/1, 2011 at 20:19 Comment(0)
E
4

In get_sorted_files, $dir is a glob, not the directory name. Perhaps you meant $path?

my %hash = map {$_ => (stat($_))[9]}
           map  { "$path/$_" }              # $path, not $dir
           grep { m/.*/i }
           readdir $dir;
Emancipated answered 10/1, 2011 at 20:29 Comment(1)
Thank you mob! My bad couldn't catch it earlier!Compelling
S
8

There are at least four problems with that code. Here's a better version:

use strict;
use warnings; # I bet you weren't using this, because it would have produced a lot of warnings

sub get_sorted_files {
   my $path = shift;
   opendir my($dir), $path or die "can't opendir $path: $!";
   my %hash = map {$_ => (stat($_))[9] || undef} # avoid empty list
           map  { "$path$_" }
           readdir $dir;
   closedir $dir;
   return %hash;
}

my %files = get_sorted_files("./");
foreach my $key (sort{$files{$a} <=> $files{$b}} keys %files) {
   print "$key\t", scalar localtime($files{$key}), "\n";
}

First, you renamed $dir in the original code to $path, but didn't change it in the map line. Your $dir is a directory handle; that's where the GLOB(0x...) is coming from.

Second, all the modification dates read "Wed Dec 31 16:00:00 1969" because you were passing a bad pathname to stat. (stat($_))[9] was returning an empty list (because you were looking for a file like GLOB(0x3f9b38)status.txt instead of the correct pathname) and so the hash actually wound up containing filenames as both keys and values. The first filename was a key, the second was its value, the third was the next key, and so on. localtime was converting the filename to a number (yielding 0), and then converting epoch time 0 (1-Jan-1970 0:00:00 UTC) to your timezone.

Third, it expects $path to end with a directory separator, and you were passing ".". You'd need to pass "./", or better yet, fix it so the function appends a separator if needed.

Fourth, the grep no longer did anything and should be removed. (In the original code, it selected only certain filenames, but you'd changed the pattern to match anything.)
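
The second point can be reproduced in isolation. The sketch below is only an illustration: the `no-such-dir/...` names are made up and assumed not to exist on disk, so every `stat` call fails and the list slice yields an empty list.

```perl
use strict;
use warnings;

# Hypothetical names, assumed not to exist, so stat() fails for each one.
my @names = map { "no-such-dir/$_" } qw(a.txt b.txt c.txt d.txt);

# A failed stat contributes nothing to the list, so map returns only
# the four names instead of four name/mtime pairs.
my @flat = map { $_ => (stat($_))[9] } @names;

# Assigning that list to a hash pairs the names up as key => value.
my %broken = @flat;

# With "|| undef", every name keeps its own slot (with an undef value).
my %fixed = map { $_ => (stat($_))[9] || undef } @names;

printf "flat list: %d elements\n", scalar @flat;      # 4
printf "broken hash: %d keys\n", scalar keys %broken; # 2
printf "fixed hash: %d keys\n", scalar keys %fixed;   # 4
```

Half of the names end up as values of the other half, which is why `localtime` was being handed filenames instead of timestamps.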

As for how it sorts filenames: get_sorted_files returns a list of pathnames and modification times, which you store into the %files hash. keys %files returns the list of keys (the filenames) and sorts them by a numeric comparison of the associated value (the modification time).
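
As a rough sketch of the "append a separator if needed" suggestion, the path can be built with the core `File::Spec` module instead of string concatenation. The name `get_file_mtimes` is hypothetical, and skipping `.`, `..` and entries whose `stat` fails is an assumption of this sketch, not something the original code did.

```perl
use strict;
use warnings;
use File::Spec;

# Hypothetical variant of get_sorted_files: returns path => mtime pairs
# and works whether or not $path ends with a directory separator.
sub get_file_mtimes {
    my $path = shift;
    opendir my $dh, $path or die "can't opendir $path: $!";
    my %mtime;
    for my $name (readdir $dh) {
        next if $name eq '.' || $name eq '..';   # assumption: skip dot entries
        my $full  = File::Spec->catfile($path, $name);
        my $mtime = (stat $full)[9];
        next unless defined $mtime;              # stat failed; skip the entry
        $mtime{$full} = $mtime;
    }
    closedir $dh;
    return %mtime;
}

my %files = get_file_mtimes(".");
foreach my $key (sort { $files{$a} <=> $files{$b} } keys %files) {
    print "$key\t", scalar localtime($files{$key}), "\n";
}
```

Because failed lookups are skipped rather than stored as `undef`, the numeric sort never sees an undefined value.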

Shenika answered 10/1, 2011 at 21:3 Comment(1)
Thanks a bunch cjm! Bad on my part not to catch that! Learnt quite a bit from your answer. Thanks again.Compelling
O
5

Use Perl's sort function. It's faster and you'll get what you want without the hash.

For example, to sort by file size, and then by age (oldest first) among files of equal size:

@s = sort {-s $a <=> -s $b || -M $b <=> -M $a} @a;

With that in mind, the function can be written like this:

sub get_sorted_files {
   my $path = shift;
   opendir my($dirh), $path or die "can't opendir $path: $!";
   my @flist = sort {  -M $a <=> -M $b } # Sort by age, newest first
               map  { "$path/$_" } # We need full paths for sorting
               readdir $dirh;
   closedir $dirh;
   return @flist;
}
Optimum answered 22/12, 2012 at 3:3 Comment(1)
This won't be that fast: perlmonks.org/?node_id=393128Waterborne
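
One common way to address that concern is to look up each file's age once and sort on the cached value, a pattern usually called the Schwartzian transform. The following is a minimal sketch under that assumption; the function name `get_files_newest_first` is hypothetical, and dropping entries whose age cannot be read is a choice made here.

```perl
use strict;
use warnings;
use File::Spec;

# Hypothetical helper: full paths in $path, newest first, with -M
# evaluated once per file instead of once per comparison.
sub get_files_newest_first {
    my $path = shift;
    opendir my $dh, $path or die "can't opendir $path: $!";
    my @names = readdir $dh;
    closedir $dh;

    return map  { $_->[0] }                  # 4. unwrap the path
           sort { $a->[1] <=> $b->[1] }      # 3. smallest age (newest) first
           grep { defined $_->[1] }          # 2. drop entries -M could not read
           map  { [ $_, -M $_ ] }            # 1. pair each path with its age
           map  { File::Spec->catfile($path, $_) }
           @names;
}

print "$_\n" for get_files_newest_first(".");
```

The result still includes `.` and `..`, as in the function above; add a `grep { -f }` if only plain files are wanted.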
L
1

For really large directories, you may find that Perl is significantly slower than using the native tools to do the sorting. For instance, on my machine, on an enormous (341k files) directory, this takes about 1.5 minutes:

my $mostrecent = `/bin/ls --full-time -lta $dir | head -1 2>/dev/null`;

But the code in the solution above (using opendir and sort -M) takes 30 to 45 seconds longer. The ls approach is not only significantly faster, it also avoids having Perl hold the whole file list in memory, which can be a win on its own.

Note that the above is on a fairly high-end Linux blade system, so YMMV per computer/OS...
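
If only the most recent file is needed and you would rather stay in Perl, the memory concern can also be addressed by not sorting at all. This is a sketch, not a benchmarked replacement for the `ls` command above: `most_recent_file` is a hypothetical name, and it assumes only plain files should be considered (unlike `ls -a`, which also lists directories).

```perl
use strict;
use warnings;
use File::Spec;

# Hypothetical helper: one pass over the directory, remembering only the
# newest plain file seen so far. Returns undef if there are no plain files.
sub most_recent_file {
    my $path = shift;
    opendir my $dh, $path or die "can't opendir $path: $!";
    my ($newest, $newest_mtime);
    while (defined(my $name = readdir $dh)) {   # one entry at a time
        my $full = File::Spec->catfile($path, $name);
        next unless -f $full;                   # assumption: plain files only
        my $mtime = (stat _)[9];                # reuse the stat done by -f
        if (!defined $newest_mtime || $mtime > $newest_mtime) {
            ($newest, $newest_mtime) = ($full, $mtime);
        }
    }
    closedir $dh;
    return $newest;
}

my $newest = most_recent_file(".");
print defined $newest ? "$newest\n" : "no plain files found\n";
```

Calling `readdir` in scalar context inside the `while` loop avoids building the full list of names.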

Lasseter answered 14/10, 2013 at 16:45 Comment(0)
H
0

With a one-liner:

perl -E '
    say join "\n",
    sort { -M $a <=> -M $b }
    grep -f, @ARGV
' -- *
Huneycutt answered 20/2, 2023 at 18:48 Comment(2)
This is going to be pretty slow when you get over about 50 files: perlmonks.org/?node_id=393128Waterborne
I don't agree: hastebin.com/share/vonexagedi.swift On 10000 files I get real 0m0,357s. I agree that a Schwartzian transform should be better, but the conciseness of this command has its place, IMHO.Impedimenta
