Perl: Using Loop or Map/Grep?

I'm writing a program to step through a directory tree (Yes, I know about File::Find, but I'm writing a replacement).

In my program, I'm doing a readdir on a whole directory and placing it in a list. I need to do two things:

  1. Remove . and .. from the list
  2. Prepend the current directory name on each file.

I can do this with a loop, or I can use map and grep:

# Map and Grep

my @dir_stack = readdir $dir_fh;;
@dir_stack = grep { !/^\.{1,2}$/ } @dir_stack;
@dir_stack = reverse map { "$cwd/$_" } @dir_stack;
push @stack, @dir_stack;

# Read Loop

opendir $dir_fh, $cwd;
my @dir_stack;
foreach my $file (readdir $dir_fh) {
    next if $file =~ /^\.{1,2}$/;   #Skip "." and ".."
    unshift @dir_stack, "$cwd/$file";
}
push @stack, @dir_stack;

What about combining grep and map?

 opendir $dir_fh, $cwd;
 my @dir_stack = readdir $dir_fh;;
 @dir_stack = grep { !/^\.{1,2}$/ && {$_ = "$cwd/$_"} } @dir_stack;
 push @stack, reverse @dir_stack;

I want my code to be readable next week when I look at it and try to figure out what's going on. I also need my code to be efficient.

Telluric answered 12/12, 2011 at 19:44 Comment(3)
Is there a subtle reason for your double semicolons after readdir? - Invention
@TimN - Nope. I didn't catch it in my code. - Telluric
Another one? Have mercy, for CPAN’s sake. - Sonorous

Modifying $_ in grep? yuck! And what's with using an anon hash constructor?

@dir_stack = grep { !/^\.{1,2}$/ && {$_ = "$cwd/$_"} } @dir_stack;

should be

@dir_stack = map { /^\.\.?\z/ ? () : "$cwd/$_" } @dir_stack;

But I personally find using both map and grep more readable than combining them.

push @stack,
   reverse
    map "$cwd/$_",
     grep !/^\.\.?\z/,
      readdir $dh;

The need for reverse is rather odd, and it's much more visible here than when it's hidden in an unshift, so that's another bonus.
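For anyone who wants to try the chained version, here is a minimal self-contained sketch. The sub name list_entries and the scratch directory are my own illustrative assumptions, not part of the original code, and the reverse is left out since it only matters for the asker's stack.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical helper: full paths of everything in $cwd except "." and "..".
sub list_entries {
    my ($cwd) = @_;
    opendir my $dh, $cwd or die "Cannot open $cwd: $!";
    my @entries =
        map  { "$cwd/$_" }      # prepend the directory name
        grep { !/^\.\.?\z/ }    # drop "." and ".." only
        readdir $dh;
    closedir $dh;
    return @entries;
}

# Demo in a scratch directory so the sketch runs anywhere.
my $tmp = tempdir(CLEANUP => 1);
for my $name (qw(a.txt b.txt .hidden)) {
    open my $fh, '>', "$tmp/$name" or die "Cannot create $tmp/$name: $!";
    close $fh;
}

# readdir order is not guaranteed, so sort before printing.
my @found = sort { $a cmp $b } list_entries($tmp);
print "$_\n" for @found;
```

Note that .hidden is kept: the pattern only rejects the two special entries.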

Amethyst answered 12/12, 2011 at 19:54 Comment(5)
+1: If not combining map and grep leads to performance problems, the directory is way, way too big. - Invention
I combined map and grep in a single grep for performance reasons. I've noticed that map and grep aren't that much more efficient than a loop. If I used them both, wouldn't a loop be more efficient? Reverse/unshift isn't needed, but I want to keep the files in the same order as find gives them. Is combining all of these commands into a single super command more efficient than doing it line-by-line? Can they operate in parallel, or does grep have to complete and return a list before map can do anything? If it's the latter, I'd prefer to list each line. - Telluric
@David W., readdir does return the same order as find. You're reversing it twice: Once using reverse and then once using a stack. To preserve order, drop the reverse, and use a queue instead of a stack (push+shift instead of push+pop). - Amethyst
@David W., To paraphrase your question, "does removing three array assignments make it more efficient than doing it line-by-line", and the answer is yes. Not only is it more readable, not doing three assignments is faster than doing three assignments. - Amethyst
@David W., What makes you think that combining into that grep is faster than combining into that map? - Amethyst
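The stack-versus-queue point in the comments above can be seen with a toy sketch. It uses plain strings in place of real directory entries, and all variable names are illustrative assumptions.

```perl
use strict;
use warnings;

my @entries = qw(a b c);   # stand-in for names in readdir order

# Stack (push + pop): items come back in reverse order,
# which is why the question needed the extra reverse.
my @stack = @entries;
my @via_stack;
while (@stack) { push @via_stack, pop @stack; }

# Queue (push + shift): items come back in their original order.
my @queue = @entries;
my @via_queue;
while (@queue) { push @via_queue, shift @queue; }

print "stack: @via_stack\n";   # stack: c b a
print "queue: @via_queue\n";   # queue: a b c
```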

To make your code more readable, you just need to include one more line:

# exclude '.' and '..', and prepend dir name to each elem in @dir_stack

:-)

Drayman answered 12/12, 2011 at 19:49 Comment(1)
Comments shouldn't repeat the code, but state the goal. "# Full path names of directory contents." is a much better comment. - Amethyst

Sounds like you may want glob instead, though I believe it will exclude every file beginning with . (i.e. hidden files), not just . and .. themselves. And of course, you can't have spaces in the path.

my @stack = glob "$cwd/*";

It returns paths that include whatever directory prefix you feed it.
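If the space limitation matters, one option is bsd_glob from the core File::Glob module, which treats the whole string as a single pattern. The sketch below is only an illustration under my own assumptions (the scratch directory and file names are made up); it also shows one way to pick up dot-files by adding a second pattern and filtering out . and .. afterwards.

```perl
use strict;
use warnings;
use File::Glob qw(bsd_glob);
use File::Temp qw(tempdir);

# Scratch directory whose path contains a space.
my $tmp = tempdir(CLEANUP => 1);
my $dir = "$tmp/my dir";
mkdir $dir or die "Cannot create $dir: $!";
for my $name (qw(a.txt .hidden)) {
    open my $fh, '>', "$dir/$name" or die "Cannot create $dir/$name: $!";
    close $fh;
}

# bsd_glob does not split the pattern on whitespace.
my @visible = bsd_glob("$dir/*");

# A second pattern for dot-files; "." and ".." are filtered out afterwards.
my @all = grep { !m{/\.\.?\z} } (bsd_glob("$dir/*"), bsd_glob("$dir/.*"));

print "$_\n" for @all;
```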

Party answered 12/12, 2011 at 20:13 Comment(3)
I'm running v5.10 and glob "*" includes both ., .. and hidden files and directories. - Damselfly
@Party : The $] in the codepad paste is confusing. What does $^V give you? - Perambulate
@Perambulate version which is not a character recognized by my browser. Not sure why that happens, but it's on codepad's end. So I used $] instead. - Party
