Making a perl array unique
I am currently having a very simple problem with capturing the output from a backticked shell command. I apologize that the problem is rather simple.

I have some sorted array (@valid_runs) which I know contains consecutive, duplicate elements. I want to use backticks to echo this array to uniq. I want to capture the STDOUT in an array. I attempt to do so like this.

@unique_valids = `echo '@valid_runs' | uniq`;
print @unique_valids;

This print statement yields nothing. For that matter neither does this.

@unique_valids = `echo '@valid_runs'`;
print @unique_valids;

I know how to use uniq and echo. This seems rather odd to me. I would think this has more to do with perl arrays than proper use of those commands. I have searched around a bit elsewhere, so please don't pelt me with downvotes just because the solution might seem trivial. Thanks again for your time.

NOTE on solutions: TLP's solution is the most straightforward as far as handling the uniq problem goes. I am being rather flexible, since all responses suggested not making a system call for this problem. If Perl's uniq function behaves the same as Unix's uniq, then the array ought to remain sorted.

John Corbett's solution works well if you don't care about a sorted result.

Plastid answered 3/8, 2012 at 18:46 Comment(4)
One potential problem: do the elements of @valid_runs end in a newline? Otherwise echo will only produce one line of output as input to uniq. Nalda
Are you sure @valid_runs is not empty? This prints something for me: @x=(3,2,1);@y = `echo '@x'`;print @y; Nalda
@user5402 I am sure that @valid_runs is not empty. I am sure that each entry has a newline at the end of the string. Your example works, which is nice; it shows me that I am not proposing anything too crazy. Here is a snippet of @valid_runs: /raid1/home/pharmacy/morguna/1experiments_copy/1experiments/test10-10/run36415 /raid1/home/pharmacy/morguna/1experiments_copy/1experiments/test10-10/run36415 /raid1/home/pharmacy/morguna/1experiments_copy/1experiments/test10-10/run36416 /raid1/home/pharmacy/morguna/1experiments_copy/1experiments/test10-10/run36416 Plastid
For a discussion of all sorts of things you might want to do with arrays in Perl, such as getting the difference between two arrays, see the Perl FAQ: perldoc.perl.org/perlfaq4.html#Data:-Arrays Pachisi
9

Using system calls for something that can easily be accomplished with perl code is not a good idea. The module List::MoreUtils has a uniq function that does what you require:

use List::MoreUtils qw(uniq);

my @unique = uniq @runs;

The subroutine inside the module itself is very simple, though, exactly like theglauber's answer:

sub uniq (@) {
    my %seen = ();
    grep { not $seen{$_}++ } @_;
}
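
As a rough illustration of how this idiom behaves, here is a minimal sketch. The sub name uniq_in_order and the sample data are made up for the example (the run names are shortened stand-ins for the paths in the question). It keeps the first occurrence of each element in its original position, so a sorted input stays sorted. Unlike Unix uniq, it also drops duplicates that are not adjacent.

```perl
use strict;
use warnings;

# Hand-rolled version of the %seen idiom; the name uniq_in_order is
# hypothetical and used only for this illustration.
sub uniq_in_order {
    my %seen;
    # grep keeps an element only the first time it is seen
    return grep { !$seen{$_}++ } @_;
}

# Sample data (assumed, not taken verbatim from the question)
my @sorted   = qw(run36415 run36415 run36416 run36416);
my @unsorted = qw(b a b c a);

my @unique_sorted   = uniq_in_order(@sorted);
my @unique_unsorted = uniq_in_order(@unsorted);

print "@unique_sorted\n";     # run36415 run36416
print "@unique_unsorted\n";   # b a c
```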
Poul answered 3/8, 2012 at 20:21 Comment(1)
Yeah, for the current solution I am just implementing something within Perl. Making unnecessary system calls is bad. I was just being lazy since I already knew of uniq. Thanks for your time. Plastid
8

You should just store the array in a hash, because hash keys are always unique. You can do that like this:

my %temp_hash = map { $_ => 1 } @valid_runs;
my @unique_valids = keys %temp_hash;

That's the Perl way of doing it anyway. There's no need to use backticks here (I try to avoid those as much as I can).
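
Because keys returns hash keys in no particular order, the result of this approach is not guaranteed to stay sorted. If a sorted result matters, one option is to sort the keys afterwards. This is a minimal sketch with made-up sample data (shortened run names), and it assumes plain string sorting is the order you want:

```perl
use strict;
use warnings;

# Sample data (assumed for illustration)
my @valid_runs = qw(run36415 run36415 run36416 run36416);

# Hash keys are unique, but keys() returns them in no particular order
my %temp_hash = map { $_ => 1 } @valid_runs;

# Sorting the keys gives a predictable, string-sorted result
my @unique_valids = sort keys %temp_hash;

print "$_\n" for @unique_valids;
```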

Attis answered 3/8, 2012 at 18:55 Comment(3)
This doesn't preserve the order of the array. Walter
He didn't say it needed to be preserved. Attis
In this context it is not essential; however, I do wish to be able to do something as simple as throwing an array into shell commands. @JohnCorbett Thanks for your solution. It achieves what needs to be done, but I still want to know how to do what I proposed. Plastid
5

It's easy to do this in perl. Here's a rather obscure but fun way to dedup an array:

@dedup = grep !$seen{$_}++, @orig_array;

Figure out what this is doing by checking the documentation for the perl function grep.

If you have to use uniq, you probably need to put each array element on a separate line.

join("\n", @your_array)

should achieve that.
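
One way this could look end to end is sketched below. It is only a sketch under some assumptions: a Unix-like system with uniq on the PATH, made-up sample data, and a temporary file (via the core File::Temp module) as the way to hand the lines to uniq. The list form of a pipe open runs uniq directly rather than through a shell, which sidesteps quoting problems with the array contents.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Sample data (assumed); each entry ends in a newline, as described
# in the question's comments
my @valid_runs = ("run36415\n", "run36415\n", "run36416\n", "run36416\n");

# Strip trailing newlines, then write one element per line to a temp file
chomp(my @lines = @valid_runs);
my ($fh, $filename) = tempfile(UNLINK => 1);
print {$fh} join("\n", @lines), "\n";
close $fh or die "close failed: $!";

# List-form pipe open: runs uniq on the file without involving a shell
open(my $uniq_out, '-|', 'uniq', $filename) or die "cannot run uniq: $!";
chomp(my @unique_valids = <$uniq_out>);
close $uniq_out;

print "$_\n" for @unique_valids;
```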

Mccrae answered 3/8, 2012 at 18:56 Comment(0)
-1
#!/usr/bin/perl
use warnings;

@a = (1, 2, 3, 3, 4, 4, 5);

$cmd = "/usr/bin/uniq <<EOF\n";
$cmd .= $_."\n" foreach (@a);
$cmd .= "EOF\n";

$result = `$cmd`;
print "Cmd: $cmd\n";
print "Result is $result";

@u = split /\n/,$result;
print "After ", join(" ", @u), "\n";

This does what you ask, but theglauber's answer is still better Perl.

Walter answered 3/8, 2012 at 20:7 Comment(0)
