Perl: mapping to lists' first element

Asked 20/1, 2012 at 18:4 Answered 21/1, 2012 at 18:48

Solved arrays perl hash dictionary undef

Task: build a hash using map, where the keys are the elements of a given array @a and the values are the first elements of the lists returned by some function f($element_of_a):

my @a = (1, 2, 3);
my %h = map {$_ => (f($_))[0]} @a;

All is fine until f() returns an empty list (which is perfectly valid for f(); in that case I'd like to assign undef). The error can be reproduced with the following code:

my %h = map {$_ => ()[0]} @a;

The error message reads "Odd number of elements in hash assignment". When I rewrite the code in either of the following two ways:

my @a = (1, 2, 3);
my $s = ()[0];
my %h = map {$_ => $s} @a;

my @a = (1, 2, 3);
my %h = map {$_ => undef} @a;

Perl does not complain at all.

So how can I get the first element of the list returned by f() when that list may be empty?

Perl version is 5.12.3

Thanks.

Assimilable answered 20/1, 2012 at 18:4 Comment(1)

Wrap the call to f so that when it returns an empty list, you supply undef or otherwise the first element of the list it returned. – Sheepish 20/1, 2012 at 18:16

I've just played around a bit, and it seems that ()[0], in list context, is interpreted as an empty list rather than as an undef scalar. For example, this:

my @arr = ()[0];
my $size = @arr;
print "$size\n";

prints 0. So $_ => ()[0] is roughly equivalent to just $_.
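A small sketch of that equivalence, counting the elements map produces in each case. The variable names @flat and @pairs are made up for illustration:

```perl
use strict;
use warnings;

my @a = (1, 2, 3);

# Each iteration contributes only the key, because ()[0] adds nothing
# to the list in list context.
my @flat = map { $_ => ()[0] } @a;

# An explicit undef is a real one-element value, so each iteration
# contributes a key and a value.
my @pairs = map { $_ => undef } @a;

printf "with ()[0]: %d elements, with undef: %d elements\n",
    scalar(@flat), scalar(@pairs);
# prints "with ()[0]: 3 elements, with undef: 6 elements"
```

Three elements is an odd count, which is where the hash assignment complaint comes from.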

To fix it, you can use the scalar function to force scalar context:

my %h = map {$_ => scalar((f($_))[0])} @a;

or you can append an explicit undef to the end of the list:

my %h = map {$_ => (f($_), undef)[0]} @a;

or you can wrap your function's return value in a true array (rather than just a flat list):

my %h = map {$_ => [f($_)]->[0]} @a;

(I like that last option best, personally.)
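Here is a self-contained sketch that runs all three variants side by side. The f below is only a hypothetical stand-in (it returns an empty list for even input), since the real f isn't shown in the question:

```perl
use strict;
use warnings;

# Hypothetical stand-in for f(): returns an empty list for even input,
# otherwise a two-element list.
sub f {
    my ($n) = @_;
    return $n % 2 ? ("first-$n", "second-$n") : ();
}

my @a = (1, 2, 3);

# Variant 1: force scalar context on the list slice.
my %h_scalar = map { $_ => scalar((f($_))[0]) } @a;

# Variant 2: append undef so the list being sliced is never empty.
my %h_append = map { $_ => (f($_), undef)[0] } @a;

# Variant 3: copy the result into an anonymous array and index that.
my %h_aref = map { $_ => [f($_)]->[0] } @a;

for my $h (\%h_scalar, \%h_append, \%h_aref) {
    print join(", ",
        map { sprintf("%s => %s", $_, $h->{$_} // "undef") } sort keys %$h),
        "\n";
}
# each of the three lines printed is: 1 => first-1, 2 => undef, 3 => first-3
```

All three hashes end up with three keys, and the key whose f call returned nothing maps to undef.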

The special behavior of a slice of an empty list is documented under “Slices” in perldata:

A slice of an empty list is still an empty list. […] This makes it easy to write loops that terminate when a null list is returned:
while ( ($home, $user) = (getpwent)[7,0]) {
    printf "%-8s %s\n", $user, $home;
}

Eastlake answered 20/1, 2012 at 18:16 Comment(8)

I have edited in a cite to the documentation which explains why ()[0] returns an empty list instead of undef. If you do not approve, please feel free to revert my edit (or better yet, improve it). – Crankpin 20/1, 2012 at 19:30

@derobert: I strongly approve. Thank you very much! – Eastlake 20/1, 2012 at 20:38

Nothing wrong with the analysis here, but that's a lot of line noise. I'd personally prefer to have f handle the edge case as it makes the code more maintainable that way. Of course, if there is no control over the definition of f, then that is a different matter altogether – Bewick 20/1, 2012 at 21:57

@Zaid: We don't have enough information to really say for sure, but I'm inclined to disagree. f is designed to return a list of values; presumably this line of code, which discards all but the first value, is the exception rather than the rule, and presumably it would cause maintenance headache if every other call to f had to explicitly check for the case that it returned undef and translate that back to the empty list. (Note that the OP writes that it's "absolutely correct for" "f() [to] return[] an empty list". This implies that f currently has a meaningful, cohesive definition.) – Eastlake 20/1, 2012 at 22:15

Instead of the "change f" route, there is also the "wrap f" route. Could certainly make a sub g { my $n = shift; ( f($n) )[0] // undef } # or any of the alternative ways to write this, if you're doing that map a lot. Or if its with a bunch of functions, you could make a higher-order version of g as well to dynamically wrap things. – Crankpin 21/1, 2012 at 10:24
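A rough sketch of what such a higher-order wrapper might look like. The name first_or_undef and the stand-in f are assumptions made up for this example, not anything from the original code:

```perl
use strict;
use warnings;

# Hypothetical helper: wrap any list-returning sub so that the wrapper
# always returns exactly one scalar (the first element, or undef).
sub first_or_undef {
    my ($code) = @_;
    return sub {
        my @result = $code->(@_);
        return $result[0];    # undef when @result is empty
    };
}

# Hypothetical stand-in for f(): empty list for even input.
sub f {
    my ($n) = @_;
    return $n % 2 ? ("first-$n", "second-$n") : ();
}

my $g = first_or_undef(\&f);
my %h = map { $_ => $g->($_) } (1, 2, 3);

printf "%s => %s\n", $_, $h{$_} // "undef" for sort keys %h;
```

Because the wrapper returns an array element rather than a list slice, it yields a single undef for an empty result, so the map always produces key/value pairs.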

derobert, many thanks for your reply. Could you clarify why the length of a slice of an empty list, ()[...], is zero, while the length of []->[...] (in list context, I mean) is one? Yes, I got a length of 1 not only for []->[0] but also for []->[0, 1] and any other list of indices. – Assimilable 22/1, 2012 at 6:53

I've just checked one more form, @{[]}[...], and found that the length of that slice is equal to the length of .... For example, for @s = @{[]}[1, 1, 1], scalar(@s) is 3 :-) – Assimilable 22/1, 2012 at 7:7

Got it: in []->[...] the index list is evaluated in scalar context, so only the last index is used and a single item is returned, while @{[]}[...] is a slice in list context and returns a list built from all the specified items. – Assimilable 22/1, 2012 at 10:55
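The three forms discussed in these comments can be compared directly. This is only a sketch; the variable names are made up for illustration:

```perl
use strict;
use warnings;

# List slice of an empty list: yields an empty list in list context.
my @list_slice = ()[0, 1];

# Arrow subscript: the comma expression inside [...] is evaluated in scalar
# context, so this is a single element lookup (index 1), not a slice.
my @elem;
{
    no warnings 'void';    # silence any void-context warning for the discarded index
    @elem = []->[0, 1];
}

# Array slice of a dereferenced anonymous array: one result per index.
my @array_slice = @{[]}[1, 1, 1];

printf "list slice: %d, element lookup: %d, array slice: %d\n",
    scalar(@list_slice), scalar(@elem), scalar(@array_slice);
# prints "list slice: 0, element lookup: 1, array slice: 3"
```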

I second Jonathan Leffler's suggestion - the best thing to do would be to solve the problem from the root if at all possible:

sub f {

    # ... process @result

    return @result ? $result[0] : undef ;
}

The explicit undef is necessary for the empty list problem to be circumvented.

Bewick answered 20/1, 2012 at 21:55 Comment(0)

First of all, many thanks to everyone who replied! I now feel I should provide the details of the actual task.

I'm parsing an XML file containing a set of elements, each of which looks like this:

<element>
    <attr_1>value_1</attr_1>
    <attr_2>value_2</attr_2>
    <attr_3></attr_3>
</element>

My goal is to create a Perl hash for each element that contains the following keys and values:

('attr_1' => 'value_1',
 'attr_2' => 'value_2',
 'attr_3' =>  undef)

Let's take a closer look at the <attr_1> element. The XML::DOM::Parser CPAN module that I use for parsing creates an object of class XML::DOM::Element for it; let's call the reference to that object $attr. The element's name is easily obtained with $attr->getNodeName, but to access the text enclosed in the <attr_1> tags one first has to retrieve all of <attr_1>'s child nodes:

my @child_ref = $attr->getChildNodes;

For the <attr_1> and <attr_2> elements, ->getChildNodes returns a list containing exactly one reference (to an object of class XML::DOM::Text), while for <attr_3> it returns an empty list. For <attr_1> and <attr_2> I get the value with $child_ref[0]->getNodeValue, while for <attr_3> I should place undef into the resulting hash, since it has no text nodes.

So you see that the implementation of f (the ->getChildNodes method in real life) is outside my control :-) The code I ended up writing is below (the subroutine is passed a list of XML::DOM::Element references for the elements <attr_1>, <attr_2>, and <attr_3>):

sub attrs_hash(@)
{
    my @keys = map {$_->getNodeName} @_;  # got ('attr_1', 'attr_2', 'attr_3')
    my @child_refs = map {[$_->getChildNodes]} @_;  # got 3 refs to list of XML::DOM::Text objects
    my @values = map {@$_ ? $_->[0]->getNodeValue : undef} @child_refs;  # got ('value_1', 'value_2', undef)

    my %hash;
    @hash{@keys} = @values;

    %hash;
}

Assimilable answered 21/1, 2012 at 18:48 Comment(2)

I wish you had mentioned this up front. You'll only get an answer as good as the question you ask. Too bad that this information wasn't made available before. – Bewick 22/1, 2012 at 9:57

Why? I think I got perfect answers, which helped me clarify a lot of points about lists and slices :-) – Assimilable 22/1, 2012 at 12:28
