How do the Perl 6 set operations compare elements?

Running under moar (2016.10)

Consider this code that constructs a set and tests for membership:

my $num_set = set( < 1 2 3 4 > );
say "set: ", $num_set.perl;
say "4 is in set: ", 4 ∈ $num_set;
say "IntStr 4 is in set: ", IntStr.new(4, "Four") ∈ $num_set;
say "IntStr(4,...) is 4: ", IntStr.new(4, "Four") == 4;
say "5 is in set: ", 5 ∈ $num_set;

A straight 4 is not in the set, but the IntStr version is:

set: set(IntStr.new(4, "4"),IntStr.new(1, "1"),IntStr.new(2, "2"),IntStr.new(3, "3"))
4 is in set: False
IntStr 4 is in set: True
IntStr(4,...) is 4: True
5 is in set: False

I think most people aren't going to expect this, but the docs don't say anything about how this might work. I don't have this problem if I don't use the quote words (i.e. set( 1, 2, 3, 4)).

Crum answered 26/11, 2016 at 4:32 Comment(0)
C
1

I think this is a bug, but not in the set stuff. The other answers were very helpful in sorting out what was important and what wasn't.

I used the angle-brackets form of the quote words. The quote words form is supposed to be equivalent to the quoting version (that is, True under eqv). Here's the doc example:

<a b c> eqv ('a', 'b', 'c')

But, when I try this with a word that is all digits, this is broken:

 $ perl6
 > < a b 137 > eqv ( 'a', 'b', '137' )
 False

But, the other forms work:

> qw/ a b 137 / eqv ( 'a', 'b', '137' )
True
> Q:w/ a b 137 / eqv ( 'a', 'b', '137' )
True

The angle-bracket word quoting uses IntStr:

> my @n = < a b 137 >
[a b 137]
> @n.perl
["a", "b", IntStr.new(137, "137")]

Without the word quoting, the digits word comes out as a Str:

> ( 'a', 'b', '137' ).perl
("a", "b", "137")
> ( 'a', 'b', '137' )[*-1].perl
"137"
> ( 'a', 'b', '137' )[*-1].WHAT
(Str)
> my @n = ( 'a', 'b', '137' );
[a b 137]
> @n[*-1].WHAT
(Str)

You typically see these sorts of errors when there are two code paths to get to a final result instead of shared code that converges to one path very early. That's what I would look for if I wanted to track this down (but, I need to work on the book!)

This does highlight, though, that you have to be very careful about sets. Even if this bug were fixed, there are other, non-buggy ways that eqv can fail. I would have still failed because 4 as Int is not "4" as Str. I think this level of attention to data types is unperly in its DWIMery. It's certainly something I'd have to explain very carefully in a classroom and still watch everyone mess up on it.

For what it's worth, I think the results of gist tend to be misleading in their oversimplification, and sometimes the results of perl aren't rich enough (e.g. hiding Str which forces me to .WHAT). The more I use those, the less useful I find them.
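
One way around that is to ask each element for its type name directly instead of reading it off .perl or .WHAT. This is only a minimal sketch, and the variable names are made up for illustration:

```perl6
# Angle brackets run each word through val(), so numeric-looking
# words come back as allomorphs (IntStr here) rather than plain Str.
my @angle  = < a b 137 >;
my @quoted = ( 'a', 'b', '137' );

say @angle.map(  *.^name );   # (Str Str IntStr)
say @quoted.map( *.^name );   # (Str Str Str)

# The text of the two lists is the same even though the types differ.
say @angle.join(' ') eq @quoted.join(' ');   # True
```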

But, knowing that I messed up before I even started would have saved me from that code spelunking that ended up meaning nothing!

Crum answered 26/11, 2016 at 23:29 Comment(8)
Could you clarify what you consider the bug to be? As far as I can tell, this is all by design: (a) <...> goes through &val, which returns allomorphs if possible (b) set membership is defined in terms of identity, which distinguishes between allomorphs and their corresponding value types; so I would not classify it as a bug, but 'broken' by design; or phrased another way, it's just the WAT that comes with this particular DWIM - Enhanced
This was intentionally added, and is part of the testsuite. ( I can't seem to find anywhere that tests for <…> being equivalent to q:w:v<…> and <<…>>/«…» being equivalent to qq:ww:v<<…>> ) - Acth
The docs say the two lists should be eqv, and they are not. If they are not meant to be equivalent, the docs need to change. Nothing in docs.perl6.org/language/quoting#Word_quoting:_qw mentions any of this stuff. - Crum
The documentation seems to be just wrong here: <...> does not correspond to qw(...), but to qw:v(...). Cf. S02 for the description of the adverb and this test that Brad already linked to - Enhanced
or perhaps not outright wrong, but rather 'just' misleading: <...> is indeed a :w form, and the given example code does compare equal according to eqv - Enhanced
Zoffix fixed the docs in github.com/perl6/doc/commit/… - Crum
@Permissive I think you should undelete your answer. It's good stuff. I think Perl 6 is in that tough step where it works and makes sense for the people creating it, but now it's time for the people who just want to use it. That's a different audience that will give up much sooner than I do. Zoffix's change to the docs goes a long way toward that. - Crum
@Permissive that makes sense :) - Crum
B
7

You took a wrong turn in the middle. The important part is what nqp::existskey is called with: the k.WHICH. This method is there for value types, i.e. immutable types where the value - rather than identity - defines if two things are supposed to be the same thing (even if created twice). It returns a string representation of an object's value that is equal for two things that are supposed to be equal. For <1>.WHICH you get IntStr|1 and for 1.WHICH you get just Int|1.
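
To see this for yourself, you can print the .WHICH of both values side by side. The following is a small sketch with illustrative variable names; the exact strings that .WHICH returns can differ between Rakudo versions, so the code prints them rather than assuming them:

```perl6
my $plain     = 4;                    # a plain Int
my $allomorph = IntStr.new(4, "4");   # what < 4 > produces

# The exact strings depend on the Rakudo version, so just print them.
say $plain.WHICH;
say $allomorph.WHICH;

# Different .WHICH strings mean different keys inside a set.
say $plain.WHICH eq $allomorph.WHICH;   # False
```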

Blubber answered 26/11, 2016 at 5:47 Comment(1)
Ah, okay. I can see a lot of pain for regular people trying to debug these things. - Crum
D
4

As explained in the Set documentation, sets compare object identity, same as the === operator:

Within a Set, every element is guaranteed to be unique (in the sense that no two elements would compare positively with the === operator)

The identity of an object is defined by the .WHICH method, as timotimo elaborates in his answer.
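
A short sketch of the difference between numeric equality and identity, using made-up variable names, might look like this:

```perl6
my $int       = 4;
my $allomorph = IntStr.new(4, "4");

say $int ==  $allomorph;            # True:  numeric comparison
say $int === $allomorph;            # False: identity, Int vs IntStr

# Set membership follows ===, not ==.
say $int       ∈ set($allomorph);   # False
say $allomorph ∈ set($allomorph);   # True
```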

Drabbet answered 26/11, 2016 at 14:46 Comment(3)
That's not really clear from that statement. That's talking about which elements are in the set. Beyond that, even if you choose to compare with ===, you have to know how other things are stored. This is the sort of info that should show up next to the Set operators. - Crum
Indeed, I think I've found a bug. The qw docs say this should be true: < a b 137 > eqv ( 'a', 'b', '137' ), but in the same version of Rakudo Star I get False. It's different object types on each side. - Crum
Despite all this, your answer was the A-ha! moment that led me to look at the right thing. Thanks for all of your help. - Crum
P
2

Write your list of numbers using commas

As you mention in your answer, your code works if you write your numbers as a simple comma separated list rather than using the <...> construct.

Here's why:

4 ∈ set 1, 2, 3, 4 # True

A bare numeric literal in code, like the 4 to the left of ∈, constructs a single value with a numeric type. (In this case the type is Int, an integer.) If a set constructor receives a list of similar literals on the right then everything works out fine.

<1 2 3 4> produces a list of "dual values"

The various <...> "quote words" constructs turn the list of whitespace separated literal elements within the angle brackets into an output list of values.

The foundational variant (qw<...>) outputs nothing but strings. Using it for your use case doesn't work:

4 ∈ set qw<1 2 3 4> # False

The 4 on the left constructs a single numeric value, type Int. In the meantime the set constructor receives a list of strings, type Str: ('1','2','3','4'). The ∈ operator doesn't find an Int in the set because all the values are Strs, so it returns False.

Moving along, the huffmanized <...> variant outputs Strs unless an element is recognized as a number. If an element is recognized as a number then the output value is a "dual value". For example a 1 becomes an IntStr.

According to the doc "an IntStr can be used interchangeably where one might use a Str or an Int". But can it?

Your scenario is a case in point. While 1 ∈ set 1,2,3 and <1> ∈ set <1 2 3> both work, 1 ∈ set <1 2 3> and <1> ∈ set 1, 2, 3 both return False.

So it seems the ∈ operator isn't living up to the quoted doc's claim of dual value interchangeability.

This may already be recognized as a bug in the set operation and/or other operations. Even if not, this sharp "dual value" edge of the <...> list constructor may eventually be viewed as sufficiently painful that Perl 6 needs to change.
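
In the meantime, if you want to keep the <...> syntax, one possible workaround is to coerce each word to a plain Int before building the set. This is just a sketch; @words and $as-int are hypothetical names:

```perl6
my @words = < 1 2 3 4 >;                  # IntStr allomorphs

# Coerce every allomorph to a plain Int before it goes into the set.
my $as-int = set( @words.map( *.Int ) );

say 4 ∈ set(@words);    # False: the set holds IntStr values
say 4 ∈ $as-int;        # True:  the set holds plain Ints
```

The cost is that the string half of each dual value is thrown away, which is fine when you only care about the numbers.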

Permissive answered 27/11, 2016 at 4:50 Comment(0)
B
1

Just to add to the other answers and point out a consistency here between sets and object hashes.

An object hash is declared as my %object-hash{Any}. This effectively hashes on each object's .WHICH method, which is similar to how sets distinguish individual members.

Substituting the set with an object hash:

my %obj-hash{Any};

%obj-hash< 1 2 3 4 > = Any;
say "hash: ", %obj-hash.keys.perl;
say "4 is in hash: ", %obj-hash{4}:exists;
say "IntStr 4 is in hash: ", %obj-hash{ IntStr.new(4, "Four") }:exists;
say "IntStr(4,...) is 4: ", IntStr.new(4, "Four") == 4;
say "5 is in hash: ", %obj-hash{5}:exists;

gives similar results to your original example:

hash: (IntStr.new(4, "4"), IntStr.new(1, "1"), IntStr.new(2, "2"), IntStr.new(3, "3")).Seq
4 is in hash: False
IntStr 4 is in hash: True
IntStr(4,...) is 4: True
5 is in hash: False
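
For contrast, an ordinary hash coerces its keys to Str, so the same lookup succeeds there. A small sketch, with %plain and %typed as illustrative names:

```perl6
my %plain;                   # ordinary hash: keys are coerced to Str
my %typed{Any};              # object hash: keys keep their identity

%plain< 1 2 3 4 > = True xx 4;
%typed< 1 2 3 4 > = True xx 4;

say %plain{4}:exists;                       # True
say %typed{4}:exists;                       # False
say %typed{ IntStr.new(4, "4") }:exists;    # True
```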
Billet answered 27/11, 2016 at 18:33 Comment(1)
I agree it's not great, as it is. - Billet
