Why does the =~ operator only sometimes have side effects?

I've noticed a side effect in Ruby/Oniguruma that is only present in 1 out of 4 seemingly equivalent statements. Why is the variable day defined in 009, but not in 003, 005 or 007?

irb(main):001:0> r = /(?<day>\d\d):(?<mon>\d\d)/
=> /(?<day>\d\d):(?<mon>\d\d)/

irb(main):002:0> r =~ "24:12"
=> 0
irb(main):003:0> day
NameError: undefined local variable or method `day' 

irb(main):004:0> "24:12" =~ r
=> 0
irb(main):005:0> day
NameError: undefined local variable or method `day'


irb(main):006:0> "24:12" =~ /(?<day>\d\d):(?<mon>\d\d)/
=> 0
irb(main):007:0> day
NameError: undefined local variable or method `day'


irb(main):008:0> /(?<day>\d\d):(?<mon>\d\d)/ =~ "24:12"
=> 0
irb(main):009:0> day
=> "24"

nb#1: It's the same regex and the same string in all four cases.

nb#2: I've verified the behavior in MS Windows and Ubuntu Linux.

Eclogue answered 25/5, 2011 at 12:32 Comment(1)
Note: although this is not the case here, you should be careful when playing around with local variables in IRb. They may behave slightly differently in IRb than in a script, due to the way the code is being evaluated in IRb. Always write scripts to confirm. - Metre
C
13

When you write "24:12" =~ r, you are actually calling "24:12".=~(r), that is, String#=~. It simply returns the position where the match starts, or nil if there is no match.

But when you write /(?<day>\d\d):(?<mon>\d\d)/ =~ "24:12", you are actually calling Regexp#=~.

And as the documentation for Regexp#=~ says:

If =~ is used with a regexp literal with named captures, captured strings (or nil) is assigned to local variables named by the capture names.
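A minimal sketch of the case the documentation describes, with the regexp literal on the left-hand side. The method name parse_day_mon is made up for illustration; the regex is the one from the question.

```ruby
# A regexp literal with named captures on the left-hand side of =~
# assigns the captured strings to local variables of the same name.
# parse_day_mon is a hypothetical helper, not part of any library.
def parse_day_mon(text)
  if /(?<day>\d\d):(?<mon>\d\d)/ =~ text
    [day, mon] # both locals were assigned by the match
  end
end

p parse_day_mon("24:12")
p parse_day_mon("no match")
```

The first call should print ["24", "12"]; the second prints nil because the if body is never entered.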

As for 003, the same documentation continues:

The assignment is not occur if the regexp is not a literal.

   re = /(?<lhs>\w+)\s*=\s*(?<rhs>\w+)/
   re =~ "  x = y  "
   p lhs    # undefined local variable
   p rhs    # undefined local variable

and

The assignment is not occur if the regexp is placed at right hand side.
   " x = y " =~ /(?<lhs>\w+)\s*=\s*(?<rhs>\w+)/
   p lhs, rhs # undefined local variable
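If you need the named captures in one of the forms that does not assign locals, the match data is still available. Here is a rough sketch using Regexp.last_match and Regexp#match; the constant DAY_MON and both method names are only illustrative.

```ruby
# The regex from the question, stored in a constant (so not a literal at
# the point where =~ is used). DAY_MON is a hypothetical name.
DAY_MON = /(?<day>\d\d):(?<mon>\d\d)/

# Option 1: String#=~ does not create locals, but it still records the
# last match, which can be read through Regexp.last_match (or $~).
def day_via_last_match(text)
  return nil unless text =~ DAY_MON
  Regexp.last_match[:day]
end

# Option 2: ask for the MatchData object directly and index it by name.
def day_via_match(text)
  m = DAY_MON.match(text)
  m && m[:day]
end

p day_via_last_match("24:12")
p day_via_match("24:12")
```

Both calls should print "24". Using the MatchData explicitly also avoids depending on which side of =~ the regex happens to be.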

Connote answered 25/5, 2011 at 12:49 Comment(5)
Thanks Nash. A very good answer. But can you also explain 003 in the question, i.e. a compiled regex (not a regex literal) receives the =~ message and it doesn't assign the local variable. - Celebrity
"Captured strings is assigned...", "The assignment is not occur..."? Strange grammar... - Correct
@Tim Pietzcker, you can make it better! ;) ruby-doc.org/documentation-guidelines.html - Connote
@Tim Pietzcker: the author of that paragraph would probably say the same about your Japanese ;-) - Metre
I just made an error report to ruby about the mistake in the doc (comment). - Mamoun
A
1

I believe 003 isn't supported because it's a full-blown Regexp object in Rubyland at that point, possibly with overridden methods and such. That makes the scope of assigned locals a lot more complicated.
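For what it's worth, the Regexp#=~ documentation describes the assignment as implemented in the Ruby parser, which looks for a regexp literal on the left of =~. The sketch below seems consistent with that; the method names are invented for the demonstration, and Binding#local_variable_defined? needs Ruby 2.1 or later.

```ruby
# The parser declares the capture names as locals as soon as it sees
# "regexp literal =~ expression", even if that code never runs.
def literal_declared_by_parser
  if false
    /(?<year>\d{4})/ =~ "2011" # never executed
  end
  year # already a declared local, so this is nil rather than a NameError
end

# With a Regexp held in a variable, the parser has nothing to look at,
# so no local named `year` is ever declared.
def variable_regexp_declares_nothing
  re = /(?<year>\d{4})/
  re =~ "2011"
  binding.local_variable_defined?(:year)
end

p literal_declared_by_parser
p variable_regexp_declares_nothing
```

This should print nil and then false: in the first method the local exists but was never assigned, and in the second it does not exist at all.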

Afterbrain answered 25/5, 2011 at 13:52 Comment(1)
Thanks, James. I had another idea about the intention from the Ruby creators, but you're probably right that it's a technical issue. - Celebrity
