Scala pattern matching with lowercase variable name

I found that when using pattern matching with alternatives (for strings), Scala accepts variables starting with upper case (in the example below, MyValue1 and MyValue2), but not those starting with lower case (myValue1, myValue2). Is this a bug or a feature of Scala? I get this in version 2.8. If this is a feature, can anyone explain the rationale behind it? This is the code I used:

val myValue1 = "hello"
val myValue2 = "world"
val MyValue1 = "hello"
val MyValue2 = "world"

var x:String = "test"

x match {
  case MyValue1 | MyValue2 => println ("first match")
  case myValue1 | myValue2 => println ("second match")
}

On running, I get the following:

scala> val myValue1 = "hello"
myValue1: java.lang.String = hello

scala> val myValue2 = "world"
myValue2: java.lang.String = world

scala> val MyValue1 = "hello"
MyValue1: java.lang.String = hello

scala> val MyValue2 = "world"
MyValue2: java.lang.String = world

scala> var x:String = "test"
x: String = test

scala> x match {
 |   case MyValue1 | MyValue2 => println ("first match")
 |   case myValue1 | myValue2 => println ("second match")
 | }
<console>:11: error: illegal variable in pattern alternative
     case myValue1 | myValue2 => println ("second match")
          ^
<console>:11: error: illegal variable in pattern alternative
     case myValue1 | myValue2 => println ("second match")
                     ^

EDIT:

So it is indeed a feature and not a bug... Can anyone provide an example when this might be useful?

When I use:

x match {
   case myValue1 => println ("match")
   case _ => 
}

I get an unreachable code warning on the last case, implying that the first one always matches.

Pyatt answered 18/12, 2010 at 18:42 Comment(4)
This is one of the common programming mistakes in Scala: #1333074. I highly recommend reading that entire thread - it's got a bunch of other gotchas like this one. - Pallmall
Thanks for the great reference. - Pyatt
A useful example: x match { case myValue1:String => println("match: "+myValue1) ; case _ => } --> myValue1 becomes a local variable. - Sopher
"Great thread" -> of COURSE SO deletes it for "reasons of moderation". +1 pedantry, -1 usefulness. - Intuitive

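This is not specific to patterns with alternatives, and it is not a bug. An identifier that begins with a lowercase letter in a pattern represents a new variable that will be bound if the pattern matches. Alternatives are not allowed to bind variables, which is why the compiler reports "illegal variable in pattern alternative" for your second case.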

So, your example is equivalent to writing:

x match {
   case MyValue1 | MyValue2 => println ("first match")
   case y | z => println ("second match")
}

You can work around this by using backticks:

x match {
   case MyValue1 | MyValue2 => println ("first match")
   case `myValue1` | `myValue2` => println ("second match")
}
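
A minimal, self-contained sketch of the backtick workaround, so the difference can be checked by running it. The object name `BacktickDemo` and the method `classify` are made up for illustration; only the two vals come from the question.

```scala
object BacktickDemo {
  val myValue1 = "hello"
  val myValue2 = "world"

  // Backticks turn the lowercase names into stable identifier patterns,
  // so the input is compared (with ==) against the existing vals
  // instead of being bound to new variables.
  def classify(x: String): String = x match {
    case `myValue1` | `myValue2` => "matched an existing value"
    case _                       => "no match"
  }
}
```

With this, `BacktickDemo.classify("hello")` takes the first case, while `BacktickDemo.classify("test")` falls through to the wildcard.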
Olivaolivaceous answered 18/12, 2010 at 19:37 Comment(1)
This is a very subtle feature and many will possibly think it is a bug. Selected as the answer for giving the workaround. - Pyatt

It is a feature. Stable identifiers beginning with an uppercase letter are treated like literals for the purpose of pattern matching, and lowercase identifiers are "assigned to" so you can use the matched value for something else.

You gave an example of it not making sense:

x match {
   case myValue1 => println ("match")
   case _ => 
}

But the sense is easy to see if we change that a little:

x match {
   case MyValue1 => println("match")
   case MyValue2 => println("match")
   case other    => println("no match: "+other)
}

Of course, one could use x instead of other above, but here are some examples where that would not be convenient:

(pattern findFirstIn text) match {
    // "group1" and "group2" have been extracted, so they were not available before
    // (findFirstIn returns an Option[String], hence the Some(...) wrapper)
    case Some(pattern(group1, group2)) =>

    // "other" is the result of an expression, which you'd have to repeat otherwise
    case other =>
}

getAny match {
    // Here "s" is already a String, whereas "getAny" would have to be typecast
    case s: String =>

    // Here "i" is already an Int, whereas "getAny" would have to be typecast
    case i: Int =>
}

So there are many reasons why it is convenient for pattern matching to assign the matched value to an identifier.

Now, I think this is one of the greatest misfeatures of Scala, because it is so subtle and unique. The reasoning behind it is that, in the recommended Scala style, constants are camel-cased starting with an uppercase letter, while methods, vals, and vars (which are really methods too) are camel-cased starting with a lowercase letter. So constants are naturally treated as literals, while other names are treated as assignable identifiers (which may shadow identifiers defined in an outer context).
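
To make the shadowing point concrete, here is a small sketch contrasting the two naming styles. The names `ShadowingDemo`, `expected`, `Expected`, `bindsAnything` and `comparesToConstant` are invented for this example and do not come from the question.

```scala
object ShadowingDemo {
  val expected = "hello" // lowercase: a pattern named `expected` will NOT refer to this
  val Expected = "hello" // uppercase: a pattern named `Expected` refers to this val

  // The pattern introduces a NEW variable called `expected` that shadows
  // the outer val, so this case matches every input and binds it.
  def bindsAnything(x: String): String = x match {
    case expected => "bound: " + expected
  }

  // The uppercase name is a stable identifier pattern, so the input is
  // compared against the existing val; `other` binds whatever is left.
  def comparesToConstant(x: String): String = x match {
    case Expected => "equal to constant"
    case other    => "different: " + other
  }
}
```

Calling `bindsAnything("test")` binds `"test"` even though the outer `expected` is `"hello"`, which is exactly the trap described above; `comparesToConstant("test")` ends up in the `other` case.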

Snapshot answered 18/12, 2010 at 22:32 Comment(0)

What's happening here is that myValue1 and myValue2 are being treated as variable identifiers (i.e., the definition of new variables that are bound to the value being matched), whereas MyValue1 and MyValue2 are treated as stable identifiers that refer to values declared earlier. In a pattern, a variable identifier must be a simple name starting with a lowercase letter, which is why the first case behaves intuitively. See section 8.1 of the Scala Language Specification (http://www.scala-lang.org/docu/files/ScalaReference.pdf) for exact details.

Altering your example slightly, you can see the variable identifier:

scala> x match {
 | case MyValue1 | MyValue2 => println ("first match")
 | case myValue1 => println (myValue1)
 | }
test
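
As far as I understand the specification, the lowercase restriction only applies to simple names, so a qualified name such as `Greetings.hello` is still read as a stable identifier; a guard is another way to compare against a lowercase val. The sketch below illustrates both, with `Greetings`, `QualifiedDemo`, `viaQualifiedName` and `viaGuard` being hypothetical names chosen for this example.

```scala
object Greetings {
  val hello = "hello"
  val world = "world"
}

object QualifiedDemo {
  // A qualified name is not a simple lowercase name, so it is treated
  // as a reference to the existing val, even inside an alternative.
  def viaQualifiedName(x: String): String = x match {
    case Greetings.hello | Greetings.world => "known greeting"
    case _                                 => "unknown"
  }

  // Alternative: bind a variable and compare it explicitly in a guard.
  def viaGuard(x: String): String = {
    val myValue1 = "hello"
    x match {
      case v if v == myValue1 => "equals myValue1"
      case v                  => "something else: " + v
    }
  }
}
```

Both forms avoid renaming the vals; the guard version is more verbose but makes the comparison explicit.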
Catharinecatharsis answered 18/12, 2010 at 19:33 Comment(1)
Thanks for the reference. The exact section is 8.1.1. I could only choose one correct answer. - Pyatt

If it helps, I posted an article on this topic a week or so ago at http://asoftsea.tumblr.com/post/2102257493/magic-match-sticks-and-burnt-fingers

Sumikosumma answered 18/12, 2010 at 23:45 Comment(0)
