C# switch statement limitations - why?
Asked Answered
Y

17

151

When writing a switch statement, there appears to be two limitations on what you can switch on in case statements.

For example (and yes, I know, if you're doing this sort of thing it probably means your object-oriented (OO) architecture is iffy - this is just a contrived example!),

  Type t = typeof(int);

  switch (t) {

    case typeof(int):
      Console.WriteLine("int!");
      break;

    case typeof(string):
      Console.WriteLine("string!");
      break;

    default:
      Console.WriteLine("unknown!");
      break;
  }

Here the switch() statement fails with 'A value of an integral type expected' and the case statements fail with 'A constant value is expected'.

Why are these restrictions in place, and what is the underlying justification? I don't see any reason why the switch statement has to succumb to static analysis only, and why the value being switched on has to be integral (that is, primitive). What is the justification?

Yeomanry answered 4/9, 2008 at 22:34 Comment(3)
See possible workaround Is there a better alternative than this to 'switch on type'?Electrocardiogram
Another option for switching on built-in types is to use the TypeCode Enum.Pontius
Simply, Create an ENUM and use NameOf in Switch case. It will work as constant on a dynamic variable.Weikert
G
107

This is my original post, which sparked some debate... because it is wrong:

The switch statement is not the same thing as a big if-else statement. Each case must be unique and evaluated statically. The switch statement does a constant time branch regardless of how many cases you have. The if-else statement evaluates each condition until it finds one that is true.


In fact, the C# switch statement is not always a constant time branch.

In some cases the compiler will use a CIL switch statement which is indeed a constant time branch using a jump table. However, in sparse cases as pointed out by Ivan Hamilton the compiler may generate something else entirely.

This is actually quite easy to verify by writing various C# switch statements, some sparse, some dense, and looking at the resulting CIL with the ildasm.exe tool.

Gabey answered 4/9, 2008 at 22:51 Comment(5)
As noted in other answers (including mine), the claims made in this answer are not correct. I would recommend deletion (if only to avoid enforcing this (probably common) misconception).Granite
Please see my post below where I show, in my opinion conclusively, that the switch statement does a constant time branch.Gabey
Thank you very much for your reply, Brian. Please see Ivan Hamilton's reply ((48259)[beta.stackoverflow.com/questions/44905/#48259]). In short: you are talking about the switch instruction (of the CIL) which is not the same as the switch statement of C#.Granite
I don't believe the compiler generates constant-time branching when switching on strings either.Gruchot
Is this still applicable with pattern matching in switch case statements in C# 7.0?Probabilism
B
120

It's important not to confuse the C# switch statement with the CIL switch instruction.

The CIL switch is a jump table, that requires an index into a set of jump addresses.

This is only useful if the C# switch's cases are adjacent:

case 3: blah; break;
case 4: blah; break;
case 5: blah; break;

But of little use if they aren't:

case 10: blah; break;
case 200: blah; break;
case 3000: blah; break;

(You'd need a table ~3000 entries in size, with only 3 slots used)

With non-adjacent expressions, the compiler may start to perform linear if-else-if-else checks.

With larger non- adjacent expression sets, the compiler may start with a binary tree search, and finally if-else-if-else the last few items.

With expression sets containing clumps of adjacent items, the compiler may binary tree search, and finally a CIL switch.

This is full of "mays" & "mights", and it is dependent on the compiler (may differ with Mono or Rotor).

I replicated your results on my machine using adjacent cases:

total time to execute a 10 way switch, 10000 iterations (ms): 25.1383 approximate time per 10 way switch (ms): 0.00251383

total time to execute a 50 way switch, 10000 iterations (ms): 26.593 approximate time per 50 way switch (ms): 0.0026593

total time to execute a 5000 way switch, 10000 iterations (ms): 23.7094 approximate time per 5000 way switch (ms): 0.00237094

total time to execute a 50000 way switch, 10000 iterations (ms): 20.0933 approximate time per 50000 way switch (ms): 0.00200933

Then I also did using non-adjacent case expressions:

total time to execute a 10 way switch, 10000 iterations (ms): 19.6189 approximate time per 10 way switch (ms): 0.00196189

total time to execute a 500 way switch, 10000 iterations (ms): 19.1664 approximate time per 500 way switch (ms): 0.00191664

total time to execute a 5000 way switch, 10000 iterations (ms): 19.5871 approximate time per 5000 way switch (ms): 0.00195871

A non-adjacent 50,000 case switch statement would not compile. "An expression is too long or complex to compile near 'ConsoleApplication1.Program.Main(string[])'

What's funny here, is that the binary tree search appears a little (probably not statistically) quicker than the CIL switch instruction.

Brian, you've used the word "constant", which has a very definite meaning from a computational complexity theory perspective. While the simplistic adjacent integer example may produce CIL that is considered O(1) (constant), a sparse example is O(log n) (logarithmic), clustered examples lie somewhere in between, and small examples are O(n) (linear).

This doesn't even address the String situation, in which a static Generic.Dictionary<string,int32> may be created, and will suffer definite overhead on first use. Performance here will be dependent on the performance of Generic.Dictionary.

If you check the C# Language Specification (not the CIL spec) you'll find "15.7.2 The switch statement" makes no mention of "constant time" or that the underlying implementation even uses the CIL switch instruction (be very careful of assuming such things).

At the end of the day, a C# switch against an integer expression on a modern system is a sub-microsecond operation, and not normally worth worrying about.


Of course these times will depend on machines and conditions. I wouldn’t pay attention to these timing tests, the microsecond durations we’re talking about are dwarfed by any “real” code being run (and you must include some “real code” otherwise the compiler will optimise the branch away), or jitter in the system. My answers are based on using IL DASM to examine the CIL created by the C# compiler. Of course, this isn’t final, as the actual instructions the CPU runs are then created by the JIT.

I have checked the final CPU instructions actually executed on my x86 machine, and can confirm a simple adjacent set switch doing something like:

  jmp     ds:300025F0[eax*4]

Where a binary tree search is full of:

  cmp     ebx, 79Eh
  jg      3000352B
  cmp     ebx, 654h
  jg      300032BB
  …
  cmp     ebx, 0F82h
  jz      30005EEE
Blunger answered 7/9, 2008 at 8:47 Comment(2)
The results of your experiments surprise me a bit. Did you swap yours with Brian's? His results do show an increase with size while yours don't. I'm a missing something? In any case, thanks for the clear reply.Granite
It's difficult to accurately calcuate timing with such a small operation. We didn't share code, or testing procedures. I can't see why his times should increase for adjacent cases. Mine were 10x faster, so environments & test code may vary greatly.Blunger
G
107

This is my original post, which sparked some debate... because it is wrong:

The switch statement is not the same thing as a big if-else statement. Each case must be unique and evaluated statically. The switch statement does a constant time branch regardless of how many cases you have. The if-else statement evaluates each condition until it finds one that is true.


In fact, the C# switch statement is not always a constant time branch.

In some cases the compiler will use a CIL switch statement which is indeed a constant time branch using a jump table. However, in sparse cases as pointed out by Ivan Hamilton the compiler may generate something else entirely.

This is actually quite easy to verify by writing various C# switch statements, some sparse, some dense, and looking at the resulting CIL with the ildasm.exe tool.

Gabey answered 4/9, 2008 at 22:51 Comment(5)
As noted in other answers (including mine), the claims made in this answer are not correct. I would recommend deletion (if only to avoid enforcing this (probably common) misconception).Granite
Please see my post below where I show, in my opinion conclusively, that the switch statement does a constant time branch.Gabey
Thank you very much for your reply, Brian. Please see Ivan Hamilton's reply ((48259)[beta.stackoverflow.com/questions/44905/#48259]). In short: you are talking about the switch instruction (of the CIL) which is not the same as the switch statement of C#.Granite
I don't believe the compiler generates constant-time branching when switching on strings either.Gruchot
Is this still applicable with pattern matching in switch case statements in C# 7.0?Probabilism
H
25

The first reason that comes to mind is historical:

Since most C, C++, and Java programmers are not accustomed to having such freedoms, they do not demand them.

Another, more valid, reason is that the language complexity would increase:

First of all, should the objects be compared with .Equals() or with the == operator? Both are valid in some cases. Should we introduce new syntax to do this? Should we allow the programmer to introduce their own comparison method?

In addition, allowing to switch on objects would break underlying assumptions about the switch statement. There are two rules governing the switch statement that the compiler would not be able to enforce if objects were allowed to be switched on (see the C# version 3.0 language specification, §8.7.2):

  • That the values of switch labels are constant
  • That the values of switch labels are distinct (so that only one switch block can be selected for a given switch-expression)

Consider this code example in the hypothetical case that non-constant case values were allowed:

void DoIt()
{
    String foo = "bar";
    Switch(foo, foo);
}

void Switch(String val1, String val2)
{
    switch ("bar")
    {
        // The compiler will not know that val1 and val2 are not distinct
        case val1:
            // Is this case block selected?
            break;
        case val2:
            // Or this one?
            break;
        case "bar":
            // Or perhaps this one?
            break;
    }
}

What will the code do? What if the case statements are reordered? Indeed, one of the reasons why C# made switch fall-through illegal is that the switch statements could be arbitrarily rearranged.

These rules are in place for a reason - so that the programmer can, by looking at one case block, know for certain the precise condition under which the block is entered. When the aforementioned switch statement grows into 100 lines or more (and it will), such knowledge is invaluable.

Hardaway answered 5/9, 2008 at 13:11 Comment(1)
Of note on the switch reordering. Fall through is legal if the case contains no code. Such as, Case 1: Case 2: Console.WriteLine("Hi"); break;Mastat
B
10

Mostly, those restrictions are in place because of language designers. The underlying justification may be compatibility with languange history, ideals, or simplification of compiler design.

The compiler may (and does) choose to:

  • create a big if-else statement
  • use a MSIL switch instruction (jump table)
  • build a Generic.Dictionary<string,int32>, populate it on first use, and call Generic.Dictionary<>::TryGetValue() for a index to pass to a MSIL switch instruction (jump table)
  • use a combination of if-elses & MSIL "switch" jumps

The switch statement IS NOT a constant time branch. The compiler may find short-cuts (using hash buckets, etc), but more complicated cases will generate more complicated MSIL code with some cases branching out earlier than others.

To handle the String case, the compiler will end up (at some point) using a.Equals(b) (and possibly a.GetHashCode() ). I think it would be trival for the compiler to use any object that satisfies these constraints.

As for the need for static case expressions... some of those optimisations (hashing, caching, etc) would not be available if the case expressions weren't deterministic. But we've already seen that sometimes the compiler just picks the simplistic if-else-if-else road anyway...

Edit: lomaxx - Your understanding of the "typeof" operator is not correct. The "typeof" operator is used to obtain the System.Type object for a type (nothing to do with its supertypes or interfaces). Checking run-time compatibility of an object with a given type is the "is" operator's job. The use of "typeof" here to express an object is irrelevant.

Blunger answered 5/9, 2008 at 11:33 Comment(0)
S
10

By the way, VB, having the same underlying architecture, allows much more flexible Select Case statements (the above code would work in VB) and still produces efficient code where this is possible so the argument by techical constraint has to be considered carefully.

Stomatal answered 5/9, 2008 at 11:49 Comment(3)
The Select Case en VB is very flexible and a super time saver. I miss it a lot.Dendrology
@EduardoMolteni Switch to F# then. It makes Pascal's and VB's switches seem like idiot children in comparison.Cory
@Cory Would haver to convert a million lines of code and retrain the devs. I'll get right on thatTerranceterrane
B
9

Microsoft finally heard you!

Now with C# 7 you can:

switch(shape)
{
case Circle c:
    WriteLine($"circle with radius {c.Radius}");
    break;
case Rectangle s when (s.Length == s.Height):
    WriteLine($"{s.Length} x {s.Height} square");
    break;
case Rectangle r:
    WriteLine($"{r.Length} x {r.Height} rectangle");
    break;
default:
    WriteLine("<unknown shape>");
    break;
case null:
    throw new ArgumentNullException(nameof(shape));
}
Breger answered 17/2, 2017 at 19:7 Comment(0)
S
6

I don't see any reason why the switch statement has to succomb to static analysis only

True, it doesn't have to, and many languages do in fact use dynamic switch statements. This means however that reordering the "case" clauses can change the behaviour of the code.

There's some interesting info behind the design decisions that went into "switch" in here: Why is the C# switch statement designed to not allow fall-through, but still require a break?

Allowing dynamic case expressions can lead to monstrosities such as this PHP code:

switch (true) {
    case a == 5:
        ...
        break;
    case b == 10:
        ...
        break;
}

which frankly should just use the if-else statement.

Scribble answered 4/6, 2009 at 20:40 Comment(2)
That's what I love about PHP (now as I'm transitioning to C#), it's the freedom. With that comes the freedom to write bad code, but this is something I really miss in C#Thorny
No language can prevent you from writing horrible code. C# is needlessly restrictive in this regard.Crosstie
I
5

While on the topic, according to Jeff Atwood, the switch statement is a programming atrocity. Use them sparingly.

You can often accomplish the same task using a table. For example:

var table = new Dictionary<Type, string>()
{
   { typeof(int), "it's an int!" }
   { typeof(string), "it's a string!" }
};

Type someType = typeof(int);
Console.WriteLine(table[someType]);
Irrecoverable answered 4/9, 2008 at 23:6 Comment(6)
You're seriously quoting someone's off the cuff Twitter post with no evidence? At least link to a a reliable source.Blunger
It is from a reliable source; the Twitter post in question is from Jeff Atwood, author of the site you're looking at. :-) Jeff has a handful of blogs posts on this topic if you're curious.Irrecoverable
I believe that is total BS - whether or not Jeff Atwood wrote it. It's funny how well the switch statement lends itself to handling state machines, and other examples of changing code flow based on the value of an enum type. It's also not a coincidence that intellisense automatically populates a switch statement when you switch on a variable of an enum type.Golub
@JonathonReinhart Yeah, I think that's the point - there's better ways to handle polymorphic code than using the switch statement. He's not saying you shouldn't write state machines, just that you can do the same thing by using nice specific types. Of course, this is much easier in languages like F# which have types that can easily cover quite complex states. For your example, you could use discriminated unions where the state becomes part of the type, and replace the switch with pattern matching. Or use interfaces, for example.Cory
Old answer/question, but I would have thought that (correct me if I'm wrong) Dictionary would have been considerably slower than an optimised switch statement...?Commissary
Ah! Forget my last uneducated comment - this answers my question perfectly.Commissary
T
3

This is not a reason why, but the C# specification section 8.7.2 states the following:

The governing type of a switch statement is established by the switch expression. If the type of the switch expression is sbyte, byte, short, ushort, int, uint, long, ulong, char, string, or an enum-type, then that is the governing type of the switch statement. Otherwise, exactly one user-defined implicit conversion (§6.4) must exist from the type of the switch expression to one of the following possible governing types: sbyte, byte, short, ushort, int, uint, long, ulong, char, string. If no such implicit conversion exists, or if more than one such implicit conversion exists, a compile-time error occurs.

The C# 3.0 specification is located at: http://download.microsoft.com/download/3/8/8/388e7205-bc10-4226-b2a8-75351c669b09/CSharp%20Language%20Specification.doc

Tricia answered 4/9, 2008 at 22:54 Comment(0)
S
3

Judah's answer above gave me an idea. You can "fake" the OP's switch behavior above using a Dictionary<Type, Func<T>:

Dictionary<Type, Func<object, string,  string>> typeTable = new Dictionary<Type, Func<object, string, string>>();
typeTable.Add(typeof(int), (o, s) =>
                    {
                        return string.Format("{0}: {1}", s, o.ToString());
                    });

This allows you to associate behavior with a type in the same style as the switch statement. I believe it has the added benefit of being keyed instead of a switch-style jump table when compiled to IL.

Sherly answered 7/3, 2010 at 16:27 Comment(0)
Q
0

I suppose there is no fundamental reason why the compiler couldn't automatically translate your switch statement into:

if (t == typeof(int))
{
...
}
elseif (t == typeof(string))
{
...
}
...

But there isn't much gained by that.

A case statement on integral types allows the compiler to make a number of optimizations:

  1. There is no duplication (unless you duplicate case labels, which the compiler detects). In your example t could match multiple types due to inheritance. Should the first match be executed? All of them?

  2. The compiler can choose to implement a switch statement over an integral type by a jump table to avoid all the comparisons. If you are switching on an enumeration that has integer values 0 to 100 then it creates an array with 100 pointers in it, one for each switch statement. At runtime it simply looks up the address from the array based on the integer value being switched on. This makes for much better runtime performance than performing 100 comparisons.

Quiz answered 4/9, 2008 at 22:56 Comment(1)
An important complexity to note here is that the .NET memory model has certain strong guarantees that make your pseudocode not precisely equivalent to the (hypothetical, invalid C#) switch (t) { case typeof(int): ... } because your translation implies that variable t must be fetched from memory twice if t != typeof(int), whereas the latter would (putatively) always read the value of t exactly once. This difference can break correctness of concurrent code that relies on those excellent guarantees. For more info on this, see Joe Duffy's Concurrent Programming on WindowsMalvoisie
P
0

According to the switch statement documentation if there is an unambiguous way to implicitly convert the the object to an integral type, then it will be allowed. I think you are expecting a behavior where for each case statement it would be replaced with if (t == typeof(int)), but that would open a whole can of worms when you get to overload that operator. The behavior would change when implementation details for the switch statement changed if you wrote your == override incorrectly. By reducing the comparisons to integral types and string and those things that can be reduced to integral types (and are intended to) they avoid potential issues.

Pun answered 4/9, 2008 at 23:1 Comment(0)
G
0

I have virtually no knowledge of C#, but I suspect that either switch was simply taken as it occurs in other languages without thinking about making it more general or the developer decided that extending it was not worth it.

Strictly speaking you are absolutely right that there is no reason to put these restrictions on it. One might suspect that the reason is that for the allowed cases the implementation is very efficient (as suggested by Brian Ensink (44921)), but I doubt the implementation is very efficient (w.r.t. if-statements) if I use integers and some random cases (e.g. 345, -4574 and 1234203). And in any case, what is the harm in allowing it for everything (or at least more) and saying that it is only efficient for specific cases (such as (almost) consecutive numbers).

I can, however, imagine that one might want to exclude types because of reasons such as the one given by lomaxx (44918).

Edit: @Henk (44970): If Strings are maximally shared, strings with equal content will be pointers to the same memory location as well. Then, if you can make sure that the strings used in the cases are stored consecutively in memory, you can very efficiently implement the switch (i.e. with execution in the order of 2 compares, an addition and two jumps).

Granite answered 4/9, 2008 at 23:16 Comment(0)
F
0

wrote:

"The switch statement does a constant time branch regardless of how many cases you have."

Since the language allows the string type to be used in a switch statement I presume the compiler is unable to generate code for a constant time branch implementation for this type and needs to generate an if-then style.

@mweerden - Ah I see. Thanks.

I do not have a lot of experience in C# and .NET but it seems the language designers do not allow static access to the type system except in narrow circumstances. The typeof keyword returns an object so this is accessible at run-time only.

Fahy answered 4/9, 2008 at 23:28 Comment(0)
U
0

I think Henk nailed it with the "no sttatic access to the type system" thing

Another option is that there is no order to types where as numerics and strings can be. Thus a type switch would can't build a binary search tree, just a linear search.

Upright answered 5/9, 2008 at 3:30 Comment(0)
E
0

I agree with this comment that using a table driven approach is often better.

In C# 1.0 this was not possible because it didn't have generics and anonymous delegates. New versions of C# have the scaffolding to make this work. Having a notation for object literals is also helps.

Elbowroom answered 5/9, 2008 at 4:46 Comment(0)
S
0

C# 8 allows you to solve this problem elegantly and compactly using an switch expression:

public string GetTypeName(object obj)
{
    return obj switch
    {
        int i => "Int32",
        string s => "String",
        { } => "Unknown",
        _ => throw new ArgumentNullException(nameof(obj))
    };
}

As a result, you get:

Console.WriteLine(GetTypeName(obj: 1));           // Int32
Console.WriteLine(GetTypeName(obj: "string"));    // String
Console.WriteLine(GetTypeName(obj: 1.2));         // Unknown
Console.WriteLine(GetTypeName(obj: null));        // System.ArgumentNullException

You can read more about the new feature here.

Sucre answered 1/2, 2020 at 9:18 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.