Why can't your switch statement data type be long, Java?

Asked 20/4, 2010 at 15:7 Answered 25/3, 2024 at 16:17

Solved java switch-statement long-integer language-design

100

Here's an excerpt from Sun's Java tutorials:

A switch works with the byte, short, char, and int primitive data types. It also works with enumerated types (discussed in Classes and Inheritance) and a few special classes that "wrap" certain primitive types: Character, Byte, Short, and Integer (discussed in Simple Data Objects).

There must be a good reason why the long primitive data type is not allowed. Anyone know what it is?

Heidt answered 20/4, 2010 at 15:7 Comment(0)

I think to some extent it was probably an arbitrary decision based on typical use of switch.

A switch can essentially be implemented in two ways (or in principle, a combination): for a small number of cases, or ones whose values are widely dispersed, a switch essentially becomes the equivalent of a series of ifs on a temporary variable (the value being switched on must only be evaluated once). For a moderate number of cases that are more or less consecutive in value, a switch table is used (the TABLESWITCH instruction in Java), whereby the location to jump to is effectively looked up in a table.
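To make the two strategies concrete, here is a minimal sketch. The class and method names are made up for illustration. With a typical javac, running `javap -c` on the compiled class should show a `tableswitch` for the first method and a `lookupswitch` (a sorted key table that is searched, much like a series of comparisons) for the second. The exact choice is up to the compiler, so treat that as an expectation rather than a guarantee. Both instructions take an int operand.

```java
// Hypothetical demo class; the names are illustrative only.
class SwitchShapes {
    // Consecutive labels: javac typically emits a tableswitch (jump table).
    static String dense(int day) {
        switch (day) {
            case 1: return "Mon";
            case 2: return "Tue";
            case 3: return "Wed";
            case 4: return "Thu";
            default: return "?";
        }
    }

    // Widely dispersed labels: javac typically emits a lookupswitch instead,
    // because a jump table covering 1..1_000_000 would be mostly empty.
    static String sparse(int code) {
        switch (code) {
            case 1: return "one";
            case 1_000: return "thousand";
            case 1_000_000: return "million";
            default: return "?";
        }
    }
}
```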

Either of these methods could in principle use a long value rather than an integer. But I think it was probably just a practical decision to balance up the complexity of the instruction set and compiler with actual need: the cases where you really need to switch over a long are rare enough that it's acceptable to have to re-write as a series of IF statements, or work round in some other way (if the long values in question are close together, you can in your Java code switch over the int result of subtracting the lowest value).
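A rough sketch of the subtraction workaround mentioned above, assuming the long values form a small consecutive block. `BASE` and the case names are hypothetical. The range check before the cast matters: without it, an unrelated long could alias one of the cases after narrowing to int.

```java
// Hypothetical example: three consecutive long IDs starting at BASE.
class LongRangeSwitch {
    static final long BASE = 5_000_000_000L;

    static String describe(long id) {
        long offset = id - BASE;
        // Guard before narrowing so out-of-range longs cannot alias a case.
        if (offset < 0 || offset > 2) {
            return "unknown";
        }
        switch ((int) offset) {
            case 0: return "created";
            case 1: return "updated";
            case 2: return "deleted";
            default: return "unknown";
        }
    }
}
```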

Declaim answered 20/4, 2010 at 15:21 Comment(7)

I have to agree with the "rarity" argument as I've been developing in Java for a while and I never came across a situation where I needed/tried to switch on a long until now. – Heidt 22/4, 2010 at 12:56

Dimitris, my reasoning wasn't based on a misunderstanding of thread-safety, just that I don't think the thread-safety argument you put forward holds. – Declaim 23/4, 2010 at 14:30

I might have misread that. Still though, the "whatever the width" you mention is not too precise. If the width is a word, then you know that at least someone actually wrote the value that was read, so it's not going to be an address out of thin air, but the address someone intended to be. Both are problems, but in longs, the problem is certainly bigger, more ways to go wrong. That's all I'm trying to say. – Jecoa 23/4, 2010 at 15:49

Most stupid design decision ever. I have a long which is a bunch of flags, and I have to carry with if...else if...else... Totally ridiculous. – Telly 19/7, 2012 at 11:46

I agree w/ @m0skit0. In my case, someone else wrote an API that is using longs as constants, because that's what they store in the DB. Now I can't use a switch/case because of someone else's poor decision. – Affaire 16/3, 2016 at 19:23

@Affaire If you really want to use a switch statement, there are various ways round this, e.g.: use a hash function to convert the longs to ints, build up a Map<Long, Integer> to map the longs from the DB to int constants, use Long.toString() and then switch over strings (which is supported as of JDK7)... I really wonder if not being able to switch on longs is such a stop-the-world scenario! – Declaim 19/3, 2016 at 12:47
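For illustration, two of the workarounds from the comment above might look roughly like this. The constants are hypothetical stand-ins for values coming from a DB, and `Map.of` assumes Java 9 or later.

```java
import java.util.Map;

// Hypothetical constants standing in for long values stored in a database.
class LongSwitchWorkarounds {
    static final long DB_ACTIVE = 10_000_000_000L;
    static final long DB_CLOSED = 20_000_000_000L;

    // Map each long to a small int code; 0 is reserved for "not found".
    static final Map<Long, Integer> CODES = Map.of(DB_ACTIVE, 1, DB_CLOSED, 2);

    static String viaMap(long value) {
        switch (CODES.getOrDefault(value, 0)) {
            case 1: return "active";
            case 2: return "closed";
            default: return "unknown";
        }
    }

    // Switching on the decimal string form (string switch needs JDK 7+).
    static String viaString(long value) {
        switch (Long.toString(value)) {
            case "10000000000": return "active";
            case "20000000000": return "closed";
            default: return "unknown";
        }
    }
}
```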

I am not saying it's "end of the world". And, yes, there are ways around it, but they all seem like hacks. My point is that, as developers, we should always try to think past how we think people will use the stuff we build. In this case, someone made a decision to not allow switch/case on long based on the initial thought that "Who would honestly have 2^64 cases". Maybe that's an oversimplification. Maybe there is some other reason why longs can't be switched that we were just not privy to. To me, it just seems strange to not support it. – Affaire 21/3, 2016 at 1:10

Because they didn't implement the necessary instructions in the bytecode and you really don't want to write that many cases, no matter how "production ready" your code is...

[EDIT: Extracted from comments on this answer, with some additions on background]

To be exact, 2³² is a lot of cases and any program with a method long enough to hold more than that is going to be utterly horrendous! In any language. (The longest function I know of in any code in any language is a little over 6k SLOC – yes, it's a big switch – and it's really unmanageable.) If you're really stuck with having a long where you should have only an int or less, then you've got two real alternatives.

Use some variant on the theme of hash functions to compress the long into an int. The simplest one, only for use when you've got the type wrong, is to just cast! More useful would be to do this:
```
// The mask needs the L suffix: the int literal 0xFFFFFFFF is -1 and widens to an all-ones long.
(int) ((x & 0xFFFFFFFFL) ^ (x >>> 32))
```
before switching on the result. You'll have to work out how to transform the cases that you're testing against too. But really, that's still horrible since it doesn't address the real problem of lots of cases.
A much better solution if you're working with very large numbers of cases is to change your design to using a Map<Long,Runnable> or something similar so that you're looking up how to dispatch a particular value. This allows you to separate the cases into multiple files, which is much easier to manage when the case-count gets large, though it does get more complex to organize the registration of the host of implementation classes involved (annotations might help by allowing you to build the registration code automatically).

FWIW, I did this many years ago (we switched to the newly-released J2SE 1.2 part way through the project) when building a custom bytecode engine for simulating massively parallel hardware (no, reusing the JVM would not have been suitable due to the radically different value and execution models involved) and it enormously simplified the code relative to the big switch that the C version of the code was using.

To reiterate the take-home message, wanting to switch on a long is an indication that either you've got the types wrong in your program or that you're building a system with that much variation involved that you should be using classes. Time for a rethink in either case.
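A minimal sketch of the Map<Long,Runnable> dispatch idea, with hypothetical names (`LongDispatcher`, `register`, `dispatch`). This is not the code from the project mentioned above, just one way the pattern could be laid out:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical dispatcher: looks up a handler by long key instead of switching.
class LongDispatcher {
    private final Map<Long, Runnable> handlers = new HashMap<>();
    private final Runnable fallback;

    LongDispatcher(Runnable fallback) {
        this.fallback = fallback;
    }

    // Handlers can be registered from anywhere, so the "cases" may live in separate files.
    void register(long key, Runnable handler) {
        handlers.put(key, handler);
    }

    // Plays the role of the switch: run the matching handler, or the default one.
    void dispatch(long key) {
        handlers.getOrDefault(key, fallback).run();
    }
}
```

Usage would be along the lines of `dispatcher.register(SOME_ID, () -> handleSomeEvent());` followed by `dispatcher.dispatch(id);`, where `SOME_ID` and `handleSomeEvent` are placeholders.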

Hartung answered 20/4, 2010 at 15:8 Comment(13)

A switch doesn't actually have to be implemented with a TABLESWITCH though, and it's as much about the range of the labels as the number. – Declaim 20/4, 2010 at 15:15

But is the range being switched over really more than 32 bits wide? I've never heard of code that needed that many switch arms. Without that, you can compact the range (e.g., by using some variation on the theme of hash functions) to make something that will work. Or use a Map<Long,Runnable> to solve the problem in a wholly different way. :-) – Hartung 20/4, 2010 at 15:25

Isn't this answer just kind of passing the buck? The question just becomes "why didn't they implement the necessary instructions?" – Gillett 20/4, 2010 at 16:49

@Lord Torgamus: Presumably because it's silly. Think about it for a moment: why would anyone, anyone, have code with more than 2³² arms in a switch? Wanting to choose over a finite set of that many elements simply points to a mistake in the fundamental design of the program. That people are asking for it indicates merely that they've got their design wrong. – Hartung 20/4, 2010 at 18:55

BTW, if anyone wants to argue with me further on this, please start by giving a use-case for switching on a long. Otherwise we'll be stuck arguing with hypotheticals forever... – Hartung 20/4, 2010 at 19:4

@Donal, I'm not saying you're wrong; in fact, I'll go ahead and state right now that you are right. My point was that your comment there would have made a better answer than the text you actually submitted as an answer. – Gillett 20/4, 2010 at 19:19

Here's the use case that brought this to my attention. BlackBerry global events are indexed with GUIDs, which are of long type in the BlackBerry platform. I had a handful of events I wanted to create a switch for. I never planned on having that many cases, I just prefer a switch format over "if...else if" statements. – Heidt 22/4, 2010 at 12:48

@Fostah: I wonder why the BB platform uses longs there. Do they really have that many events that they need GUIDs? (Probably not. More likely some developer at BB was a moron.) Anyway, doing a quick-and-dirty hash or a cast to int will probably work. Evil hack, but cheap. You'll have to check that the events going around don't hit your code accidentally, but with GUIDs/UUIDs that's fairly unlikely. (Or use a Map populated with anonymous inner class instances.) – Hartung 22/4, 2010 at 13:2

@Donal, I've just hit this issue as well. I'm using the Trove libraries and decided to use a hash map on <long, Object> (not Long) for performance reasons, and since I use the map all over the app there are places where long rather than int is needed. I also use this for streaming data, so when I get my map back I do a switch on the keys and, hey presto, I can't do it. It didn't take me as long to type (int) a few times as it has to type this, but it was a WTF moment, and I expected a better reason as to why it isn't implemented. – Toscano 4/2, 2011 at 18:12

Another possibility is to write your own hash function (e.g., by wrapping the GUID inside its own class) that converts the sparse value space into something more compact (and which is an int). So long as you just use it as a precondition for equality (i.e., value-equal implies hash-equal, not the other way round) then you won't have any real hazards. – Hartung 12/9, 2011 at 12:34
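A sketch of the "hash as a precondition for equality" idea: switch on an int fold of the long, then confirm the full value inside each case so that a colliding long falls through to "unknown". The GUID values and names are invented. The fold is the same computation as `Long.hashCode(long)`, written inline for the labels because case labels must be compile-time constants.

```java
// Hypothetical event GUIDs; only the technique matters here.
class GuidSwitch {
    static final long EVENT_START = 0x1122334455667788L;
    static final long EVENT_STOP  = 0x0102030405060708L;

    // XOR-fold the high 32 bits into the low 32 bits (constant expressions).
    static final int H_START = (int) (EVENT_START ^ (EVENT_START >>> 32));
    static final int H_STOP  = (int) (EVENT_STOP ^ (EVENT_STOP >>> 32));

    static String name(long guid) {
        switch ((int) (guid ^ (guid >>> 32))) {
            // Hash-equal does not imply value-equal, so re-check the full long.
            case H_START: return guid == EVENT_START ? "start" : "unknown";
            case H_STOP:  return guid == EVENT_STOP ? "stop" : "unknown";
            default:      return "unknown";
        }
    }
}
```

If two of the constants ever folded to the same int, the compiler would reject the duplicate case label, so a collision among the known values shows up at build time rather than at run time.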

@DonalFellows, Another case is in Android with ListView's. While listViews can have long number of rows, I specifically only need 4. This is a case of me having a long handed to me that indicates which row is being acted on, and I need to do different things based on which row it is. I guess I could cast to int, but it would have made my life easier if I could have just switched on the variable. As it is, I am just using a string of if and else if instead. – Citizenry 2/12, 2011 at 13:36

Hey, and what about a long implementing flags? This is a stupid Java design decision. – Telly 19/7, 2012 at 11:48

"To reiterate the take-home message, wanting to switch on a long is an indication that either you've got the types wrong in your program" - No it is not! Having a long doesn't mean you are going to check all possibilities, just like having an int or a String doesn't mean that either. It means the values you have, which could be a handful, have a large range. You could be checking a few simple cases and falling into default for the rest. Having to do shifts and casts means that you will risk losing data. Bottom line, it's a bad Java design decision, not a user issue. – Bowrah 16/4, 2019 at 11:20

Because the lookup table index must be 32 bits.

Headache answered 20/4, 2010 at 15:10 Comment(4)

But then again a switch need not be implemented with a lookup table necessarily. – Accounting 20/4, 2010 at 15:14

If that was the case, they could never implement switch for Strings (as they currently plan). – Jecoa 20/4, 2010 at 16:17

@DimitrisAndreou yes we can now switch in Strings :D (for years now :P ) – Ier 3/2, 2018 at 11:43

@DimitrisAndreou yes, they can, with hash codes. – Yancey 26/2, 2021 at 12:7

I just happened to come across this 12-year-old question, and I can offer one of the best solutions to this problem: use the latest JDK, because long and Long are now supported in switch-case statements. :)

Dropsical answered 6/7, 2022 at 3:25 Comment(2)

since which JDK version? 11 (Debian stable, OpenJDK LTS) certainly doesn’t support it – Behah 26/12, 2022 at 7:42

neither does JDK v17 (LTS Oct 2021) – Olives 21/2, 2023 at 10:0

Documentation of JEP 441 states the following under Future Work headline:

At the moment, pattern switch does not support the primitive types boolean, long, float, and double. Allowing these primitive types would also mean allowing them in instanceof expressions, and aligning primitive type patterns with reference type patterns, which would require considerable additional work. This is left for a possible future JEP.

Rajah answered 25/3, 2024 at 16:17 Comment(0)

-12

A long, on 32-bit architectures, is represented by two words. Now, imagine what could happen if, due to insufficient synchronization, the execution of the switch statement observes a long with its high 32 bits from one write and its low 32 bits from another! It could try to go to... who knows where! Basically somewhere at random. Even if both writes represented valid cases for the switch statement, their funny combination would probably lead neither to the first nor to the second, or, far worse, it could lead to another valid but unrelated case!

At least with an int (or lesser types), no matter how badly you mess up, the switch statement will at least read a value that someone actually wrote, instead of a value "out of thin air".

Of course, I don't know the actual reason (it's been more than 15 years, I haven't been paying attention that long!), but if you realize how unsafe and unpredictable such a construct could be, you'll agree that this is a very good reason not to ever have a switch on longs (and as long (pun intended) as there are 32-bit machines, this reason will remain valid).

Jecoa answered 20/4, 2010 at 16:10 Comment(5)

I don't think this follows. The value being switched on needs to be calculated and stored in a register or on the stack. If that value is calculated based on data accessed by multiple threads, this calculation needs to be made thread-safe, whatever the width of the result. But then, once that result is in a register or on the stack, it's only accessed by the switching thread and is safe. – Declaim 20/4, 2010 at 16:22

Neil, your argument is quite confused: "But then, once that result is in a register or on the stack, it's only accessed by the switching thread and is safe". Sure, using that value is thread-safe! But my point is that that value can _already_ be wrong due to synchronization bugs in user code. Using thread-safely a wrong value is less than useful :) This issue can never be eliminated: buggy concurrent code could already have produced the "out of thin air"/wrong long value, which can be subsequently used in the switch, making the switch go to a case address nobody ever specified. – Jecoa 21/4, 2010 at 17:36

Dimitris, maybe there's something in your argument I'm not understanding. The value switched on could indeed be wrong due to synchronization bugs in user code. But I don't believe there's anything inherent about the switch statement that makes this more likely than in other cases. And thinking it through as best I can, I don't believe that the non-atomicity of hi/low words of long reads/writes to memory is in fact an issue. (Thinking about things another way: you could decide that an if comparison on a long was not allowed based on the same argument.) – Declaim 23/4, 2010 at 14:23

While the potential problems with long being represented as two words with no guaranteed atomic writes is a general issue, agreed, in the case of switch it would be an even more pronounced danger. It's like sending an envelope with a message where half the address is from one person, half is from another - the final address could be valid and correspond to a totally random chap who would then receive the envelope and act accordingly. It's one thing reading garbage and producing garbage (like a wrong boolean), but reading garbage and doing random jumps does sound a tad more dangerous to me. – Jecoa 23/4, 2010 at 15:45

I know this is old and commenting on it is kind of moot but I want to underline that your argument applies to if as well and the result would be just as bad: wrong result ~> wrong branch taken. Creating a long if-else-if chain instead of a switch would actually lead to exactly the same result. – Mister 24/1, 2017 at 13:44
