x > -1 vs x >= 0, is there a performance difference
Asked Answered
F

11

39

I have heard a teacher drop this once, and it has been bugging me ever since. Let's say we want to check if the integer x is bigger than or equal to 0. There are two ways to check this:

if (x > -1){
    //do stuff
}

and

if (x >= 0){
    //do stuff
} 

According to this teacher, > would be slightly faster than >=. In this case it was Java, but according to him this also applied to C, C++ and other languages. Is there any truth to this statement?

Fondea answered 25/1, 2013 at 11:25 Comment(17)
And the type of x is...?Piscina
Why don't you try to profile it and share your findings?Confucius
... 'the integer x' ?Salt
An integer, uint, long: a variable that can hold a number.Fondea
It depends whether and how the operators > and >= are overloaded.Brescia
Has been asked so many times. Use the search function!!Memorandum
@Cheiron: Think about what this means if x is a uint type...Piscina
@GrantThomas: There's more than one integer type.Piscina
@JonSkeet Aren't they called something different? I mean, it wouldn't make sense to be more explicit by saying that 'the (not unsigned) integer x', IMO. Anyway, was just reiterating the specified type in case you missed it - obviously you didn't.Salt
@JonSkeet Very sharp, yes that would be a problem. Let's, in the case of uint, state that we want to check if x is bigger than or equal to one.Fondea
The expressions make no sense with unsigned types: the first is never true, and the second always true.Missioner
@Cheiron: That may well change things very significantly. I wouldn't be at all surprised if ">= 1" and ">= 0" have very different levels of support at a processor level.Piscina
@JameKanze: It depends on the particular unsigned type, though. If x was an unsigned short and an int could represent all possible values of an unsigned short then x > -1 would be always true.Schist
If there really is a performance difference, for such simple cases your compiler knows about this anyway.Unrefined
Good one! I've never thought about this until now.Lefty
possible duplicate of Is < faster than <=?Inappetence
unsigned x >= 1 is the same condition as x != 0, and is a special case in asm. (ISAs with condition flags normally have a Zero flag that indicates whether the last ALU operation's result was zero or not. So you could often avoid actually doing a compare or test if x was the result of an ALU operation. Of course, that's already true for x >= 0 and x > -1 signed conditions, where you can just check SF, the sign flag, on 2's complement machines that update a flag based on the MSB of the result.)Loxodromic
T
30

There's no difference in any real-world sense.

Let's take a look at some code generated by various compilers for various targets.

  • I'm assuming a signed int operation (which seems to be the intent of the OP)
  • I've limited my survey to C and to compilers that I have readily at hand (admittedly a pretty small sample - GCC, MSVC and IAR)
  • basic optimizations enabled (-O2 for GCC, /Ox for MSVC, -Oh for IAR)
  • using the following module:

    void my_puts(char const* s);
    
    void cmp_gt(int x) 
    {
        if (x > -1) {
            my_puts("non-negative");
        }
        else {
            my_puts("negative");
        }
    }
    
    void cmp_gte(int x) 
    {
        if (x >= 0) {
            my_puts("non-negative");
        }
        else {
            my_puts("negative");
        }
    }
    

And here's what each of them produced for the comparison operations:

MSVC 11 targeting ARM:

// if (x > -1) {...
00000        |cmp_gt| PROC
  00000 f1b0 3fff    cmp         r0,#0xFFFFFFFF
  00004 dd05         ble         |$LN2@cmp_gt|


// if (x >= 0) {...
  00024      |cmp_gte| PROC
  00024 2800         cmp         r0,#0
  00026 db05         blt         |$LN2@cmp_gte|

MSVC 11 targeting x64:

// if (x > -1) {...
cmp_gt  PROC
  00000 83 f9 ff     cmp     ecx, -1
  00003 48 8d 0d 00 00                  // speculative load of argument to my_puts()
    00 00        lea     rcx, OFFSET FLAT:$SG1359
  0000a 7f 07        jg  SHORT $LN5@cmp_gt

// if (x >= 0) {...
cmp_gte PROC
  00000 85 c9        test    ecx, ecx
  00002 48 8d 0d 00 00                  // speculative load of argument to my_puts()
    00 00        lea     rcx, OFFSET FLAT:$SG1367
  00009 79 07        jns     SHORT $LN5@cmp_gte

MSVC 11 targeting x86:

// if (x > -1) {...
_cmp_gt PROC
  00000 83 7c 24 04 ff   cmp     DWORD PTR _x$[esp-4], -1
  00005 7e 0d        jle     SHORT $LN2@cmp_gt


// if (x >= 0) {...
_cmp_gte PROC
  00000 83 7c 24 04 00   cmp     DWORD PTR _x$[esp-4], 0
  00005 7c 0d        jl  SHORT $LN2@cmp_gte

GCC 4.6.1 targeting x64

// if (x > -1) {...
cmp_gt:
    .seh_endprologue
    test    ecx, ecx
    js  .L2

// if (x >= 0) {...
cmp_gte:
    .seh_endprologue
    test    ecx, ecx
    js  .L5

GCC 4.6.1 targeting x86:

// if (x > -1) {...
_cmp_gt:
    mov eax, DWORD PTR [esp+4]
    test    eax, eax
    js  L2

// if (x >= 0) {...
_cmp_gte:
    mov edx, DWORD PTR [esp+4]
    test    edx, edx
    js  L5

GCC 4.4.1 targeting ARM:

// if (x > -1) {...
cmp_gt:
    .fnstart
.LFB0:
    cmp r0, #0
    blt .L8

// if (x >= 0) {...
cmp_gte:
    .fnstart
.LFB1:
    cmp r0, #0
    blt .L2

IAR 5.20 targeting an ARM Cortex-M3:

// if (x > -1) {...
cmp_gt:
80B5 PUSH     {R7,LR}
.... LDR.N    R1,??DataTable1  ;; `?<Constant "non-negative">`
0028 CMP      R0,#+0
01D4 BMI.N    ??cmp_gt_0

// if (x >= 0) {...
cmp_gte:
 80B5 PUSH     {R7,LR}
 .... LDR.N    R1,??DataTable1  ;; `?<Constant "non-negative">`
 0028 CMP      R0,#+0
 01D4 BMI.N    ??cmp_gte_0

If you're still with me, here are the differences of any note between evaluating (x > -1) and (x >= 0) that show up:

  • MSVC targeting ARM uses cmp r0,#0xFFFFFFFF for (x > -1) vs cmp r0,#0 for (x >= 0). The first instruction's opcode is two bytes longer. I suppose that may introduce some additional time, so we'll call this an advantage for (x >= 0)
  • MSVC targeting x64 uses cmp ecx, -1 for (x > -1) vs test ecx, ecx for (x >= 0). The first instruction's opcode is one byte longer. I suppose that may introduce some additional time, so we'll call this an advantage for (x >= 0)

Note that GCC and IAR generated identical machine code for the two kinds of comparison (with the possible exception of which register was used). So according to this survey, it appears that (x >= 0) has an ever-so-slight chance of being 'faster'. But whatever advantage the minimally shorter opcode byte encoding might have (and I stress might have) will certainly be completely overshadowed by other factors.

I'd be surprised if you found anything different for the jitted output of Java or C#. I doubt you'd find any difference of note even for a very small target like an 8 bit AVR.

In short, don't worry about this micro-optimization. I think my write-up here has already taken more time than will be spent by any difference in the performance of these expressions accumulated across all the CPUs executing them in my lifetime. If you have the capability to measure the difference in performance, please apply your efforts to something more important, like studying the behavior of sub-atomic particles or something.

Turpeth answered 25/1, 2013 at 21:3 Comment(7)
And what if just before the comparison you need to calculate x? For example, the VERY common --x?Forkey
I wouldn't expect that to have any significant impact on the ability of the compiler to generate equivalent code for the > -1 or >= 0 operations.Turpeth
These code snippets don't really illustrate the fact that the 0-comparison comes for free (on ARM at least) if x has just been calculated immediately prior, whereas the -1 comparison would require an explicit extra instruction.Domination
@GrahamBorland: Note that most of the ARM examples here treated x > -1 exactly the same as x >= 0 (ie., they noticed that the expressions are equivalent). I would expect them to do the same if x were calculated - at the moment I don't have a system to test that assumption on. On the other hand, the MSVC ARM compiler treats them slightly differently, and I'm able to test the MS ARM compiler. It still performs an explicit comparison for both the -1 and the 0 tests if x is calculated (there is still a cmp r3,#0 or cmp r3,#0xffffffff after the calculation is made).Turpeth
@MichaelBurr it actually doesn't surprise me at all that the MS compiler fails to spot this obvious optimization. :)Domination
@GrahamBorland: however, once it's fixed to optimize one, it should optimize both. The point being, there shouldn't be a difference worth worrying about between (x > -1) and (x >= 0). You know - even if the optimization fixed only one of the two comparisons, the added cmp would be pretty close to the last thing any program needed special attention to remove.Turpeth
Let's be honest, if there were REALLY a difference, the compiler would ignore what you write and change it to the quicker one. Kind of like how x % 2 becomes x & 1Misconstruction
D
31

It is very much dependent on the underlying architecture, but any difference will be minuscule.

If anything, I'd expect (x >= 0) to be slightly faster, as comparison with 0 comes for free on some instruction sets (such as ARM).

Of course, any sensible compiler will choose the best implementation regardless of which variant is in your source.

Domination answered 25/1, 2013 at 11:27 Comment(5)
+1. The fact that 0 is involved is very likely to be as important as (or more important than) the difference between the two comparison ops themselves (if any).Jonniejonny
@Jonniejonny That's possibly true on some architectures (in which case, I would expect the compiler to make the change itself). On others (such as Intel), the two are exactly identical in time.Missioner
Edited to mention that compilers will choose the best anyway.Domination
Agreed; programmers shouldn't need to worry about this level of detail unless they're programming the architectures.Windsucking
I'd like to add the reason why >= 0 would be faster than > -1. This is due to assembly always comparing to 0. If the second value is not 0, the first value would be added to (or subtracted by) the second value; after that the possible comparisons would be e, lt, le, gt, ge, ne (equals, less than, less than or equals, greater than, greater than or equals, not equals). Of course the added addition/subtraction would require additional CPU cycles.Henze
G
20

Your teacher has been reading some really old books. It used to be the case with some architectures lacking the greater than or equal instruction that evaluating > required fewer machine cycles than >=, but these platforms are rare these days. I suggest going for readability, and using >= 0.

Greenwood answered 25/1, 2013 at 11:29 Comment(3)
But let's say we have a non-PC architecture such as Arduino. Would it make a difference there?Fondea
@Cheiron: And the compiler is a million years old and can not spot the optimization.Farnese
@Fondea Even ATMEL's 8-bit AVRs have the BRGE (branch if greater than or equal) and BRSH (branch if same or higher) instructions, so you'd see no difference.Greenwood
W
14

A bigger concern here is premature optimisation. Many consider writing readable code more important than writing efficient code [1, 2]. I would apply these optimisations as a last stage in a low level library once the design has been proven to work.

You shouldn't be constantly considering making minuscule optimisations in your code at the cost of readability, since it'll make reading and maintaining the code harder. If these optimisations need to take place, abstract them into lower level functions so you're still left with code that's easier to read for humans.

As a crazy example, compare someone who writes their programs in assembly to someone who's willing to forgo that extra efficiency and use Java for its benefits in design, ease of use and maintainability.

As a side note, if you're using C, perhaps writing a macro which uses the slightly more efficient code is a more feasible solution, since it will achieve efficiency, readability and maintainability more than scattered operations.

And of course the tradeoff between efficiency and readability depends on your application. If that loop is running 10000 times a second then it's possibly a bottleneck and you may want to invest time in optimising it, but if it's a single statement that's called occasionally it's probably not worth it for the minute gain.

Windsucking answered 25/1, 2013 at 11:36 Comment(0)
V
9

Yes, there is a difference; you should look at the bytecode.

for

if (x >= 0) {}

the bytecode is

ILOAD 1
IFLT L1

for

if (x > -1) {}

the bytecode is

ILOAD 1
ICONST_M1
IF_ICMPLE L3

Version 1 is faster because it uses a special compare-with-zero operation:

iflt : jump if less than zero 

But it is possible to see the difference only when running the JVM in interpret-only mode (java -Xint ...), e.g. with this test:

int n = 0;       
for (;;) {
    long t0 = System.currentTimeMillis();
    int j = 0;
    for (int i = 100000000; i >= n; i--) {
        j++;
    }
    System.out.println(System.currentTimeMillis() - t0);
}

shows 690 ms for n = 0 and 760 ms for n = 1. (I used 1 instead of -1 because it's easier to demonstrate, the idea stays the same)

Vat answered 25/1, 2013 at 11:35 Comment(5)
Did you turn on optimizations? Will the JIT not optimize it away?Farnese
Wow, the teacher was wrong on the "which one is faster", too :)Greenwood
for(int x = 10000000; x >= 0; x--) { } <-- this test won't work. Random noise will be larger than the difference.Favouritism
try my test with java -Xint Test, it works and shows some differenceVat
Please, repeat the test hard coding the 0 and the 1, not through the variable n.Forkey
J
4

In fact I believe the second version should be slightly faster, as it requires a single-bit check (assuming you compare against zero, as you show above). However, such optimizations never really show, as most compilers will optimize such comparisons away.

Jews answered 25/1, 2013 at 11:27 Comment(0)
F
3

">=" is a single operation, just like ">". It is not two separate operations with an OR.

But >= 0 is probably faster, because the computer needs to check only one bit (the negative sign).

Favouritism answered 25/1, 2013 at 11:35 Comment(2)
We would also have to see how x gets its value (data flow analysis). The compiler might already know the result without checking anything.Sherfield
If your compiler is dumb and fails to optimize x > -1 into something the machine can do efficiently, yes >= 0 can be faster on some ISAs (like MIPS where there's a bgez $reg, target instruction that as you say branches on the sign bit of a register). Being faster allows clever hardware design for MIPS internals, but doesn't make comparison itself faster for software. All simple instructions have 1 cycle latency, whether it's or (independent bits) or add.Loxodromic
F
1

According to this teacher, > would be slightly faster than >=. In this case it was Java, but according to him this also applied to C, C++ and other languages. Is there any truth to this statement?

Your teacher is fundamentally wrong. Not only because chances are that comparing with 0 can be slightly faster, but because this sort of local optimization is done well by your compiler / interpreter, and you can mess everything up trying to help. Definitely not a good thing to teach.

You can read: this or this

Forkey answered 25/1, 2013 at 12:12 Comment(0)
F
1

Sorry to barge in on this conversation about performance.

Before I digress, let's note that the JVM has special instructions for handling not only zero, but also the constants one through three. With this said, it's likely that the architecture's ability to handle zero specially is long lost behind not only compiler optimization, but also bytecode-to-machine-code translation and the like.

I remember from my x86 assembler language days that there were instructions in the set for both greater than (ja) and greater than or equal to (jae). You would do one of these:

; x >= 0
mov ax, [x]
mov bx, 0
cmp ax, bx
jae above

; x > -1
mov ax, [x]
mov bx, -1
cmp ax, bx
ja  above

These alternatives take the same amount of time, because the instructions are identical or similar, and they consume a predictable number of clock cycles. See, for example, this. ja and jae may indeed check a different number of arithmetic flags, but that check is dominated by the need for the instruction to take a predictable time. This in turn is needed to keep the CPU architecture manageable.

But I did come here to digress.

The answers before me tend to be pertinent, and also indicative that you're gonna be in the same ballpark insofar as performance is concerned, regardless of which approach you choose.

Which leaves you with choosing based on other criteria. And this is where I wanted to make a note. When testing indices, prefer the tight bound style check, chiefly x >= lowerBound, to the x > lowerBound - 1. The argument is bound to be contrived, but it boils down to readability, as here all else truly is equal.

Since conceptually you're testing against a lower bound, x >= lowerBound is the canonical test that elicits the most adapted cognition from readers of your code. x + 10 > lowerBound + 9, x - lowerBound >= 0, and x > -1 are all roundabout ways to test against a lower bound.

Again, sorry to barge in, but I felt like this was important beyond the academics of things. I always think in these terms and let the compiler worry about the minute optimizations that it thinks it can get out of fiddling with the constants and the strictness of the operators.

Five answered 25/1, 2013 at 19:49 Comment(1)
ja and jae are unsigned above / above-or-equal. All numbers are unsigned >= 0, and all numbers are not > -1U. You want jg and jge. Also note that x86 like most ISAs allows compare with an immediate: cmp ax, 0. Or as an optimization, test ax, ax sets FLAGS identically to a compare against zero, but is shorter. Test whether a register is zero with CMP reg,0 vs OR reg,reg?Loxodromic
S
0

First of all, it highly depends on the hardware platform. For modern PCs and ARM SoCs the difference relies mostly on compiler optimisations. But for CPUs without an FPU, signed math would be a disaster.

For example, simple 8-bit CPUs such as the Intel 8008, 8048, 8051, Zilog Z80, Motorola 6800, or even modern RISC PIC or Atmel microcontrollers do all math via an ALU with 8-bit registers and have basically only a carry flag and a Z (zero value indicator) flag. All serious math is done via libraries, and the expression

  BYTE x;
  if (x >= 0) 

would definitely win, using the JZ or JNZ asm instructions without very costly library calls.

Speechless answered 25/1, 2013 at 23:13 Comment(0)
N
0

It depends on the underlying architecture. The older ARMv6 with Jazelle is able to execute Java bytecode directly. Otherwise, the bytecode is translated into machine code. Sometimes the target platform needs to consume additional machine cycles to create the operand -1 or 0, while another may load them as the comparison instruction is decoded. Others, such as OpenRISC, define a register that constantly holds 0, against which comparisons can be made. Rarely, certain platforms will need to load an operand from slower memory. In summary, the speed of the operators is not specified by Java the programming language, and generalizing from a specific case defeats the purpose of using a cross-platform programming language.

Normally answered 23/12, 2020 at 0:1 Comment(6)
All non-toy architectures have a way to construct small numbers in registers using just one instruction that doesn't load from memory, usually something like mov reg, 0 with the number as an immediate. Usually this is sign-extended so it works for -1 as well. Or even using it as an immediate operand for a cmp instruction, on machines with flags. Like ARM cmp r1, #-1 / bgt target. Also, even on a toy machine with no mov-immediate, you can subtract a register from itself to zero it.Loxodromic
Also, any decent compiler knows these tricks and will turn x > -1 into x>=0 if that's more efficient. Your answer assumes that the Java source expression will be transliterated directly into machine code without trying to find a more efficient way to do the same thing on the target machine. But anyway, all real-world machines can efficiently compare a value against 0.Loxodromic
Well, yeah, that's sort of true, but I mean it depends on the underlying architecture. If the platform does not execute Java bytecode directly, it may get translated into machine code anyway. Also, subtracting a register from itself is considered to make comparison with zero slower than if the register directly holds zero or if the machine can compare with zero directly. Again, it all depends on the platform, and the language does not guarantee which operator is faster or slowerNormally
It could in theory depend on the ISA, but only if the compiler is dumb and doesn't know this peep-hole optimization. (Plausible for a JIT but I'd want to see an example). And even so, it's not for the reasons you state in your answer: loading a 0 or -1 from data memory is not plausible for a real-world ISA that anyone cares about. (Only for toys like MARIE or LCM, that aren't usable as compiler targets anyway.) If you want to talk about hardware that executes Java bytecode directly, put that in your answer as a plausible real-world special case.Loxodromic
If you want to make a decent case, you could point out that MIPS has special instructions to compare-and-branch against zero, like bgez, but to literally implement x > -1 without doing the simple optimization would require slti $t0, $a0, -1 / bne $t0, $zero, target. Or RISC-V is similar, you'd need a -1 in a register but the zero register is already there. However, most machines with FLAGS / status register of some sort (ARM, PowerPC, x86) need to compare before branching, and compare against immediate 0 or -1 is the same cost on RISCs so the zero reg doesn't help.Loxodromic
True that OpenRISC has a zero register, but sometimes the compiler may want to preserve the flag for a loop optimization that sets it and uses it in the next iteration. The compare flag from comparing against r0 can be used instead.Normally

© 2022 - 2024 — McMap. All rights reserved.