Are the results of bitwise operations on signed integers defined?
Asked Answered
C

4

72

I know that the behavior of >> on signed integer can be implementation dependent (specifically, when the left operand is negative).

What about the others: ~, >>, &, ^, |? When their operands are signed integers of built-in type (short, int, long, long long), are the results guaranteed to be the same (in terms of bit content) as if their type is unsigned?

Carner answered 25/7, 2012 at 7:5 Comment(4)
There is nothing confusing about ~, & , ^, | since they operate on bit level. Signed/unsigned doesn't matter. >> is usually implemented arithmetic right shift (correctly / 2 number) if signed, and logical right shift (fill with zero) if unsigned. << also doesn't have any confusion, since we will just remove the bit on the left (and fill with zero on the right).Nominee
en.wikipedia.org/wiki/Signed_number_representationsWaterside
Did you mean << rather than >> in your list of "other" operators?Tillich
removed C++ tag - C and C++ are different languages and questions shouldn't be dual tagged unless they're about interoperability or something; the existing answers all address C only.Pitch
B
61

For negative operands, << has undefined behavior and the result of >> is implementation-defined (usually as "arithmetic" right shift). << and >> are conceptually not bitwise operators. They're arithmetic operators equivalent to multiplication or division by the appropriate power of two for the operands on which they're well-defined.

As for the genuine bitwise operators ^, ~, |, and &, they operate on the bit representation of the value in the (possibly promoted) type of the operand. Their results are well-defined for each possible choice of signed representation (twos complement, ones complement, or sign-magnitude) but in the latter two cases it's possible that the result will be a trap representation if the implementation treats the "negative zero" representation as a trap. Personally, I almost always use unsigned expressions with bitwise operators so that the result is 100% well-defined in terms of values rather than representations.

Finally, note that this answer as written may only apply to C. C and C++ are very different languages and while I don't know C++ well, I understand it may differ in some of these areas from C...

Bootblack answered 25/7, 2012 at 7:33 Comment(10)
In the two's complement case ~INT_MAX is allowed to be a trap representation, too.Maegan
Oh yes, I always forget that... Thankfully non-(full-range-twos-complement) implementations don't actually exist.Bootblack
By the way, note that per 6.2.6.1, negative zero, if it exists, is required to behave identically to ordinary zero as an operand to bitwise operations: Where an operator is applied to a value that has more than one object representation, which object representation is used shall not affect the value of the result.Bootblack
That's an interesting point - it seems to imply that ~~INT_MAX == INT_MIN on a sign-magnitude implementation that supports negative zero.Maegan
Yes, it does. :-) It also makes it that much harder to create a conforming sign/magnitude implementation, which can't be a bad thing.. :-)Bootblack
@Maegan @R How can ~INT_MAX on two's complement be a trap? It becomes INT_MIN doesn't it, since min must be -(2^n)?Brummett
@this: C allows a 2s-complement variant with INT_MIN==-INT_MAX. :-(Bootblack
@R.. Yes, but can two's complement allow that? C also defines two's complement as: the sign bit has the value −(2 M ) (two’s complement); I think allowing that min==-max equality is only there to permit one's complement and sign/magnitude.Brummett
@this: See 6.2.6.2 Integer types. At the end of paragraph 2: Which of these applies is implementation-defined, as is whether the value with sign bit 1 and all value bits zero (for the first two), or with sign bit and all value bits 1 (for ones' complement), is a trap representation or a normal value.Bootblack
@R..GitHubSTOPHELPINGICE: For many purposes, having an implementation specify that integer arithmetic operations that overflow would yield ~INT_MAX, and integer arithmetic operations on ~INT_MAX would could reduce the need for error-checking code on intermediate computations. I don't know of any hardware platforms where compilers could implement such semantics efficiently, but they could be a useful option if such support were available.Gailgaile
M
14
  • A left shift << of a negative value has undefined behaviour;
  • A right shift >> of a negative value gives an implementation-defined result;
  • The result of the &, | and ^ operators is defined in terms of the bitwise representation of the values. Three possibilities are allowed for the representation of negative numbers in C: two's complement, ones' complement and sign-magnitude. The method used by the implementation will determine the numerical result when these operators are used on negative values.

Note that the value with sign bit 1 and all value bits zero (for two's complement and sign-magnitude), or with sign bit and all value bits 1 (for ones’ complement) is explicitly allowed to be a trap representation, and in this case if you use arguments to these operators that would generate such a value the behaviour is undefined.

Maegan answered 25/7, 2012 at 7:31 Comment(2)
Sorry if I'm misunderstanding what you wrote. When you said, ...the value with sign bit 1 and all value bits zero (for two's complement...) is explicitly allowed to be a trap representation, it sounds as if you are saying this is potentially an invalid value. But in two's complement, I believe it would represent the most-negative value for that size, well-defined.Storfer
Followup (edit) to previous comment:seeing your comments above, it appears - news to me - that bitwise complement of the largest positive signed integer can be a trap value. Didn't realize that, or 6.2.6.2Storfer
K
4

The bit content will be the same, but the resulting values will still be implementation dependent.

You really shouldn't see the values as signed or unsigned when using bitwise operations, because that is working on a different level.

Using unsigned types saves you from some of this trouble.

Kuibyshev answered 25/7, 2012 at 7:30 Comment(0)
G
3

The C89 Standard defined the behavior of left-shifting signed numbers based upon bit positions. If neither signed nor unsigned types have padding bits, the required behavior for unsigned types, combined with the requirement that positive signed types share the same representation as unsigned types, would imply that the sign bit is immediately to the left of the most significant value bit.

This, in C89, -1<<1 would be -2 on two's-complement implementations which don't have padding bits and -3 on ones'-complement implementations which don't have padding bits. If there are any sign-magnitude implementations without padding bits, -1<<1 would equal 2 on those.

The C99 Standard changed left-shifts of negative values to Undefined Behavior, but nothing in the rationale gives any clue as to why (or even mentions the change at all). The behavior required by C89 may have been less than ideal in some ones'-complement implementations, and so it would made sense to allow those implementations the freedom to select something better. I've seen no evidence to suggest that the authors of the Standard didn't intended that quality two's-complement implementations should continue to provide the same behavior mandated by C89, but unfortunately they didn't actually say so.

Gailgaile answered 26/9, 2017 at 21:53 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.