How to Subtract Bytes on One Line in C#

This is really odd. Can anyone explain this?

This code does NOT work:

const byte ASC_OFFSET = 96;
string Upright = "firefly";
byte c7 = (byte)Upright[6] - ASC_OFFSET;
//Cannot implicitly convert type 'int' to 'byte'.

This code also does NOT work:

const byte ASC_OFFSET = 96;
string Upright = "firefly";
byte c7 = (byte)Upright[6] - (byte)ASC_OFFSET;
//Cannot implicitly convert type 'int' to 'byte'.

Yet, putting the subtraction on a separate line works just fine:

const byte ASC_OFFSET = 96;
string Upright = "firefly";
byte c7 = (byte)Upright[6];
c7 -= ASC_OFFSET;

I don't mind putting the statements on separate lines, if I have to... but I have to wonder...

Why?

Analogy answered 2/1, 2011 at 7:28 Comment(0)

I've noticed this before too. I think it's because the -= operator is predefined for byte types, whereas in the former cases you're really assigning an int to a byte, which isn't allowed without a cast. The design may not seem intuitive, but it's consistent with the rules: in the former cases, the compiler can't "peek" at the - operator when checking the assignment.

If you really need to subtract on one line, instead of saying:

byte c7 = (byte)Upright[6] - ASC_OFFSET;

Say:

byte c7 = (byte)(Upright[6] - ASC_OFFSET);
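As a minimal sketch (the class name OneLineSubtract and the LetterIndex helper are mine, not from the question), here is the one-line form in a complete program. The parentheses make the cast apply to the int result of the subtraction rather than to Upright[6] alone.

```csharp
using System;

static class OneLineSubtract
{
    const byte ASC_OFFSET = 96;

    // Hypothetical helper: position of a lowercase ASCII letter, 'a' -> 1.
    public static byte LetterIndex(string s, int i)
    {
        // char - byte is computed as int; the outer cast narrows the result.
        return (byte)(s[i] - ASC_OFFSET);
    }

    static void Main()
    {
        string Upright = "firefly";
        byte c7 = LetterIndex(Upright, 6);
        Console.WriteLine(c7); // 'y' is 121, so this prints 25
    }
}
```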

Carsoncarstensz answered 2/1, 2011 at 7:31 Comment(7)

Hey Lambert - You're on a roll tonight! I'm accepting this one, too, although I'm going to keep the code on multiple lines since I would rather have a little unpretty code than to accept the round-trip to int and back. – Analogy 2/1, 2011 at 7:44

Haha thanks! :) Just a note though: AFAIK, the round-trip will happen with every arithmetic operation, whether or not you see it in the code. If I remember correctly, every operation happens at the native size of the CPU, no matter what the data type... you're not really saving anything by using either one. (I'll double-check this though, and if I'm wrong I'll post it here.) – Carsoncarstensz 2/1, 2011 at 7:49

Oooh. Interesting. I'm doing some bit/byte level stuff that gets called thousands of times a second, and I figured that keeping everything as bytes would be faster than working with ints. I'd be interested to know if that's not the case. Thanks! – Analogy 2/1, 2011 at 7:52

Sure. :) Yeah, 32-bit CPUs are best with 32-bit integers, and 64-bit ones are best with 64-bit integers. Converting to bytes may save space, but it won't save any speed. (In general, leave this kind of micro-optimization for the end, and keep it only if you measure an improvement; don't build it directly into the code.) – Carsoncarstensz 2/1, 2011 at 7:58

Hold on. Is that second suggestion valid? Indexing into a string returns a 'char'. Can you subtract ANYTHING from a char without a cast first? Doesn't (Upright[6] - ASC_OFFSET) mean 'Take this char and subtract 96'? I would think that would need to be a byte first. – Analogy 2/1, 2011 at 8:7

The -= operator is not exactly predefined for byte. Rather, the integer subtraction operator is used and then the result is cast to byte for the assignment. – Perimorph 2/1, 2011 at 16:25

I meant it's predefined in the sense that the compiler recognizes it as special, just like in the case of integers, not in the sense that there's a method for it (integer subtraction doesn't use methods either obviously). But thanks for the clarification. – Carsoncarstensz 2/1, 2011 at 16:27

This is because 1) arithmetic on byte operands produces an int (see why here: http://blogs.msdn.com/b/oldnewthing/archive/2004/03/10/87247.aspx) and 2) the following C# code

c7 -= ASC_OFFSET;

will be "magically" compiled behind the scenes into

c7 = (byte)(c7 - ASC_OFFSET);

This is explicitly documented in the C# specification here: http://www.ecma-international.org/publications/files/ECMA-ST/Ecma-334.pdf

14.14.2 Compound assignment:

An operation of the form x op= y is processed by applying binary operator overload resolution (§14.2.4) as if the operation was written x op y. Then,

• If the return type of the selected operator is implicitly convertible to the type of x, the operation is evaluated as x = x op y, except that x is evaluated only once.

• Otherwise, if the selected operator is a predefined operator, if the return type of the selected operator is explicitly convertible to the type of x, and if y is implicitly convertible to the type of x or the operator is a shift operator, then the operation is evaluated as x = (T)(x op y), where T is the type of x, except that x is evaluated only once.

• Otherwise, the compound assignment is invalid, and a compile-time error occurs.
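To see the second bullet in action, the sketch below (class and method names are illustrative, not from the spec) compares the compound form with its spelled-out expansion. The unchecked blocks are there so that both forms wrap around regardless of how overflow checking is configured for the project.

```csharp
using System;

static class CompoundAssignmentDemo
{
    public static byte CompoundForm(byte x, byte y)
    {
        unchecked { x -= y; }            // the compiler supplies the (byte) cast
        return x;
    }

    public static byte ExpandedForm(byte x, byte y)
    {
        unchecked { x = (byte)(x - y); } // the expansion described by the spec
        return x;
    }

    static void Main()
    {
        Console.WriteLine(CompoundForm(121, 96)); // 25
        Console.WriteLine(CompoundForm(7, 8));    // 255: the int -1 narrowed to byte
        Console.WriteLine(ExpandedForm(7, 8));    // 255
    }
}
```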

Galba answered 2/1, 2011 at 7:58 Comment(3)

Fantastic, Simon! Some excellent information there. If I hadn't already selected the answer... – Analogy 2/1, 2011 at 8:2

@FlipScript - not my lucky day :-) – Galba 2/1, 2011 at 8:6

WHOA you really deserved that answer... +1 for such a great response. :) – Carsoncarstensz 2/1, 2011 at 8:57

Your first two samples do not compile because:

• The cast binds "tighter" than the subtraction. That is, '(C)d-e' means '((C)d)-e', not '(C)(d-e)'. The cast operator has higher precedence.

• Therefore the type of both operands to the subtraction is byte, regardless of the casts.

• The type of the subtraction is int, because there is no subtraction operator defined on bytes.

• Therefore, you are assigning an int to a byte without a cast, which is illegal.
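One way to check the third step is to ask the runtime for the type of the expression. In this sketch (class and method names are hypothetical) both operands are byte, yet the boxed result reports System.Int32.

```csharp
using System;

static class SubtractionTypeDemo
{
    public static Type TypeOfByteDifference(byte d, byte e)
    {
        object boxed = d - e; // byte - byte resolves to the int subtraction operator
        return boxed.GetType();
    }

    static void Main()
    {
        Console.WriteLine(TypeOfByteDifference(8, 7)); // System.Int32
    }
}
```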

There is no subtraction operator on bytes because, well, suppose you have a byte containing 7 and you subtract from it a byte containing 8, do you really want that to be the byte 255? I think most people would want that to be the int -1.

Finally, why on earth are you doing this in bytes in the first place? This doesn't make any sense. Chars aren't bytes in C#; if you want to do arithmetic on chars then why not subtract the char 96 from the char 'y' instead of doing a lossy and dangerous conversion to byte?
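A minimal sketch of the char-based alternative, plus one case where the byte cast silently loses data. The names are made up for illustration; '\u0100' is 'Ā' (code point 256), one past the largest value a byte can hold.

```csharp
using System;

static class CharArithmeticDemo
{
    public static int LetterIndex(char c)
    {
        return c - (char)96; // char - char is int; no byte involved
    }

    public static byte NarrowToByte(char c)
    {
        return unchecked((byte)c); // keeps only the low 8 bits
    }

    static void Main()
    {
        Console.WriteLine(LetterIndex("firefly"[6])); // 25
        Console.WriteLine(NarrowToByte('y'));         // 121
        Console.WriteLine(NarrowToByte('\u0100'));    // 0: the high bits are discarded
    }
}
```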

Perimorph answered 2/1, 2011 at 16:30 Comment(6)

Hi Eric. Actually, I would think that a subtraction operator for bytes makes perfect sense. Subtracting a byte containing 7 from a byte containing 8 should leave a byte containing 1. Subtracting 8 from 7 should result in an overflow, just like doing the same on a uint16 or uint32 (instead of our "uint8"). I can subtract bytes. The compiler should be able to do the same. – Analogy 2/1, 2011 at 16:59

The byte arithmetic is for quickly computing a hash code for a 14 character string into a 32 bit integer with no collisions. Believe it or not, it is possible (with our data set, anyway). By the way, why do you say that the conversion from char to byte is "lossy and dangerous"? I can't imagine that it is anything but lossless and safe. – Analogy 2/1, 2011 at 17:2

@FlipScript - what's the lossless way to convert 'Ā' to a byte? – Envisage 2/1, 2011 at 17:41

kvb - We don't have any chars like that. – Analogy 2/1, 2011 at 19:27

And how is the compiler supposed to know that? – Perimorph 2/1, 2011 at 23:10

More fundamentally, the problem here is that you suppose bytes to be a representation of numbers upon which it is sensible to do arithmetic. That's actually not a very productive or useful way to think about bytes. Rather than interpreting a byte as a quantity, it is better to think of it merely as a convenient way to store eight bits. If you want to manipulate a number then use a data type designed for that, namely, int. – Perimorph 2/1, 2011 at 23:12
