Understanding behavior of old C++ code
Asked Answered
L

1

8

I am migrating some parts of old C++ code, originally compiled with CodeGear C++Builder® 2009 Version 12.0.3170.16989

The following code - minimal version of a bigger piece - outputs -34 with any modern compiler. Although, in the original platform it outputs 84:

char Key[4];    
Key[0] = 0x1F;
Key[1] = 0x01;
Key[2] = 0x8B;
Key[3] = 0x55;

for(int i = 0; i < 2; i++) {
    Key[i] = Key[2*i] ^ Key[2*i + 1];
}

std::cout << (int) Key[1] << std::endl;

enter image description here The following code outputs -34 with both old and new compilers:

for(int i = 0; i < 2; i++) {
    char a = Key[2*i];
    char b = Key[2*i + 1];
    char c = a ^ b;
    Key[i] = c;
}

Also, manually unrolling the loop seems to work with both compilers:

Key[0] = Key[0] ^ Key[1];
Key[1] = Key[2] ^ Key[3];

It is important that I match the behavior of the old code. Can anyone please help me understand why the original compiler produces those results?

Lanthorn answered 13/8, 2017 at 15:24 Comment(15)
compilation flag signed/unsigned chars ???Brendonbrenk
About the signed / unsigned char: does that explain why the other 2 versions of the code work?Lanthorn
This produces 222 if I change char to unsigned char. (which makes sense since that's the 2's complement usigned value of an 8 bit -34)Heptagonal
yes @BenjaminLindley, I cannot reproduce the original 84 no matter whatLanthorn
Does the result change when you enable/disable optimizations? And do you have any knowledge about what the code was expected to produce?Compote
Still 84 after disabling all optimizations. The code generates a hash. No ground-truth available, other than the hash codes already generated. I should keep generating the hash codes with the same algorithm, but I cannot match the behavior with any modern compiler.Lanthorn
Can you change the code and recompile with the original CodeGear compiler, for the sake of debugging? Because I'm seeing that despite what the code says, it looks like what is being executed is Key[1] = Key[1] ^ Key[3], since that gives a result of 84.Interknit
I expect to be proven wrong the second I submit this comment, but it smells like a compiler bug to me - if this is legal behavior the c++ integer promotion/conversion rules are even more crazy than I thought. Worst case scenario: Create a lookup table that replicates the faulty behavior of the original compiler (and document the shit out of it).Compote
It seems that I am able to reproduce the original behavior with Key[i]=Key[i]^Key[2*i+1] (removing the 2*), as @Interknit suggested, but I really don't understand what's going on here.Lanthorn
Quite possibly a compiler bug. But at least in this case the behaviour can be replicated in new code, lucky you :-)Interknit
That is not really reproducing it, as he changed the original code. Is it possible that any of the code part of the larger piece (not posted) is swizzling them somehow? I also suspect somehow your entries got mixed and you get the 1 ^ 3, which is the 84. have you tried changing the values just to see if it creates the result as well? (e.g change key[3] to 0x54 and see if it gives you 0x53)Bezonian
@Bezonian Using Key[i]=Key[i]^Key[2*i+1] with new compilers, I can reproduce the original behavior of Key[i]=Key[2*i]^Key[2*i+1] compiled with CodeGear. It works with the original length of the Key, which is bigger than the failing minimal example with 4 bytes.Lanthorn
That makes sense, cause you changed the loop - Key[0] ^ Key[1] = 1E (30), then Key[1] ^ Key[3] (2 * 1 + 1), is of course the 0x54, where originally in the second iteration of the loop you would have Key[2*i] which would be Key[2], not Key[1]Bezonian
Try replacing char with signed char and then unsigned char and see if that's what's making the difference.Egarton
I can't upvote this question, since it's not that relevant to the general public, but - I feel your pain. Psychological +1 hang in there.Posthaste
T
5

This seems to be a bug:

The line

Key[i] = Key[2*i] ^ Key[2*i + 1];

generates the following code:

00401184 8B55F8           mov edx,[ebp-$08]
00401187 8A4C55FD         mov cl,[ebp+edx*2-$03]
0040118B 8B5DF8           mov ebx,[ebp-$08]
0040118E 304C1DFC         xor [ebp+ebx-$04],cl

That does not make sense. This is something like:

Key[i] ^= Key[i*2 + 1];

And that explains how the result came to be: 0x01 ^ 0x55 is indeed 0x54, or 84.

It should be something like:

mov edx,[ebp-$08]
mov cl,[ebp+edx*2-$04]
xor cl,[ebp+edx*2-$03]
mov [ebp+ebx-$04],cl

So this is definitely a code generation bug. It seems to persist until now, C++Builder 10.2 Tokyo, for the "classic" (Borland) compiler.

But if I use the "new" (clang) compiler, it produces 222. The code produced is:

File7.cpp.12: Key[i] = Key[2*i] ^ Key[2*i + 1];
004013F5 8B45EC           mov eax,[ebp-$14]
004013F8 C1E001           shl eax,$01
004013FB 0FB64405F0       movzx eax,[ebp+eax-$10]
00401400 8B4DEC           mov ecx,[ebp-$14]
00401403 C1E101           shl ecx,$01
00401406 0FB64C0DF1       movzx ecx,[ebp+ecx-$0f]
0040140B 31C8             xor eax,ecx
0040140D 88C2             mov dl,al
0040140F 8B45EC           mov eax,[ebp-$14]
00401412 885405F0         mov [ebp+eax-$10],dl

That doesn't look optimal to me (I used O2 and O3 with the same result), but it produces the right result.

Terrorist answered 13/8, 2017 at 21:46 Comment(6)
Crazy. Thanks for the detailed answer. It fails with other operators other than ^ too.Lanthorn
Hmmm... I'll look if it is a known bug (if I can -- QC is offline), and otherwise report it to QP.Terrorist
Reported: quality.embarcadero.com/browse/RSP-18831. Used + instead of ^, as it seems to happen for several operators.Terrorist
FWIW, on the Mac (Xcode), almost the same (but 64 bit) code is generated by the clang compiler. I hadn't expected that.Terrorist
Thanks. You mean that clang generates the same not-so-optimized code, not the wrong one, right?Lanthorn
@ibancg: Yes, the clang code looks terribly unoptimized to me. I had expected clang to have a better optimizer. But on the Mac, using Xcode, I get similar generated code.Terrorist

© 2022 - 2024 — McMap. All rights reserved.