Why are there two sequential moves to EAX in an optimized build?

I looked at the ASM code of a release build with all optimizations turned on, and here is one of the inlined functions I came across:

0061F854 mov eax,[$00630bec]
0061F859 mov eax,[$00630e3c]
0061F85E mov edx,$00000001
0061F863 mov eax,[eax+edx*4]
0061F866 cmp byte ptr [eax],$01
0061F869 jnz $0061fa83

The code is pretty easy to understand: it builds an offset (1) into a table, compares the byte value found there to 1, and jumps if they differ (JNZ). I know the pointer to my table is stored at $00630e3c, but I have no idea where $00630bec is coming from.

Why are there two moves to EAX one after the other? Isn't the first one overwritten by the second? Could this be a cache optimization, or am I missing something unbelievably obvious/obscure?

The Delphi code for the above ASM is as follows:

if( TGameSignals.IsSet( EmitParticleSignal ) = True ) then [...]

IsSet() is an inlined class function that calls the inlined IsSet() function of TSignalManagerInstance:

class function TGameSignals.IsSet(Signal: PBucketSignal): Boolean;
begin
  Result := FSignalManagerInstance.IsSet( Signal );
end;

The final IsSet of the signal manager is as follows:

function TSignalManagerInstance.IsSet( Signal: PBucketSignal ): Boolean;
begin
  Result := Signal.Pending;
end;
Lansquenet answered 26/7, 2017 at 16:35 Comment(8)
You should post the inline C code to go with it for a bit more perspective. Also compiler type would help (MS, GCC, etc.) You should also be able to look into the map file to translate those offsets for even more info (though the C source should help with this.) – Enclose
This doesn't look optimized, because the whole thing can be replaced by mov eax,[$00630e3c] cmp byte ptr [eax+4],1 ... unless the edx is reused in later code ... even the cmp as is can probably use cmp [eax],dl to save one byte of machine code (not sure about a performance penalty due to partial register usage). Then again, optimizers don't keep optimizing until perfect code is found; they use heuristics that try a reasonable number of permutations and rule applications so as to finish in a reasonable time. So if the source was complex enough to overwhelm the optimizer, this may be its best. – Mickelson
@Michael Dorgan I followed the code, and it's an endless series of inlined functions, much too long to post, which might be the reason for the lack of optimization as noted by Ped7g. I replaced the function call with the ultimate inlined code directly, and that got rid of the first move, without doing a better optimization job otherwise. I also forgot to mention that the compiler is not C/C++ but Delphi 10.2, which, given its track record, might account for this weirdness. – Lansquenet
If you won't show us the code, what do you expect us to say? How can we really comment with no context? Of course, it is well known that Embarcadero's compilers don't optimise well, and struggle especially with inlining. We aren't going to be able to help with the codegen. Submit an issue to Quality Portal. – Pteridology
@DavidHeffernan I was trying to untangle the inlines to try and post something, but you are correct, this issue is better served by Code Central or an Embarcadero forum. Thanks. – Lansquenet
@RudyVelthuis the tag was added after the initial comments. About overwhelming the optimizer: that holds for Delphi too; it's just much easier not to expect very good machine code from Pascal, as it obviously never got as much attention and love as the C++ compilers (the language itself is not as versatile and its usage is more focused => fewer people working on tools). To OP: this makes me wonder why you bother, as you don't pick Pascal for performance-critical code in the first place, and a slight imperfection shouldn't bother you for some UI and ordinary app code. – Mickelson
@Mickelson I maintain a sizable codebase that I am loath to attempt to convert at the moment. Knowing Delphi's limitations, I can usually coerce it into doing what I need. Getting a DirectX 11 engine running, given bad/missing external declarations, compiler quirks, etc., was a challenge, but it works now, so I made everyone happy. – Lansquenet
I posted the Delphi code, which was very easy to follow. – Lansquenet

My best guess would be that $00630bec is a reference to the class TGameSignals. You can check it by doing

ShowMessage(IntToHex(NativeInt(TGameSignals), 8))

The pre-optimisation code was probably something like this

0061F854 mov eax,[$00630bec] //Move reference to class TGameSignals in EAX
0061F859 mov eax,[eax + $250] //Move Reference to FSignalManagerInstance at offset $250 in class TGameSignals in EAX

The compiler optimised [eax + $250] to [$00630e3c], but didn't realize the previous MOV wasn't required anymore.

I'm not an expert in codegen, so take it with a grain of salt...

On a side note, in Delphi, we usually write

if TGameSignals.IsSet( EmitParticleSignal ) then

because it's possible for the following IF to be true:

var vBool: Boolean;
[...]
vBool := Boolean(10);
if vBool and (vBool <> True) then

Granted, this is not good practice, but no point in comparing to TRUE either.

EDIT: As pointed out by Ped7g, I was wrong. The instruction is

0061F854 mov eax,[$00630bec] 

and not

0061F854 mov eax,$00630bec

So what I wrote didn't really make sense... The first MOV instruction serves to pass the "Self" reference for the call to TGameSignals.IsSet. Now, if the function weren't inlined, it would look like this:

mov eax,[$00630bec]
call TGameSignals.IsSet

and then

*TGameSignals.IsSet
mov eax,[$00630e3c]
[...]

The first MOV is still pointless, since "Self" isn't used in TGameSignals.IsSet, but it is still required in order to pass "Self" to the function. When the routine gets inlined, it looks a lot sillier, indeed.

As mentioned by Arnaud Bouchez, making TGameSignals.IsSet static removes the implicit Self parameter and thus removes the first MOV operation.
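For illustration, here is a minimal sketch of what the static declaration might look like. The question only shows the method bodies, so the record layout of TBucketSignal, the class var declaration of FSignalManagerInstance, and the surrounding console program are all assumptions made to keep the example self-contained. It only shows where the static directive goes; the generated code would still have to be checked in the CPU view.

```delphi
program StaticClassFunctionSketch;

{$APPTYPE CONSOLE}

type
  // Assumed layout: the question only implies a Boolean field named Pending.
  PBucketSignal = ^TBucketSignal;
  TBucketSignal = record
    Pending: Boolean;
  end;

  TSignalManagerInstance = class
  public
    function IsSet(Signal: PBucketSignal): Boolean; inline;
  end;

  TGameSignals = class
  private
    // Assumption: the manager is held in a class variable.
    class var FSignalManagerInstance: TSignalManagerInstance;
  public
    // "static" drops the hidden Self (class reference) parameter,
    // so the caller has nothing to load into EAX for it.
    class function IsSet(Signal: PBucketSignal): Boolean; static; inline;
  end;

function TSignalManagerInstance.IsSet(Signal: PBucketSignal): Boolean;
begin
  Result := Signal.Pending;
end;

class function TGameSignals.IsSet(Signal: PBucketSignal): Boolean;
begin
  Result := FSignalManagerInstance.IsSet(Signal);
end;

var
  // Hypothetical signal used only for this demonstration.
  EmitParticleSignal: TBucketSignal;
begin
  TGameSignals.FSignalManagerInstance := TSignalManagerInstance.Create;
  try
    EmitParticleSignal.Pending := True;
    if TGameSignals.IsSet(@EmitParticleSignal) then
      Writeln('Signal is pending');
  finally
    TGameSignals.FSignalManagerInstance.Free;
  end;
end.
```

A static class method cannot be virtual and has no Self, so this only fits when the method, like IsSet here, touches nothing but class-level data.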

Sparks answered 26/7, 2017 at 18:9 Comment(8)
Note: you can get rid of the unneeded first "mov eax" by defining your class function TGameSignals.IsSet as static. – Seessel
@ArnaudBouchez Thanks Arnaud, I'll give it a try! – Lansquenet
@Arnaud: that seems to confirm that in the original code, $00630bec is a class reference. – Boren
@RudyVelthuis It does not need confirmation, it is the case for sure. – Seessel
[$00630bec] + $250 is not $00630e3c every time, only when the value in memory at address $00630bec is equal to $00630bec (because $00630bec + $250 = $00630e3c). So your explanation of how that [eax + $250] turned into a constant doesn't sound plausible. It looks to me more like the compiler just fetching some base value "for sure, because it is commonly needed to manipulate that instance" and then never figuring out that it may be omitted completely in this special case. – Mickelson
That is obviously a typo... I meant to write [$00630e3c] and not $00630e3c... And [eax + $250] being turned into a constant is totally plausible because the value is known at compile time. – Sparks
You probably didn't understand my comment? (or I don't get yours) The first mov eax,[$00630bec] can set eax to anything, like 0x100, then the second mov eax,[eax + $250] would be mov eax,[$100 + $250] => mov eax,[$350], not mov eax,[$00630e3c]. So I believe the two are unrelated. – Mickelson
@Ped7g, Yup, you're right. Edited accordingly. ASM is definitely not my native language. ;) – Sparks
