Delphi/ASM code incompatible with 64bit?

I have some sample source code for OpenGL and want to compile a 64-bit version (using Delphi XE2), but some ASM code fails to compile, and I know nothing about ASM. The code is below, with the two error messages shown as comments on the lines that fail.

// Copy a pixel from source to dest and Swap the RGB color values
procedure CopySwapPixel(const Source, Destination: Pointer);
asm
  push ebx //[DCC Error]: E2116 Invalid combination of opcode and operands
  mov bl,[eax+0]
  mov bh,[eax+1]
  mov [edx+2],bl
  mov [edx+1],bh
  mov bl,[eax+2]
  mov bh,[eax+3]
  mov [edx+0],bl
  mov [edx+3],bh
  pop ebx //[DCC Error]: E2116 Invalid combination of opcode and operands
end;
Incase answered 22/5, 2012 at 3:11 Comment(7)
You will need to write a 64-bit version of your ASM instructions and use {$IFDEF WIN64} to tell the compiler which set of ASM instructions to use for the given target platform. - Demers
Thanks, but the problem is that I know nothing about ASM, so I can't write it. - Incase
Found something here: docwiki.embarcadero.com/RADStudio/en/… - it says that "asm is not supported in 64bit XE2" - Incase
@Jerry Dodge I've added a pure Pascal version. - Hewe
@JerryDodge This is not true at all. 64-bit XE2 does not support 32-bit x86 asm blocks, by definition, but it does support x64 assembler. You cannot write asm blocks within functions, but you can write whole functions or methods in asm. The difficult part is handling exceptions and the stack properly. - Wheelock
@ArnaudBouchez Thanks for clarifying. As mentioned, I don't know the first thing about assembly. - Incase
@vhanla Thank you, I'm sure that will be valuable to someone, but it's complete Greek to me :-) - Incase

This procedure swaps ABGR byte order to ARGB and vice versa.
In 32-bit code, these instructions do the whole job:

mov ecx, [eax]  //ABGR from src
bswap ecx       //RGBA  
ror ecx, 8      //ARGB 
mov [edx], ecx  //to dest

The correct code for x64 is:

mov ecx, [rcx]  //ABGR from src
bswap ecx       //RGBA  
ror ecx, 8      //ARGB 
mov [rdx], ecx  //to dest
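
As a cross-check of the bswap/ror trick, the same transform can be sketched on a 32-bit value with masks and shifts. The function name SwapRedBlue is hypothetical and not part of the original code. Reversing all four bytes and then rotating right by 8 leaves bytes 1 and 3 where they started and exchanges bytes 0 and 2, which is what the expression below does directly.

```delphi
// Hypothetical helper, not from the original source.
// Exchanges the lowest byte and the third byte of a 32-bit value,
// leaving the other two bytes untouched.
function SwapRedBlue(Value: Cardinal): Cardinal;
begin
  Result := (Value and $FF00FF00)            // keep bytes 1 and 3 in place
         or ((Value and $000000FF) shl 16)   // move byte 0 up to byte 2
         or ((Value shr 16) and $000000FF);  // move byte 2 down to byte 0
end;
```

For example, the value $11223344 becomes $44332211 after bswap and $11443322 after ror 8; the mask-and-shift expression yields the same $11443322.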

Yet another option is a pure Pascal version, which changes the order of the bytes in the array representation from 0123 to 2103 (it swaps the 0th and 2nd bytes).

procedure Swp(const Source, Destination: Pointer);
var
  s, d: PByteArray;
begin
  s := PByteArray(Source);
  d := PByteArray(Destination);
  d[0] := s[2];
  d[1] := s[1];
  d[2] := s[0];
  d[3] := s[3];
end;
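
Because this swap is normally applied to whole images rather than single pixels, the per-pixel call overhead can be avoided with a loop over the buffer. The following is a minimal, untested sketch that assumes a tightly packed buffer of 32-bit pixels on a little-endian target (x86 or x64); the name SwapRedBlueBuffer and its parameters are hypothetical.

```delphi
// Hypothetical helper, not from the original source.
// Swaps bytes 0 and 2 of every 32-bit pixel in place.
// Assumes Pixels points to Count tightly packed 32-bit pixels.
procedure SwapRedBlueBuffer(Pixels: PCardinal; Count: Integer);
var
  i: Integer;
  v: Cardinal;
begin
  for i := 0 to Count - 1 do
  begin
    v := Pixels^;                              // read one pixel
    Pixels^ := (v and $FF00FF00)               // bytes 1 and 3 stay put
            or ((v and $000000FF) shl 16)      // byte 0 moves to byte 2
            or ((v shr 16) and $000000FF);     // byte 2 moves to byte 0
    Inc(Pixels);                               // advance by SizeOf(Cardinal)
  end;
end;
```

Reading each pixel into a local before writing makes the in-place case safe, which the byte-by-byte copy above is not when Source and Destination point to the same pixel.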
Hewe answered 22/5, 2012 at 4:56 Comment(9)
The asm version won't work in 64 bit, as you stated. The best option is to use Pascal. For better performance, make this procedure inline and do not use temporary variables; instead, change the signature directly to Source, Destination: PByteArray. +1 in any case for x86 asm that is much better than the awful original asm code (which is slower than Pascal). If my train had not been late this morning, I'd have posted a similar version (using eax instead of ecx, which may be a bit faster). In any case, the best performance will come from unrolling the loop and using SSE2 instructions. - Wheelock
PS - the 4 ASM lines above do in fact work, but I went with the Pascal version anyway for ease of personal readability and understanding :D - Incase
@Arnaud Bouchez The BDS2006 compiler doesn't use real temporary variables and does the whole job in registers. But you are right in general. The idea about SSE could be useful, because this transformation is typical of bulk data processing. - Hewe
@JerryDodge The asm code works, but does it produce the right result? (I cannot check 64-bit.) - Hewe
It appears to, unless the actual implementation of it doesn't even represent it... Honestly I have no clue what it's really doing, that's what I'm trying to figure out, but the image appears fine. - Incase
That asm code can't be right in 64 bits because it truncates the pointers to 32 bits. And it reads from the wrong registers. Just because it works in a simple test does not mean it is correct for all input. - Donielle
procedure CopySwapPixel(const Source, Destination: Pointer);
asm
  mov ecx, [Source]      //ABGR from src
  bswap ecx              //RGBA
  ror ecx, 8             //ARGB
  mov [Destination], ecx //to dest
end;
- Dionysiac
WARNING: the asm code proposed by pani is not right on x64. You'll need a .noframe pseudo-op first, and it still won't give the best speed. Pure Pascal + inline would be faster than this! See this link about x64 asm in Delphi XE2. - Wheelock
@Arnaud Bouchez OK, removed. I'd better stop making modifications without being able to test on an x64 system ;) - Hewe

64-bit code uses different names for the pointer registers, and parameters are passed differently. The first four parameters to inline assembler functions are passed via RCX, RDX, R8, and R9 respectively.

EBX -> RBX
EAX -> RAX
EDX -> RDX

Try this:

procedure CopySwapPixel(const Source, Destination: Pointer);
{$IFDEF CPUX64}
asm
  mov al,[rcx+0]
  mov ah,[rcx+1]
  mov [rdx+2],al
  mov [rdx+1],ah
  mov al,[rcx+2]
  mov ah,[rcx+3]
  mov [rdx+0],al
  mov [rdx+3],ah
end;
{$ELSE}
asm
  push ebx          // EBX must be preserved in the 32-bit register convention
  mov bl,[eax+0]
  mov bh,[eax+1]
  mov [edx+2],bl
  mov [edx+1],bh
  mov bl,[eax+2]
  mov bh,[eax+3]
  mov [edx+0],bl
  mov [edx+3],bh
  pop ebx           // restore EBX before returning
end;
{$ENDIF}
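
Given the concerns raised in this thread about hand-written x64 asm, another arrangement is to keep assembler only for the 32-bit target and fall back to pure Pascal everywhere else. This is a minimal, untested sketch; it assumes the CPUX86 conditional symbol is defined by the 32-bit compiler, that SysUtils is in the uses clause (for PByteArray), and that Source and Destination point to different pixels.

```delphi
// Sketch only: 32-bit asm where it is known to work, pure Pascal elsewhere.
// Requires SysUtils in the uses clause (for PByteArray).
procedure CopySwapPixel(const Source, Destination: Pointer);
{$IFDEF CPUX86}
asm
  // 32-bit register convention: EAX = Source, EDX = Destination
  mov ecx, [eax]  // load the four bytes of the pixel
  bswap ecx       // reverse all four bytes
  ror ecx, 8      // rotate so that only bytes 0 and 2 end up exchanged
  mov [edx], ecx  // store to the destination
end;
{$ELSE}
begin
  // Fallback for every other target: copy with bytes 0 and 2 exchanged.
  // Assumes Source <> Destination; an in-place call would corrupt byte 2.
  PByteArray(Destination)^[0] := PByteArray(Source)^[2];
  PByteArray(Destination)^[1] := PByteArray(Source)^[1];
  PByteArray(Destination)^[2] := PByteArray(Source)^[0];
  PByteArray(Destination)^[3] := PByteArray(Source)^[3];
end;
{$ENDIF}
```

The fallback works byte by byte, so it does not depend on the endianness of the target.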
Bickerstaff answered 22/5, 2012 at 14:37 Comment(5)
I suspect the x64 compiler will not like that asm code. You'll need to specify that this asm procedure needs no stack frame (a .noframe pseudo-instruction is needed at the beginning of the asm...end block). And it won't be faster than pure Pascal, so IMHO the pure Pascal version is to be recommended. It will also be ARM-ready, for your next iPhone (or Android?) application. ;) - Wheelock
@arnoud bouchez: if you use only asm..end (so without begin), the stack frame is already omitted. And since ARM is big-endian, there's probably no need to swap at all ;-) - Fer
.noframe is just a way to help the compiler skip generating stack instructions for passing parameters; it has nothing to do with whether the code compiles. In 64 bit you cannot push a 32-bit register onto the stack, in the same way that in 32 bit you cannot push a 16-bit register (AX, DX...) onto the stack. If he changes it to push RBX, it will compile under the 64-bit compiler, but that asm code is still not correct. - Bickerstaff
@Fer 1. The stack frame is not omitted, as far as this reference article tells. 2. ARM assembler is completely different: this Intel/AMD code won't even compile. 3. The code works at byte level, so endianness has no impact here. 4. And this is not an endianness swap, but an RGBA pixel color swap. - Wheelock
@Bickerstaff You are right about .noframe. See Allen Bauer's article. In any case, using asm in such a sub-function makes no sense here: it adds complexity and will be slower than an inlined pure Pascal version. In this context, asm only makes sense if you explicitly use SSE2 instructions within an unrolled loop. Writing asm code that is less efficient than what the compiler generates does not make sense to me. - Wheelock