Can assembling ASM code produce more than one possible result (apart from offset values)?

I don't know x86 ASM very well, but I'm rather comfortable with SHARP-z80, and I know from experience that each instruction (mnemonic) has a corresponding byte/word value, and that by looking at the hex dump of the assembled binary file I can "read back" the same code I wrote using mnemonics.

In another SO question, somebody claimed that there are some situations where ASM instructions are not translated to their corresponding binary value, but instead are rearranged in a different way by the assembler.

I'm looking especially for cases where disassembling the binary would result in a different ASM code than the original one.

In other words, are there any cases where assembly source is not in a 1:1 relationship with the assembled code?

MikeKwan linked to another question where GCC would modify inline ASM code (in a C project), but, even though that's an interesting topic, it doesn't answer this question: GCC is a compiler, it always tries to optimize code, and inline ASM translation is affected by the surrounding C code.

Matrix answered 26/5, 2012 at 10:7 Comment(6)
Are you referring to polymorphic code? i.e. A program that rewrites its own code at runtime?Lederman
@MikeKwan I'm referring to assembling/linking process. What I mean is: can the binary result of assembling some ASM code with two different assemblers or in two different times, be different (not considering addresses and offsets)?Matrix
Something like this? https://mcmap.net/q/1329120/-arm-gcc-inline-assembler-optimization-problemLederman
@MikeKwan That's not exactly what I was looking for (it's interesting reading anyway, thanks), since GCC is a compiler and tends to behave like a compiler rather than like an assembler. Someone told me very self-confidently that some ASM instructions are not translated literally by assemblers.Matrix
x86 instructions can often be encoded in several ways (special form that uses the accumulator, immediate can be sign-extended or fully specified, memory operands with only a base register can be given an offset of zero, etc). Also, condition codes have many synonyms.Ineffective
Meh, don't get too worked up about statements made at SO. I doubt you'll get a straight answer, there's always some corner case. But assemblers are most certainly designed to repro original asm when they can. You use one when you're smarter than a machine, it isn't supposed to second-guess you.Sivas

To the extent that the assembler's designers thought it helpful, an assembler may substitute equivalent instructions that have other, useful properties.

First, there are machines with variable-length operand fields. If a value/offset will fit into any of several variants, it is common for the assembler to substitute the shortest. (In such assemblers, it is also common to be able to force a particular size.) This is true of instructions that involve immediate operands and indexed addressing.
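For example, on x86 the same operation has several legal encodings, and which bytes you get depends on the assembler's choices (most assemblers provide an override, e.g. NASM's strict keyword, to force a particular size). A sketch in NASM-style syntax:

      ; all three byte sequences decode to "add eax, 1":
      ;   83 C0 01             sign-extended 8-bit immediate (3 bytes)
      ;   81 C0 01 00 00 00    full 32-bit immediate (6 bytes)
      ;   05 01 00 00 00       special accumulator form, 32-bit immediate (5 bytes)
      add eax, 1               ; a typical optimizing assembler picks the 3-byte form here

So a disassembly of one of the longer forms, fed back through such an assembler, comes out shorter unless the size is explicitly forced.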

Many machines have instructions with PC-relative offsets, commonly for JMPs and sometimes for load/store/arithmetic instructions. On encountering such an instruction during the first pass, an assembler can determine whether the addressed operand precedes the instruction or has not been seen yet. If it precedes, the assembler knows the offset and can choose between a short relative form and a long relative form. If it follows, the assembler doesn't know the distance and generally reserves the large-offset form, which it fills in during pass 2. Again, there tend to be ways to force the assembler to choose the short form.
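A hedged x86 illustration in NASM-style syntax (the exact choices depend on how aggressively a given assembler optimizes across passes):

back: nop
      jmp   back            ; EB FD            backward: the offset is known, the 2-byte short form fits
      jmp   fwd             ; E9 xx xx xx xx   forward: a simple one-pass assembler reserves the 5-byte near form
      jmp   short fwd       ; EB xx            the programmer forces the short form (error if it can't reach)
fwd:  nop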

Some machines don't have long relative jump instructions. In this case, the assembler will emit a short relative jmp backwards if the target precedes the jmp and is close by. If the target precedes but is far away, or the target is a forward reference, the assembler may emit a short relative jmp on the opposite branch condition, targeting the location just past the next instruction, followed by a long absolute jmp. (I've personally built assemblers like this.) This ensures that jmps can always reach their target.
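The original 8086 is a concrete case: its conditional jumps only had an 8-bit displacement, so assemblers (and programmers) applied exactly this rewrite. A sketch, assuming far_away is defined elsewhere, more than 127 bytes from here:

      ; what the programmer means (doesn't fit in a rel8 on an 8086):
      ;     jne   far_away
      ; what gets emitted instead:
      je    next            ; inverted condition, 2-byte short branch over the jmp
      jmp   far_away        ; unconditional near jmp with a 16-bit displacement
next: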

The good news about these tricks is that if you disassemble, you still get a valid assembly program.

Now let's turn to ones that will confuse your disassembler.

A trick similar to the relative-jump one may be used for literal operands if the machine has short-relative addressing for load/store instructions and the programmer apparently specifies loading a constant or a value that lives a long way away. In this case the assembler changes the instruction to refer to a literal or an address constant placed right after an inserted short relative jmp around that constant. The disassembler thinks everything in the instruction stream is an instruction; here the literal value is not, and that throws the disassembler off. At least there's an unconditional jmp around the literal to guide it.
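A sketch of the transformation, using made-up mnemonics for a hypothetical machine whose loads are PC-relative only:

      ; the programmer writes:
      ;     LOAD  R1, #0x12345678
      ; the assembler might emit instead:
      LOAD  R1, pool        ; PC-relative load of the nearby literal
      JMP   after           ; short hop over the embedded data
pool: DC    0x12345678      ; data bytes sitting in the middle of the instruction stream
after:
      ...                   ; a linear-sweep disassembler will try to decode "pool" as an instruction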

You may find screwier tricks in mature assemblers, where every stunt ever imagined is supported. One of my favorites on 8-bit assemblers was the "pseudo" instructions SKIP1 and SKIP2, which you can think of as extremely short relative branches. They were really just the opcode byte of a "CMP #8bits" or "CMP #16bits" instruction, and were used to jump around an 8-bit or 16-bit instruction respectively. So, a "one byte" relative jump rather than two. When you're squeezed for space, every byte counts :-{

      SKIP1
      INC    ; 8 bit instruction
      ...

This was also handy when trying to implement a loop where some step shouldn't be performed on loop entry, but needs to be done on further loop iterations:

      SKIP2
LOOP: SHLD  ; 16 bit instruction
      ...
      BNE LOOP

The issue here is that if you disassemble the SKIP1 or SKIP2 instructions, you won't see the INC (or the corresponding 16-bit instruction).
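For concreteness, a hedged byte-level view, assuming an 8080/8085-style CPU where the opcode byte of CPI (compare immediate, FEh) plays the role of SKIP1 and INR A (3Ch) is the 8-bit instruction being hidden:

      ; address  bytes   what executes falling through   what a linear disassembler prints
      ; 0100     FE 3C   CPI 3Ch (flags only)            CPI 3Ch
      ; 0102     ...     next instruction                next instruction
      ;
      ; code that jumps directly to 0101 executes the INR A tucked inside the CPI's
      ; immediate field, but no INR A ever appears in the disassembly listing.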

A trick used by assembly language programmers for passing parameters is to place them inline after the call, with the proviso that the called routine adjust the return address appropriately:

      CALL   foo
      DC     param1
      DC     param2

Or:

      CALL   printstring
      DC     "a variable length string",0

There is no practical way for a disassembler to know that such a convention is being used, or what that convention is, so the disassembler is bound to get this wrong.
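To make the convention concrete, here is a minimal sketch in 32-bit x86 (NASM-style syntax); printstring is a hypothetical routine that, for the purposes of this sketch, simply skips its inline argument and resumes the caller just past the terminating 0:

      call  printstring
      db    "a variable length string", 0
      ...                   ; execution resumes here

printstring:
      pop   esi             ; the "return address" really points at the inline string
      ; (the string at ESI would be printed here; output is omitted in this sketch)
.skip:
      lodsb                 ; fetch one byte, advance ESI
      test  al, al
      jnz   .skip           ; keep going until the terminating 0 has been consumed
      jmp   esi             ; ESI now points just past the string: resume the caller there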

Pascha answered 26/5, 2012 at 13:19 Comment(9)
First of all, thanks for your answer. I see, I'm not a fan of these independent assemblers. In SHARP-z80 you have two different instructions for absolute (jp 0xDDEE) or relative (jr 0xBB) jumps. The assembler then throws an error if you're trying to relative-jump to an address that is more than 0x7F bytes away (either backward or forward). And that's how it should be, to me, although from your answer I take it that x86 has a single mnemonic that can be translated either into a relative or into an absolute jump. It doesn't seem very handy, but that's just me.Matrix
The x86 doesn't have a mnemonic that translates to a relative or absolute jmp. Some assemblers do. MASM tends to use one mnemonic (e.g., "MOV") to represent many different opcodes depending on operand type and syntax. The purpose of an assembler is to let you write machine instructions under tight control if that's what you want; most of the machine instructions you write aren't that critical, and where the assembler can step in and make life a bit easier for the bulk of your code, people have augmented them to do so.Pascha
@NadirSampaoli: Considering only direct near jumps (i.e. that don't change the CS segment register), x86 doesn't have any absolute jumps (unless you put an absolute value in a register and jmp eax). Some assemblers let you write jmp SHORT some_label to force a short jump, so you get an assemble-time error if the displacement doesn't fit in a rel8. (Or jmp NEAR some_label.) For conditional branches, 8086 didn't have the rel16 encoding, but IIRC 386 introduced them so 32-bit code can always use jle NEAR some_label if needed.Susa
Haha, consuming a short instruction as immediate data for an opcode outside the loop is a great trick. That's really cool. Will have to keep that in mind for code golf.Susa
@PeterCordes: Then you'd like how I implement fast exception management. Each CALL instruction is followed by a CMP reg,imm32. Normal call sequences execute as CALL, the subroutine runs and returns, and the CMP is executed, having no serious effect. For an exception, the called subroutine picks up the return address, fetches the imm32 part of the CMP, and adds that as a relative offset to the return address to compute the address of the exception handler in the caller. Obviously, no disassembler understands this convention, so they can't follow exception handling control flow.Pascha
Nice. I guess you used the cmp EAX, imm32 5-byte short encoding (with no modr/m byte). If exceptions are very rare, probably a separate map of exception-return addresses (which you search with ret addr as the key) would be slightly better, but not position-independent, and a global static map wouldn't work across library linking boundaries. So yeah, that is cool. I'd guess that 16 bits would be enough most of the time, you could have used cmp ax, imm16. (Oh, except that's an LCP stall every time it decodes on Intel CPUs). And imm8 would probably require jumping to another jmp...Susa
@PeterCordes: The bottom line is that on a Von Neumann architecture, you can interpret strings of bits as data or code. After that, how you abuse the instruction set as data is open to your imagination.Pascha
Finally found a use for that consume-later-bytes hack in code-golf, where a 01 add opcode becomes a rel8 for jmp rel8, skipping the modrm of that add. Golf a Custom Fibonacci Sequence. (And Google was able to dig up this old question for me so I could comment, since I remembered seeing the idea in one of your answers.)Susa
@PeterCordes: Abusing opcodes for fun and profit. Glad I said something you found useful or at least amusing; your answers sure have been useful for me. Enjoy.Pascha
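A minimal NASM-style sketch of the exception convention Pascha describes above, assuming 32-bit code, EAX as the compared register, and hypothetical might_fail/handler labels (it also assumes the assembler emits the 5-byte accumulator encoding, 3D + imm32, for the CMP):

      call  might_fail
      cmp   eax, handler - $  ; 3D + imm32; the imm32 is the distance from this CMP to the handler
      ...                     ; normal path: the CMP executed and only clobbered flags
      jmp   done
handler:
      ...                     ; exception path: the callee "returned" here instead
done:

might_fail:
      ; on the normal path this routine would just "ret" onto the CMP;
      ; on the exception path it redirects the return address instead:
      mov   ecx, [esp]        ; return address = address of the CMP's opcode byte
      add   ecx, [ecx + 1]    ; add the imm32 stored right after that opcode byte
      mov   [esp], ecx        ; point the return address at the handler
      ret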
