Printing an Int (or Int to String)

I am looking for a way to print an integer in assembly (the assembler I am using is NASM on Linux). However, after doing some research, I have not been able to find a truly viable solution. I was able to find a description of a basic algorithm that serves this purpose, and based on that I developed this code:

global _start

section .bss
digit: resb 16
count: resb 16
i: resb 16

section .data

section .text

_start:
mov             dword[i], 108eh         ; i = 4238
mov             dword[count], 1
L01:
mov             eax, dword[i]
cdq
mov             ecx, 0Ah
div             ecx  
mov             dword[digit], edx

add             dword[digit], 30h       ; add 48 to digit to make it an ASCII char
call            write_digit

inc             dword[count]

mov             eax, dword[i]
cdq
mov             ecx, 0Ah
div             ecx  
mov             dword[i], eax 
cmp             dword[i], 0Ah  
jg              L01

add             dword[i], 48            ; add 48 to i to make it an ASCII char
mov             eax, 4                  ; system call #4 = sys_write
mov             ebx, 1                  ; file descriptor 1 = stdout
mov             ecx, i                  ; store *address* of i into ecx
mov             edx, 16                 ; byte size of 16
int             80h

jmp             exit

exit:
mov             eax, 01h                ; exit()
xor             ebx, ebx                ; errno
int             80h

write_digit:
mov             eax, 4                  ; system call #4 = sys_write
mov             ebx, 1                  ; file descriptor 1 = stdout
mov             ecx, digit              ; store *address* of digit into ecx
mov             edx, 16                 ; byte size of 16
int             80h
ret

C# version of what I want to achieve (for clarity):

static string int2string(int i)
{
    Stack<char> stack = new Stack<char>();
    string s = "";

    do
    {
        stack.Push((char)((i % 10) + 48));
        i = i / 10;
    } while (i > 10);

    stack.Push((char)(i + 48));

    foreach (char c in stack)
    {
        s += c;
    }

    return s;
}

The issue is that it outputs the characters in reverse, so for 4238, the output is 8324. At first, I thought that I could use the x86 stack to solve this problem, push the digits in, and pop them out and print them at the end, however when I tried implementing that feature, it flopped and I could no longer get an output.

As a result, I am a little bit perplexed about how I can implement a stack into this algorithm in order to accomplish my goal, i.e. printing an integer. I would also be interested in a simpler/better solution if one is available (as it's one of my first assembler programs).

Hagiology answered 23/11, 2012 at 5:25 Comment(2)
That C# code is horrendous. In general (for all high-level languages) there are nice, easy-to-use abstractions (like stack.push()) that exist to prevent people from realising how bad the generated code actually is. Note: I dare you to disassemble the code generated by that C#. ;-) - Lietuva
I agree, I just threw it together in 5 minutes or so to demonstrate what I hope to achieve using assembler. - Hagiology

One approach is to use recursion. In this case you divide the number by 10 (getting a quotient and a remainder) and then call yourself with the quotient as the number to display; and then display the digit corresponding to the remainder.

An example of this would be:

;Input
; eax = number to display

    section .data
const10:    dd 10
    section .text

printNumber:
    push eax
    push edx
    xor edx,edx          ;edx:eax = number
    div dword [const10]  ;eax = quotient, edx = remainder
    test eax,eax         ;Is quotient zero?
    je .l1               ; yes, don't display it
    call printNumber     ;Display the quotient
.l1:
    lea eax,[edx+'0']
    call printCharacter  ;Display the remainder
    pop edx
    pop eax
    ret
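
The example above calls printCharacter without defining it. A minimal sketch of such a helper for 32-bit Linux might look like this. It is an assumption of mine, not part of the original answer: it takes the character in AL and preserves every general-purpose register, which the callers in this answer rely on.

```nasm
; Hypothetical helper (not defined in the answer above).
; Input:  al = character to write to stdout
; Output: none; all general-purpose registers are preserved
printCharacter:
    pushad                ; save eax, ecx, edx, ebx, esp, ebp, esi, edi
    push eax              ; spill the character; its low byte (al) is at [esp]
    mov  eax, 4           ; system call #4 = sys_write
    mov  ebx, 1           ; file descriptor 1 = stdout
    mov  ecx, esp         ; address of the character on the stack
    mov  edx, 1           ; write exactly one byte
    int  80h
    add  esp, 4           ; discard the spilled character
    popad                 ; restore all registers (sys_write's return value is dropped)
    ret
```

Issuing one system call per digit is slow, but it keeps the helper simple enough for a first program.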

Another approach is to avoid recursion by changing the divisor. An example of this would be:

;Input
; eax = number to display

    section .data
divisorTable:
    dd 1000000000
    dd 100000000
    dd 10000000
    dd 1000000
    dd 100000
    dd 10000
    dd 1000
    dd 100
    dd 10
    dd 1
    dd 0
    section .text

printNumber:
    push eax
    push ebx
    push edx
    mov ebx,divisorTable
.nextDigit:
    xor edx,edx          ;edx:eax = number
    div dword [ebx]      ;eax = quotient, edx = remainder
    add eax,'0'
    call printCharacter  ;Display the quotient
    mov eax,edx          ;eax = remainder
    add ebx,4            ;ebx = address of next divisor
    cmp dword [ebx],0    ;Have all divisors been done?
    jne .nextDigit
    pop edx
    pop ebx
    pop eax
    ret

This example doesn't suppress leading zeros, but that would be easy to add.
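
One way to add the leading-zero suppression is sketched below. It assumes the divisorTable from the example above and a printCharacter routine that preserves EBX, ECX and EDX; the routine name printNumberNoLeadingZeros and the use of ECX as a "digit seen" flag are my own choices.

```nasm
; Input: eax = unsigned number to display
; Assumes: divisorTable as defined above, and a printCharacter
;          that preserves ebx, ecx and edx (both are assumptions).
printNumberNoLeadingZeros:
    push eax
    push ebx
    push ecx
    push edx
    mov  ebx, divisorTable
    xor  ecx, ecx            ; ecx stays 0 until a nonzero digit shows up
.nextDigit:
    xor  edx, edx            ; edx:eax = number
    div  dword [ebx]         ; eax = digit, edx = remainder
    or   ecx, eax            ; ecx != 0 once any nonzero digit has been seen
    jnz  .show               ; significant digit, display it
    cmp  dword [ebx], 1      ; last divisor? then display the single "0"
    jne  .skip               ; otherwise it is a leading zero, skip it
.show:
    add  eax, '0'
    call printCharacter      ; display the digit
.skip:
    mov  eax, edx            ; eax = remainder
    add  ebx, 4              ; ebx = address of next divisor
    cmp  dword [ebx], 0      ; have all divisors been done?
    jne  .nextDigit
    pop  edx
    pop  ecx
    pop  ebx
    pop  eax
    ret
```

The check against the final divisor (1) makes sure that an input of 0 still prints a single "0".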

Lietuva answered 23/11, 2012 at 5:49 Comment(3)
Thank you for your reply. I am a little bit confused about how the printNumber function in your first example works. Firstly, why are you doing a 'xor' before the division? Also, shouldn't the je in 'test eax,eax / je .l1' be a jz (testing for zero)? Also, is the character to print stored in eax? - Hagiology
xor edx,edx just sets EDX to zero (which is necessary for the division). The je instruction (jump if equal) and the jz instruction (jump if zero) are synonyms (they are exactly the same opcode/instruction). The character to print would be in EAX. - Lietuva
The other standard way to print in MSD-first printing order is to store digits into a buffer (on the stack), as in "How do I print an integer in Assembly Level Programming without printf from the c library?". This is especially attractive when printing a whole string is about as cheap as printing a single character (system call or even just stdio function call overhead). - Portfire

Debugging

cdq
mov             ecx, 0Ah
div             ecx  

Prior to the unsigned division div, you must zero the EDX register. Your test data was a positive number (4238) and so cdq produced EDX=0, but do consider what would happen if your 32-bit integer had been a negative number. EDX would then have been set to -1 and a division overflow would have occurred! My solution below takes negative numbers into account.

cmp  dword[i], 0Ah  
jg   L01
add  dword[i], 48  ; add 48 to i to make it an ASCII char

This is wrong because you leave the loop too early! If the quotient in EAX is 10 precisely, you can't make a valid ASCII char of it by just adding 48. You must continue the loop for as long as the quotient corresponds to more than 1 decimal digit. Write jge L01.
The C# version had it wrong also ( } while (i > 10);).

mov  edx, 16       ; byte size of 16
int  80h

For each digit it takes but 1 byte to pass to Linux. It makes no sense to specify a size of 16.

Solving

The issue is that it outputs the characters in reverse, so for 4238, the output is 8324.

The algorithm that you chose works from the least significant digit towards the most significant digit. Since you output each digit right away, it's only normal that the output is reversed. It is better not to output while the loop is still running, but to store the digits in a buffer and then print them all at once. In the code below, I placed the buffer on the stack:

; IN (eax) OUT () MOD (eax,ecx,edx)
PrintInteger32:
  push ebx
  mov  ebx, 10         ; CONST divider
  mov  ecx, esp
  sub  esp, 16         ; Small local buffer

  push eax             ; (1)
  test eax, eax
  jns  .next           ; Is positive [0,2GB-1]
  neg  eax             ; Make positive [1,2GB]
.next:
  dec  ecx
  xor  edx, edx
  div  ebx             ; EDX:EAX / EBX
  add  edx, '0'        ; Remainder [0,9] -> ["0","9"]
  mov  [ecx], dl
  test eax, eax
  jnz  .next
  pop  eax             ; (1)
  test eax, eax
  jns  .print
  dec  ecx
  mov  byte [ecx], '-'

.print:                ; ECX is address of the MSD or the minus char
  lea  edx, [esp + 16]
  sub  edx, ecx        ; -> EDX is number of digits to print
  mov  ebx, 1          ; file descriptor 1 = stdout
  mov  eax, 4          ; system call #4 = sys_write
  int  80h
  add  esp, 16
  pop  ebx
  ret
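
As a usage sketch, a small driver for the routine above could look like this. The _start body and the test value are my own assumptions; the build commands in the comments are the usual ones for 32-bit code on a 64-bit Linux host.

```nasm
; Hypothetical driver, assembled together with PrintInteger32, e.g.:
;   nasm -f elf32 print.asm -o print.o
;   ld -m elf_i386 print.o -o print
global _start

section .text
_start:
    mov  eax, -4238          ; value to print, passed in eax
    call PrintInteger32      ; writes "-4238" to stdout (no trailing newline)

    mov  eax, 1              ; system call #1 = sys_exit
    xor  ebx, ebx            ; exit status 0
    int  80h
```

Because the routine negates EAX and then uses an unsigned division, the most negative value (80000000h) is handled as well.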

Going faster

I would also be interested in a simpler/better solution if one is available (as it's one of my first assembler programs).

Not necessarily better and certainly not simpler, but definitely faster, are solutions that replace the division operation by a reciprocal multiplication. See Why does GCC use multiplication by a strange number in implementing integer division?
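
To give an idea of what that looks like, here is a sketch of an unsigned divide-by-10 using reciprocal multiplication. The routine name divideBy10 and its register interface are my own assumptions; the constant 0xCCCCCCCD is ceil(2^35 / 10), the value compilers typically emit for this division, and it gives the exact quotient for every unsigned 32-bit input.

```nasm
; Input:  eax = unsigned 32-bit dividend
; Output: eax = dividend / 10, edx = dividend % 10
; Clobbers: ecx
divideBy10:
    mov  ecx, eax            ; keep the dividend for the remainder
    mov  edx, 0xCCCCCCCD     ; ceil(2^35 / 10)
    mul  edx                 ; edx:eax = dividend * 0xCCCCCCCD
    shr  edx, 3              ; edx = product >> 35 = quotient
    mov  eax, edx            ; eax = quotient
    lea  edx, [edx + edx*4]  ; edx = quotient * 5
    add  edx, edx            ; edx = quotient * 10
    sub  ecx, edx            ; ecx = dividend - quotient * 10 = remainder
    mov  edx, ecx            ; edx = remainder
    ret
```

In a digit loop this would take the place of the xor edx,edx / div ebx pair, at the cost of one extra scratch register.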

Attired answered 10/1 at 19:12 Comment(2)
popf is pretty slow, like one per 29 cycles on Alder Lake (uops.info). pushf is less bad but still worse than setz [mem]. Actually your best bet is just push eax to save the original (which you can do before the test, so you don't defeat macro-fusion), and cmp dword [esp], 0 at the end. (Or cmp against a zeroed register like EAX which is zero at that point: that would let it micro- and macro-fuse, but have a data dependency on the division stuff so fast-recovery couldn't have already sorted out a branch mispredict. So pop eax / test eax,eax would be fine.) - Portfire
Or what a compiler would do: save another call-preserved reg like EDI, and keep the original x there, instead of store/reload. Or if it was a stack arg in the first place (unlike your regparm calling convention), reload the original from the stack when it's needed again. 64-bit code would have a spare call-clobbered register. - Portfire

I think that maybe implementing a stack is not the best way to do this (and I really think you could figure out how to do that, seeing as how pop is just a mov and an increment of sp, so you can really set up a stack anywhere you like by just allocating memory for it and setting one of your registers as your new 'stack pointer'). I think this code could be made clearer and more modular if you actually allocated memory for a C-style null-terminated string, then created a function to convert the int to a string by the same algorithm you use, and then passed the result to another function capable of printing such strings. It will avoid some of the spaghetti code syndrome you are suffering from, and fix your problem to boot. If you want me to demonstrate, just ask, but if you wrote the thing above, I think you can figure out how with the more split-up process.
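
The "function capable of printing strings" half of that split could be sketched roughly as follows. The name printString and the choice of ESI for the argument are assumptions of mine, not something from the question.

```nasm
; Hypothetical helper.
; Input:  esi = address of a NUL-terminated string
; Output: none; all general-purpose registers are preserved
printString:
    pushad
    mov  ecx, esi            ; ecx = buffer address for sys_write
    xor  edx, edx            ; edx = length found so far
.count:
    cmp  byte [ecx + edx], 0 ; reached the terminating NUL?
    je   .write
    inc  edx
    jmp  .count
.write:
    mov  eax, 4              ; system call #4 = sys_write
    mov  ebx, 1              ; file descriptor 1 = stdout
    int  80h                 ; write edx bytes starting at ecx
    popad
    ret
```

The conversion routine then only has to fill a buffer and append a zero byte, and never needs to know about system calls.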

Talishatalisman answered 23/11, 2012 at 5:46 Comment(4)
Thank you for your advice, it is very valuable! I will try to write a function like the one you have described, and if I need more help then I will ask! - Hagiology
I got a version working here, and I think that I now know how to use strings to achieve this task. I created this program for learning purposes. However, my question is: how would I reverse a string? - Hagiology
While there are several ways to reverse a string, I will suggest the following algorithm as an example: 1) Count the characters in the string K (call that number n); 2) Let i=0, j=n-1; 3) While j-i>0: 3.1) Swap K[i] and K[j]; 3.2) Increment i and decrement j; 4) Return K. - Talishatalisman
You could also use a stack, and that would require you to: 1) Allocate memory for the stack; 2) Push the string onto the stack; 3) Pop the characters off again. That would require using extra memory, and the algorithm itself isn't terribly nice, but it's pretty easy to do, basically in the same way you handled the working version of your integer-printer. - Talishatalisman
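
The swap-based reversal from the first of these two comments could be sketched like this. The routine name reverseString and the register interface are assumptions of mine; the length is passed in rather than counted, to keep the sketch short.

```nasm
; Input:  esi = address of the first character
;         ecx = number of characters (not counting any terminator)
; Clobbers: eax, esi, edi
reverseString:
    lea  edi, [esi + ecx - 1] ; edi -> last character (j = n-1)
.swap:
    cmp  esi, edi
    jae  .done                ; stop once the two pointers meet or cross
    mov  al, [esi]            ; al = K[i]
    mov  ah, [edi]            ; ah = K[j]
    mov  [esi], ah            ; K[i] = old K[j]
    mov  [edi], al            ; K[j] = old K[i]
    inc  esi                  ; i++
    dec  edi                  ; j--
    jmp  .swap
.done:
    ret
```

It works in place, so unlike the stack variant it needs no extra memory.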
; Input
; EAX = the int to convert
; EDI = address of the result
; Output:
; None
int_to_string:
    xor   ebx, ebx        ; clear the ebx, I will use as counter for stack pushes
.push_chars:
    xor edx, edx          ; clear edx
    mov ecx, 10           ; ecx is divisor, divide by 10
    div ecx               ; divide edx:eax by ecx; quotient in eax, remainder in edx
    add edx, 0x30         ; add 0x30 to edx convert int => ascii
    push edx              ; push result to stack
    inc ebx               ; increment my stack push counter
    test eax, eax         ; is eax 0?
    jnz .push_chars       ; if eax not 0 repeat

.pop_chars:
    pop eax               ; pop result from stack into eax
    stosb                 ; store al at the address in EDI, then increment EDI (assuming DF = 0)
    dec ebx               ; decrement my stack push counter
    cmp ebx, 0            ; check if stack push counter is 0
    jg .pop_chars         ; not 0 repeat
    mov eax, 0x0a
    stosb                 ; add line feed
    ret                   ; return to main
Biquadratic answered 31/1, 2018 at 7:17 Comment(0)
; eax = number to stringify/output
; edi = location of buffer

intToString:
    push  edx
    push  ecx
    push  edi
    push  ebp
    mov   ebp, esp
    mov   ecx, 10

 .pushDigits:
    xor   edx, edx        ; zero-extend eax
    div   ecx             ; divide by 10; now edx = next digit
    add   edx, 30h        ; decimal value + 30h => ascii digit
    push  edx             ; push the whole dword, cause that's how x86 rolls
    test  eax, eax        ; leading zeros suck
    jnz   .pushDigits

 .popDigits:
    pop   eax
    stosb                 ; don't write the whole dword, just the low byte
    cmp   esp, ebp        ; if esp==ebp, we've popped all the digits
    jne   .popDigits

    xor   eax, eax        ; add trailing nul
    stosb

    mov   eax, edi
    pop   ebp
    pop   edi
    pop   ecx
    pop   edx
    sub   eax, edi        ; return number of bytes written
    ret
Audrieaudris answered 23/11, 2012 at 7:57 Comment(4)
This code will not work unless the string happens to be UTF-32. For 32-bit code, "push" stores 32-bit values and not 8-bit values, so the value 123 will end up being something like "1\0\0\02\0\0\03\0\0\0". When displayed as an ASCII or UTF-8 string, those unwanted zeros are string terminators that would mean only 1 digit is displayed. - Lietuva
I am clueless in this subject. Should I write mov eax, number / mov edi, [buffer] before calling the method? - Yap
@IsmetAlkan: Depends on what buffer is. If it's the actual block of memory, then you'd want to say mov edi, buffer so that EDI contains the address buffer rather than the first dword of it. - Audrieaudris
@IsmetAlkan: Likewise, if number is the label for some memory containing a number, then you'd say mov eax, [number] so that EAX contains the contents of that memory. If it's just an equ, on the other hand, what you have is fine. - Audrieaudris
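
Putting those two comments together, a call site might look roughly like this. The buffer label, its size and the _start body are assumptions for illustration; intToString is the routine from the answer above.

```nasm
; Hypothetical caller, assembled together with intToString.
global _start

section .bss
buffer: resb 12              ; up to 10 digits plus the trailing NUL

section .text
_start:
    mov  eax, 4238           ; the number itself (an immediate here)
    mov  edi, buffer         ; the *address* of the buffer, not its contents
    call intToString         ; eax = bytes written, including the NUL

    lea  edx, [eax - 1]      ; length to print, without the NUL
    mov  ecx, buffer         ; start of the digits
    mov  ebx, 1              ; file descriptor 1 = stdout
    mov  eax, 4              ; system call #4 = sys_write
    int  80h                 ; prints "4238"

    mov  eax, 1              ; system call #1 = sys_exit
    xor  ebx, ebx
    int  80h
```

Since intToString restores EDI before returning, the buffer address is loaded again for the write rather than relying on what is left in the register.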
