Printing Hexadecimal Digits with Assembly [duplicate]

I'm trying to learn NASM assembly, but I seem to be struggling with something that is simple in high-level languages.

All of the textbooks which I am using discuss using strings -- in fact, that seems to be one of their favorite things. Printing hello world, changing from uppercase to lowercase, etc.

However, I'm trying to understand how to increment and print hexadecimal digits in NASM assembly and don't know how to proceed. For instance, if I want to print the numbers 1 through n in hex, how would I do so without the use of C libraries (which all references I have been able to find use)?

My main idea would be to have a variable in the .data section which I would continue to increment. But how do I extract the hexadecimal value from this location? I seem to need to convert it to a string first...?

Any advice or sample code would be appreciated.

Breadth answered 4/10, 2010 at 8:41 Comment(0)

First write a simple routine which takes a nybble value (0..15) as input and outputs a hex character ('0'..'9','A'..'F').

Next write a routine which takes a byte value as input and then calls the above routine twice to output 2 hex characters, i.e. one for each nybble.

Finally, for an N byte integer you need a routine which calls this second routine N times, once for each byte.

You might find it helpful to express this in pseudo code or an HLL such as C first, then think about how to translate this into asm, e.g.

void print_nybble(uint8_t n)
{
    if (n < 10) // handle '0' .. '9'
        putchar(n + '0');
    else // handle 'A'..'F'
        putchar(n - 10 + 'A');
}

void print_byte(uint8_t n)
{
    print_nybble(n >> 4); // print hi nybble
    print_nybble(n & 15); // print lo nybble
}

void print_int16(uint16_t n)
{
    print_byte(n >> 8); // print hi byte
    print_byte(n & 255); // print lo byte
}
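
The last step, a routine for an N-byte integer, is only shown for 16 bits above. Here is a minimal sketch for a 32-bit value. It writes into a caller-supplied buffer instead of calling putchar so the result is easy to inspect. The names hex_digit, put_byte and format_hex32 are made up for this illustration.

```c
#include <stdint.h>

/* Map a value 0..15 to '0'..'9' or 'A'..'F'. */
static char hex_digit(uint8_t nybble)
{
    return (char)(nybble < 10 ? nybble + '0' : nybble - 10 + 'A');
}

/* Write the two hex characters for one byte; return the next free slot. */
static char *put_byte(char *out, uint8_t byte)
{
    *out++ = hex_digit((uint8_t)(byte >> 4));  /* hi nybble */
    *out++ = hex_digit((uint8_t)(byte & 15));  /* lo nybble */
    return out;
}

/* Format a 32-bit value as 8 hex characters plus a terminating NUL.
   out must have room for at least 9 chars. */
void format_hex32(uint32_t value, char *out)
{
    /* Most significant byte first, one call per byte. */
    for (int shift = 24; shift >= 0; shift -= 8)
        out = put_byte(out, (uint8_t)(value >> shift));
    *out = '\0';
}
```

For example, format_hex32(0xBEEF, buf) leaves "0000BEEF" in buf. The same loop works for any width if the starting shift is changed.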
Prefer answered 4/10, 2010 at 8:53 Comment(5)
The C code given above produces "(" when given any value of 0x80 or above (e.g. 0x8F gives "(F"). -- Internalcombustion
@AdrianZhang: It works for me. Did you change something? Can you provide a minimal reproducible example that shows the problem? -- Prefer
Never mind, it seems that changing from a uint8_t to a char produces the wrong result. Funny how that happened. -- Internalcombustion
Scratch that, it was not an unsigned char. Silly me. -- Internalcombustion
Yes, it's implementation-defined whether plain char is treated as signed or unsigned, so plain char is best avoided when you need unsigned. -- Prefer
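
The effect described in these comments can be reproduced in isolation. The sketch below assumes a platform where right-shifting a negative value is an arithmetic shift, which is common but implementation-defined. The function names are invented for the illustration.

```c
#include <stdint.h>

/* Same logic as print_nybble, but returning the character. */
static char nybble_char(int n)
{
    return (char)(n < 10 ? n + '0' : n - 10 + 'A');
}

/* Buggy: a negative signed char is sign-extended before the shift,
   so -113 (bit pattern 0x8F) >> 4 typically gives -8, and -8 + '0'
   is 40, which is '(' in ASCII. */
char hi_digit_signed(signed char c)
{
    return nybble_char(c >> 4);
}

/* Fixed: convert to an unsigned 8-bit type first, as in the answer. */
char hi_digit_unsigned(signed char c)
{
    uint8_t u = (uint8_t)c;   /* -113 becomes 143 (0x8F) */
    return nybble_char(u >> 4);
}
```

With the unsigned conversion the high nybble of 0x8F comes out as '8', as expected.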

Is this a homework assignment?

Bits is bits. Bit, byte, word, double word: these are hardware terms, something instruction sets/assemblers are going to reference. Hex, decimal, octal, unsigned, signed, string, character, etc. are manifestations of programming languages. Likewise .text, .bss, .data, etc. are manifestations of software tools; the instruction set doesn't care about one address being .data and one being .text, it is the same instruction either way. There are reasons why all of these programming language things exist, very good reasons sometimes, but don't get confused by them when trying to solve this problem.

To convert from bits to human-readable ASCII, you first need to know your ASCII table and the bitwise operators (AND, OR, logical shift, arithmetic shift, etc.), plus load and store and other things.

Think mathematically about what it takes to get from some number in a register/memory into ASCII hex. Say 0x1234, which is 0b0001001000110100. For a human to read it, yes, you need to get it into a string, for lack of a better term, but you don't necessarily need to store four characters plus a null in adjacent memory locations in order to do something with it. It depends on your output function. Normally character-based output entities boil down to a single output_char() of some sort called many times.

You could convert to a string, but that is more work; instead, for each ASCII character you compute, call some sort of single-character output function right then. putchar() is an example of a single-character output function.

So for binary you want to examine one bit at a time and create a 0x30 or 0x31. For octal, 3 bits at a time and create 0x30 to 0x37. Hex is based on 4 bits at a time.

Hex has the problem that the 16 characters we want to use are not found adjacent to each other in the ascii table. So you use 0x30 to 0x39 for 0 to 9 but 0x41 to 0x46 or 0x61 to 0x66 for A to F depending on your preference or requirements. So for each nybble you might AND with 0xF, compare with 9 and ADD 0x30 or 0x37 (10+0x37 = 0x41, 11+0x37 = 0x42, etc).
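
As a rough C sketch of that AND / compare / ADD sequence (the function name nybble_to_ascii is just a placeholder):

```c
/* Convert the low four bits of value to an uppercase ASCII hex digit. */
char nybble_to_ascii(unsigned value)
{
    unsigned n = value & 0xF;       /* AND with 0xF: isolate one nybble */
    if (n > 9)                      /* compare with 9 */
        return (char)(n + 0x37);    /* 10..15 -> 'A'..'F' (0x41..0x46) */
    return (char)(n + 0x30);        /* 0..9   -> '0'..'9' (0x30..0x39) */
}
```

Adding 0x57 instead of 0x37 in the first branch would give lowercase 'a' to 'f' (10 + 0x57 = 0x61).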

Converting from bits in a register to an ASCII representation of binary: if the bit was a 1, show a 1 (0x31 in ASCII); if the bit was a 0, show a 0 (0x30 in ASCII).

void showbin ( unsigned char x )
{
    unsigned char ra;

    for(ra=0x80;ra;ra>>=1)
    {
        if(ra&x) output_char(0x31); else output_char(0x30);
    }
}

It may seem logical to use unsigned char above, but unsigned int, depending on the target processor, could produce much better (cleaner/faster) code. But that is another topic.

The above could look something like this in assembler (intentionally NOT using x86):

 ...
 mov r4,r0
 mov r5,#0x80
top:
 tst r4,r5
 moveq r0,#0x30
 movne r0,#0x31
 bl output_char
 mov r5,r5, lsr #1
 cmp r5,#0
 bne top
 ...

Unrolled is easier to write and is going to be a bit faster; the tradeoff is more memory used.

 ...
 tst    r4, #0x80
 moveq  r0, #0x30
 movne  r0, #0x31
 bl output_char
 tst    r4, #0x40
 moveq  r0, #0x30
 movne  r0, #0x31
 bl output_char
 tst    r4, #0x20
 moveq  r0, #0x30
 movne  r0, #0x31
 bl output_char
 ...

Say you had 9 bit numbers and wanted to convert to octal. Take three bits at a time (remember humans read left to right so start with the upper bits) and add 0x30 to get 0x30 to 0x37.

...
mov r4,r0
mov r0,r4,lsr #6
and r0,r0,#0x7
add r0,r0,#0x30
bl output_char
mov r0,r4,lsr #3
and r0,r0,#0x7
add r0,r0,#0x30
bl output_char
and r0,r4,#0x7
add r0,r0,#0x30
bl output_char
...
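
For comparison, the same three shift/mask/add steps in C, written as a loop. This is only a sketch; format_octal9 is a made-up name, and it fills a buffer rather than calling output_char.

```c
/* Format the low 9 bits of value as three octal digits plus a NUL.
   out must have room for at least 4 chars. */
void format_octal9(unsigned value, char *out)
{
    /* Upper three bits first, since humans read left to right. */
    for (int shift = 6; shift >= 0; shift -= 3)
        *out++ = (char)(((value >> shift) & 0x7) + 0x30);
    *out = '\0';
}
```

Because the octal digits 0 to 7 are adjacent in the ASCII table, no compare is needed, only the add of 0x30.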

A single (8 bit) byte in hex might look like:

...
mov r4,r0
mov r0,r4,lsr #4
and r0,r0,#0xF
cmp r0,#9
addhi r0,r0,#0x37
addls r0,r0,#0x30
bl output_character
and r0,r4,#0xF
cmp r0,#9
addhi r0,r0,#0x37
addls r0,r0,#0x30
bl output_character
...

Making a loop from 1 to N storing that value in memory and reading it from memory (.data), output in hex:

...
mov r4,#1
str r4,my_variable
...
top:
ldr r4,my_variable
mov r0,r4,lsr #4
and r0,r0,#0xF
cmp r0,#9
addhi r0,r0,#0x37
addls r0,r0,#0x30
bl output_character
and r0,r4,#0xF
cmp r0,#9
addhi r0,r0,#0x37
addls r0,r0,#0x30
bl output_character
...
ldr r4,my_variable
add r4,r4,#1
str r4,my_variable
cmp r4,#7 @ say N is 7
bls top @ keep looping while the counter is still <= N
...
my_variable: .word 0

Saving to RAM is a bit of a waste if you have enough registers. Although with x86 you can operate directly on memory and don't have to go through registers.

x86 isn't the same as the above (ARM) assembler, so it is left as an exercise for the reader to work out the equivalent. The point is, it is the shifting, ANDing, and adding that matter; break it down into elementary steps and the instructions fall out naturally from there.
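
If it helps to see the whole 1 to N loop in one place before translating it, here is a C sketch of the same elementary steps. output_char here is a stand-in that appends to a small buffer (sink and sink_len are hypothetical names); in a real program it would be whatever single-character output your platform offers.

```c
#include <stddef.h>

static char sink[64];       /* hypothetical capture buffer */
static size_t sink_len;     /* number of characters captured so far */

/* Stand-in for the platform's single-character output routine.
   Output that does not fit in the buffer is dropped. */
static void output_char(char c)
{
    if (sink_len < sizeof sink - 1)
        sink[sink_len++] = c;
    sink[sink_len] = '\0';
}

/* Mask, compare and add: one hex digit from the low 4 bits of n. */
static void output_nybble(unsigned n)
{
    n &= 0xF;
    output_char((char)(n > 9 ? n + 0x37 : n + 0x30));
}

/* Output 1..n as two-digit hex values, one per line (n up to 255). */
void count_in_hex(unsigned n)
{
    for (unsigned i = 1; i <= n; i++) {
        output_nybble(i >> 4);   /* high nybble first */
        output_nybble(i);        /* then the low nybble */
        output_char('\n');
    }
}
```

Each line of count_in_hex maps onto one or two instructions: a shift, an AND, a compare, an add, and a call.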

Embrangle answered 5/10, 2010 at 6:54 Comment(0)

Quick and dirty GAS macro

.altmacro

/*
Convert a byte to hex ASCII value.
c: r/m8 byte to be converted
Output: two ASCII characters, stored in `al` (high nibble) and `bl` (low nibble)
*/
.macro HEX c
    mov \c, %al
    mov \c, %bl
    shr $4, %al
    HEX_NIBBLE al
    and $0x0F, %bl
    HEX_NIBBLE bl
.endm

/*
Convert the low nibble of an r8 register to its 8-bit ASCII hex digit, in place.
The high nibble of the register must already be zero.
reg: r8 to be converted
Output: stored in reg itself.
*/
.macro HEX_NIBBLE reg
    LOCAL letter, end
    cmp $10, %\reg
    jae letter
    /* 0x30 == '0' */
    add $0x30, %\reg
    jmp end
letter:
    /* 0x37 == 'A' - 10 */
    add $0x37, %\reg
end:
.endm

Usage:

mov $0x1A, %al
HEX <%al>

<> are used because of .altmacro: Gas altmacro macro with a percent sign in a default parameter fails with "% operator needs absolute expression"

Outcome:

  • %al contains 0x31, which is '1' in ASCII
  • %bl contains 0x41, which is 'A' in ASCII

Now you can do whatever you want with %al and %bl, e.g.:

  • loop over multiple bytes and copy them to memory (make sure to allocate twice as much memory as there are bytes)
  • print them with system or BIOS calls
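
To illustrate the first option, here is a C sketch of looping over several bytes and storing two characters per byte. The names hex_nibble and bytes_to_hex are made up, and uppercase digits are assumed, matching the outcome listed above.

```c
#include <stddef.h>
#include <stdint.h>

/* Same decision as HEX_NIBBLE: values below 10 get '0' added,
   the rest get 'A' - 10 (0x37) added. */
static char hex_nibble(uint8_t n)
{
    return (char)(n >= 10 ? n + 0x37 : n + 0x30);
}

/* Write 2 * len hex characters plus a NUL into out.
   out must have room for 2 * len + 1 chars: twice the input size,
   plus one for the terminator. */
void bytes_to_hex(const uint8_t *bytes, size_t len, char *out)
{
    for (size_t i = 0; i < len; i++) {
        *out++ = hex_nibble((uint8_t)(bytes[i] >> 4));    /* like %al */
        *out++ = hex_nibble((uint8_t)(bytes[i] & 0x0F));  /* like %bl */
    }
    *out = '\0';
}
```

The terminating NUL is only a convenience for C strings; in assembly you would more likely keep the length separately.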
Balkhash answered 24/9, 2015 at 8:2 Comment(0)

Intel Syntax. This is from my bootloader but you should be able to get the idea.

print_value_of_CX:

    print_value_of_C_high:

        print_value_of_C_high_high_part:
            MOV AH, CH
            SHR AH, 0x4
            CALL byte_hex_printer

        print_value_of_C_high_low_part:
            MOV AH, CH
            SHL AH, 0x4
            SHR AH, 0x4
            CALL byte_hex_printer

    print_value_of_C_low:

        print_value_of_C_low_high_part:
            MOV AH, CL
            SHR AH, 0x4
            CALL byte_hex_printer

        print_value_of_C_low_low_part:
            MOV AH, CL
            SHL AH, 0x4
            SHR AH, 0x4
            CALL byte_hex_printer
            RET ; return here so execution does not fall through into byte_hex_printer

byte_hex_printer:
    CMP AH, 0x00
    JE move_char_for_zero_into_AL_to_print
    CMP AH, 0x01
    JE move_char_for_one_into_AL_to_print
    CMP AH, 0x02
    JE move_char_for_two_into_AL_to_print
    CMP AH, 0x03
    JE move_char_for_three_into_AL_to_print
    CMP AH, 0x04
    JE move_char_for_four_into_AL_to_print
    CMP AH, 0x05
    JE move_char_for_five_into_AL_to_print
    CMP AH, 0x06
    JE move_char_for_six_into_AL_to_print
    CMP AH, 0x07
    JE move_char_for_seven_into_AL_to_print
    CMP AH, 0x08
    JE move_char_for_eight_into_AL_to_print
    CMP AH, 0x09
    JE move_char_for_nine_into_AL_to_print
    CMP AH, 0x0A
    JE move_char_for_A_into_AL_to_print
    CMP AH, 0x0B
    JE move_char_for_B_into_AL_to_print
    CMP AH, 0x0C
    JE move_char_for_C_into_AL_to_print
    CMP AH, 0x0D
    JE move_char_for_D_into_AL_to_print
    CMP AH, 0x0E
    JE move_char_for_E_into_AL_to_print
    CMP AH, 0x0F
    JE move_char_for_F_into_AL_to_print

        move_char_for_zero_into_AL_to_print:
        MOV AL, 0x30
        CALL print_teletype_stringB
        RET
        move_char_for_one_into_AL_to_print:
        MOV AL, 0x31
        CALL print_teletype_stringB
        RET
        move_char_for_two_into_AL_to_print:
        MOV AL, 0x32
        CALL print_teletype_stringB
        RET
        move_char_for_three_into_AL_to_print:
        MOV AL, 0x33
        CALL print_teletype_stringB
        RET
        move_char_for_four_into_AL_to_print:
        MOV AL, 0x34
        CALL print_teletype_stringB
        RET
        move_char_for_five_into_AL_to_print:
        MOV AL, 0x35
        CALL print_teletype_stringB
        RET
        move_char_for_six_into_AL_to_print:
        MOV AL, 0x36
        CALL print_teletype_stringB
        RET
        move_char_for_seven_into_AL_to_print:
        MOV AL, 0x37
        CALL print_teletype_stringB
        RET
        move_char_for_eight_into_AL_to_print:
        MOV AL, 0x38
        CALL print_teletype_stringB
        RET
        move_char_for_nine_into_AL_to_print:
        MOV AL, 0x39
        CALL print_teletype_stringB
        RET
        move_char_for_A_into_AL_to_print:
        MOV AL, 0x41
        CALL print_teletype_stringB
        RET
        move_char_for_B_into_AL_to_print:
        MOV AL, 0x42
        CALL print_teletype_stringB
        RET
        move_char_for_C_into_AL_to_print:
        MOV AL, 0x43
        CALL print_teletype_stringB
        RET
        move_char_for_D_into_AL_to_print:
        MOV AL, 0x44
        CALL print_teletype_stringB
        RET
        move_char_for_E_into_AL_to_print:
        MOV AL, 0x45
        CALL print_teletype_stringB
        RET
        move_char_for_F_into_AL_to_print:
        MOV AL, 0x46
        CALL print_teletype_stringB
        RET
Coprophilia answered 17/4, 2018 at 20:1 Comment(1)
Use a lookup table like a normal person for mapping a contiguous input range to various outputs, not a chain of branches! -- Frei
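
The lookup-table approach suggested in the comment could look roughly like this in C (hex_table and nibble_to_char are illustrative names). In assembly the same idea is a 16-byte table indexed by the nibble value.

```c
/* Index i holds the ASCII character for the nibble value i. */
static const char hex_table[16] = {
    '0', '1', '2', '3', '4', '5', '6', '7',
    '8', '9', 'A', 'B', 'C', 'D', 'E', 'F'
};

/* Map the low four bits of value to a hex character, with no branches. */
char nibble_to_char(unsigned value)
{
    return hex_table[value & 0xF];
}
```

This replaces the sixteen compare-and-jump pairs with a single indexed load.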
