How to write hello world in assembly under Windows?
C

9

127

I wanted to write something basic in assembly under Windows. I'm using NASM, but I can't get anything working.

How do I write and compile a hello world program without the help of C functions on Windows?

Canvasback answered 21/6, 2009 at 10:14 Comment(5)
Also check out Steve Gibson's Small Is Beautiful windows assembly starter kit.Goeger
Not using c-libraries is a somewhat strange constraint. One has to call some library within the MS-Windows operating system, probably kernel32.dll. Whether Microsoft has written this in c or Pascal seems irrelevant. Is it meant that only OS-supplied functions can be called, what in a Unix-type system would be called system calls?Demonolater
With C libraries I assume he or she means without using an C runtime libraries like the ones that come with GCC or MSVC. Of course he or she will have to use some standard Windows DLLs, like kernel32.dll.Blinking
The distinction between kernel32.dll and a gcc runtime library is not in the format (both are dll) and not in the language (both are probably c, but that is hidden.) The difference is between OS-supplied or not.Demonolater
Ive been looking for this also lol couldn't find anything with fasm without includesLamprey
U
43

NASM examples.

Calling libc stdio printf, implementing int main(){ return printf(message); }

; ----------------------------------------------------------------------------
; helloworld.asm
;
; This is a Win32 console program that writes "Hello, World" on one line and
; then exits.  It needs to be linked with a C library.
; ----------------------------------------------------------------------------

    global  _main
    extern  _printf

    section .text
_main:
    push    message
    call    _printf
    add     esp, 4
    ret
message:
    db  'Hello, World', 10, 0

Then run

nasm -fwin32 helloworld.asm
gcc helloworld.obj
a

There's also The Clueless Newbies Guide to Hello World in Nasm without the use of a C library. Then the code would look like this.

16-bit code with MS-DOS system calls: works in DOS emulators or in 32-bit Windows with NTVDM support. Can't be run "directly" (transparently) under any 64-bit Windows, because an x86-64 kernel can't use vm86 mode.

org 100h
mov dx,msg
mov ah,9
int 21h
mov ah,4Ch
int 21h
msg db 'Hello, World!',0Dh,0Ah,'$'

Build this into a .com executable so it will be loaded at cs:100h with all segment registers equal to each other (tiny memory model).
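One of the comments on this answer mentions the ah=40h function as an alternative to ah=9. A minimal sketch of that variant follows; it takes an explicit byte count, so no '$' terminator is needed. The file name hello40.asm and the label msg_len are made up for illustration, and the build line in the comment assumes NASM's flat binary output format.

```nasm
; hello40.asm (hypothetical file name)
; Build as a .com file:  nasm -f bin hello40.asm -o hello40.com
org 100h                    ; .com programs are loaded right after the 256-byte PSP

    mov ah, 40h             ; DOS function 40h: write to file or device
    mov bx, 1               ; handle 1 = standard output
    mov cx, msg_len         ; number of bytes to write
    mov dx, msg             ; DS:DX -> buffer (DS = CS in a .com program)
    int 21h

    mov ax, 4C00h           ; DOS function 4Ch: terminate, exit code 0
    int 21h

msg:     db 'Hello, World!', 0Dh, 0Ah
msg_len  equ $ - msg        ; length computed by the assembler
```

Like the ah=9 version, this is still a DOS program, so it needs NTVDM or a DOS emulator to run.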

Good luck.

Unprecedented answered 21/6, 2009 at 10:17 Comment(12)
The question explicitly mentions "without using C libraries"Entablement
There's no reliable way to do this without calling a C function at some point. Except if by "C function" you mean "standard C function".Canoewood
Wrong. The C library itself obviously can, so it's possible. It's only slightly harder, in fact. You just need to call WriteConsole() with the right 5 parameters.Glasgow
Although the second example doesn't call any C library function it's not a Windows program either. Virtual DOS Machine will be fired to run it.Maurili
Yeah, the second example is one of the classics. Technically interrupt 0x21 is part of the DOS "API" - the interrupt pointed to a chunk of code installed into memory by DOS at startup, unlike the BIOS mapped interrupts which work even before an OS is loaded. Critical however for the ah=0x09 sub-function is that the string is terminated with $ (otherwise it just starts writing junk from memory). Alternatively you can also use the ah=0x40 function, for which you specify the number of characters (this function is also used to write to files, as the "screen" is just another pipe)Ard
Hey, out of interest, I was wondering why we need to offset the beginning of the program? Does this have to do with how the segments are set up? Like, does NASM place the data segment before the code segment in memory? (i.e. code breaks if org 0x100 set to org 0x00)Annates
@Alex Hart, his second example is for DOS, not for Windows. In DOS, the programs in tiny mode (.COM files, under 64Kb total code+data+stack) start at 0x100h because first 256 bytes in the segment are taken by the PSP (command-line args etc.). See this link: en.wikipedia.org/wiki/Program_Segment_PrefixMultidisciplinary
This is not what was asked for. The first example uses the C library and the second one is MS-DOS, not Windows.Picturize
Does anybody know why to make it work I had to remove the underscores before _main and _printf ? I compiled it with: nasm -f win32 helloworld.asm -o hello.obj and linked it with golink /console /entry main hello.obj MSVCRT.dll. If I did not remove underscores I'd get the following error: Error! The following symbol was not defined in the object file or files:- _printf You may be trying to link object files or lib code with decorated symbols - If so, you could try using the /mix switch Output file not madeBehalf
@OP "How to write and compile hello world without the help of C functions on Windows?"Peloquin
This uses the standard c library which assumes the context of a c-compiler, not assembler. If the poster of this question accepts this as the best solution, he absolutely must rephrase the question, in particular "without the help of c-functions".Demonolater
The gcc step fails for me with: undefined reference to `_printf'. gcc --version returns realgcc.exe (Rev1, Built by MSYS2 project) 7.2.0Glorify
T
161

This example shows how to go directly to the Windows API and not link in the C Standard Library.

    global _main
    extern  _GetStdHandle@4
    extern  _WriteFile@20
    extern  _ExitProcess@4

    section .text
_main:
    ; DWORD  bytes;    
    mov     ebp, esp
    sub     esp, 4

    ; hStdOut = GetStdHandle(STD_OUTPUT_HANDLE)
    push    -11
    call    _GetStdHandle@4
    mov     ebx, eax

    ; WriteFile(hStdOut, message, length(message), &bytes, 0);
    push    0
    lea     eax, [ebp-4]
    push    eax
    push    (message_end - message)
    push    message
    push    ebx
    call    _WriteFile@20

    ; ExitProcess(0)
    push    0
    call    _ExitProcess@4

    ; never here
    hlt
message:
    db      'Hello, World', 10
message_end:

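To build it, you'll need NASM and LINK.EXE (from Visual Studio Standard Edition). kernel32.lib must be passed to the linker so that the three imported functions can be resolved:

   nasm -fwin32 hello.asm
   link /subsystem:console /nodefaultlib /entry:main hello.obj kernel32.lib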
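
For 64-bit Windows the same idea needs the x64 calling convention instead of stdcall. The following is a sketch, not part of the original answer: the file name hello64.asm and the label message_len are made up, and it assumes the 64-bit kernel32.lib is on the linker's library path. The first four arguments go in RCX, RDX, R8 and R9, the fifth goes on the stack above the 32 bytes of shadow space, and the imported names carry no underscore or @N decoration.

```nasm
; hello64.asm (hypothetical file name)
; Build:  nasm -f win64 hello64.asm
;         link /subsystem:console /nodefaultlib /entry:main hello64.obj kernel32.lib
default rel                     ; RIP-relative addressing for [message]
global  main
extern  GetStdHandle
extern  WriteFile
extern  ExitProcess

section .data
message:     db 'Hello, World', 13, 10
message_len  equ $ - message

section .text
main:
    ; 38h = 20h shadow space + 8 for the 5th argument + 8 for a local DWORD
    ; + 8 padding; it also makes RSP 16-byte aligned (RSP = 16n+8 on entry).
    sub     rsp, 38h

    ; hStdOut = GetStdHandle(STD_OUTPUT_HANDLE)
    mov     ecx, -11
    call    GetStdHandle

    ; WriteFile(hStdOut, message, message_len, &bytes, NULL)
    mov     rcx, rax            ; hFile
    lea     rdx, [message]      ; lpBuffer
    mov     r8d, message_len    ; nNumberOfBytesToWrite
    lea     r9, [rsp+30h]       ; lpNumberOfBytesWritten -> local on the stack
    mov     qword [rsp+20h], 0  ; lpOverlapped = NULL (5th argument)
    call    WriteFile

    ; ExitProcess(0)
    xor     ecx, ecx
    call    ExitProcess

    hlt                         ; never reached
```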
Tenet answered 22/6, 2009 at 19:51 Comment(5)
you likely need to include the kernel32.lib to link this (I did). link /subsystem:console /nodefaultlib /entry:main hello.obj kernel32.libDiscourse
How to do link the obj with ld.exe from MinGW?Fumarole
@Fumarole gcc hello.objSummers
Would this also work using free linkers like Alink from sourceforge.net/projects/alink or GoLink from godevtool.com/#linker ? I don't want to install visual studio only for that?Behalf
Using my version of link : Microsoft (R) Incremental Linker Version 14.29.30133.0, I get unresolved external symbol _GetStdHandle@4 and so for all the extern items?Glorify
B
33

These are Win32 and Win64 examples using Windows API calls. They are for MASM rather than NASM, but have a look at them. You can find more details in this article.

This uses MessageBox instead of printing to stdout.

Win32 MASM

;---ASM Hello World Win32 MessageBox

.386
.model flat, stdcall
include kernel32.inc
includelib kernel32.lib
include user32.inc
includelib user32.lib

.data
caption db 'Win32', 0 ; not named "title": TITLE is a MASM directive
msg db 'Hello World', 0

.code

Main:
push 0              ; uType = MB_OK
push offset caption ; LPCSTR lpCaption
push offset msg     ; LPCSTR lpText
push 0            ; hWnd = HWND_DESKTOP
call MessageBoxA
push eax          ; uExitCode = MessageBox(...)
call ExitProcess

End Main

Win64 MASM

;---ASM Hello World Win64 MessageBox

extrn MessageBoxA: PROC
extrn ExitProcess: PROC

.data
caption db 'Win64', 0 ; not named "title": TITLE is a MASM directive
msg db 'Hello World!', 0

.code
main proc
  sub rsp, 28h     ; shadow space + stack alignment
  mov rcx, 0       ; hWnd = HWND_DESKTOP
  lea rdx, msg     ; LPCSTR lpText
  lea r8,  caption ; LPCSTR lpCaption
  mov r9d, 0       ; uType = MB_OK
  call MessageBoxA
  mov ecx, eax     ; uExitCode = MessageBox(...)
  call ExitProcess ; never returns, so RSP stays reserved and aligned for this call
main endp

End

To assemble and link these using MASM, use this for 32-bit executable:

ml.exe [filename] /link /subsystem:windows /defaultlib:kernel32.lib /defaultlib:user32.lib /entry:Main

or this for 64-bit executable:

ml64.exe [filename] /link /subsystem:windows /defaultlib:kernel32.lib /defaultlib:user32.lib /entry:main

Why does x64 Windows need to reserve 28h bytes of stack space before a call? That's 32 bytes (0x20) of shadow space aka home space, as required by the calling convention. And another 8 bytes to re-align the stack by 16, because the calling convention requires RSP be 16-byte aligned before a call. (Our main's caller (in the CRT startup code) did that. The 8-byte return address means that RSP is 8 bytes away from a 16-byte boundary on entry to a function.)

Shadow space can be used by a function to dump its register args next to where any stack args (if any) would be. A system call requires 30h (48 bytes) to also reserve space for r10 and r11 in addition to the previously mentioned 4 registers. But DLL calls are just function calls, even if they're wrappers around syscall instructions.

Fun fact: non-Windows, i.e. the x86-64 System V calling convention (e.g. on Linux) doesn't use shadow space at all, and uses up to 6 integer/pointer register args, and up to 8 FP args in XMM registers.
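
The 28h arithmetic can be spelled out in a function that returns to its caller rather than exiting. This is a NASM-syntax sketch rather than MASM, and the function name show_box and the file name frame.asm are made up for illustration. It only assembles to an object file and is meant to be called from other code.

```nasm
; frame.asm (hypothetical file name); assemble with: nasm -f win64 frame.asm
default rel
global  show_box                ; hypothetical helper: int show_box(void)
extern  MessageBoxA

section .data
caption: db 'Win64', 0
text:    db 'Hello World!', 0

section .text
show_box:
    ; On entry RSP = 16n + 8: the caller had RSP 16-byte aligned and the
    ; call instruction then pushed an 8-byte return address.
    sub     rsp, 28h            ; 20h shadow space + 8 padding
                                ; RSP = 16n + 8 - 40 = 16(n - 2): aligned again
    xor     ecx, ecx            ; hWnd = NULL
    lea     rdx, [text]         ; lpText
    lea     r8,  [caption]      ; lpCaption
    xor     r9d, r9d            ; uType = MB_OK
    call    MessageBoxA         ; the callee may overwrite [rsp+8 .. rsp+27h]
                                ; as seen on its entry (our shadow space)
    add     rsp, 28h            ; undo the reservation before returning
    ret                         ; EAX still holds MessageBoxA's return value
```

A function that passes more than four arguments would reserve more than 28h, in steps that keep RSP a multiple of 16 at each call.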


Using MASM's invoke directive (which knows the calling convention), you can use one ifdef to make a version of this which can be built as 32-bit or 64-bit.

ifdef rax
    extrn MessageBoxA: PROC
    extrn ExitProcess: PROC
else
    .386
    .model flat, stdcall
    include kernel32.inc
    includelib kernel32.lib
    include user32.inc
    includelib user32.lib
endif
.data
caption db 'WinAPI', 0
text    db 'Hello World', 0
.code
main proc
    invoke MessageBoxA, 0, offset text, offset caption, 0
    invoke ExitProcess, eax
main endp
end

The macro variant is the same for both, but you won't learn assembly this way. You'll learn C-style asm instead. invoke is for stdcall or fastcall while cinvoke is for cdecl or variable argument fastcall. The assembler knows which to use.

You can disassemble the output to see how invoke expanded.

Bibulous answered 23/6, 2009 at 12:58 Comment(9)
+1 for your answer. Can you please add assembly code for Windows on ARM (WOA) too?Hedwighedwiga
Why does rsp require 0x28 bytes and not 0x20? All the references on the calling convention say that it should be 32 but it seems to require 40 in practice.Hasen
In your 32-bit message box code, for some reason when I use title as label name, I run into errors. However when I use something else as label name like mytitle, everything works fine.Abstract
how to do it withoiut includes?Lamprey
@Hasen It's a little confusing, but that's because the a) the stack alignment needs to be maintained at 16, and b) the return address is pushed by call. So adding 0x20 is for the shadow, +8 for return address, +8 to maintain alignment.Antihistamine
@Bibulous just curious is there any learning resource for MASM, that used ml.exe / ml64.exe? Because the precious little learning material online says that we've got to use masm sdk which for some reason doesn't really agree with my machine :( .Spoilfive
The MASM64 example gives a syntax error, it seems title is a directive: learn.microsoft.com/en-us/cpp/assembler/masm/… Using another name works fineHomonym
After ExitProcess an exception gets thrown saying "Access violation reading location 0xFFFFFFFFFFFFFFFFF on the x64 versionTit
I use 64bit MASM assembler ml64.exe(VS2022) to assemble the pasted code (the "invoke" one), it gives error: error A2008:syntax error : MessageBoxA and error A2008:syntax error : ExitProcess.Insomniac
T
16

Flat Assembler does not need an extra linker. This makes assembler programming quite easy. It is also available for Linux.

This is hello.asm from the Fasm examples:

include 'win32ax.inc'

.code

  start:
    invoke  MessageBox,HWND_DESKTOP,"Hi! I'm the example program!",invoke GetCommandLine,MB_OK
    invoke  ExitProcess,0

.end start

Fasm creates an executable:

>fasm hello.asm
flat assembler  version 1.70.03  (1048575 kilobytes memory)
4 passes, 1536 bytes.

And this is the program in IDA:

[Image: IDA disassembly of the program]

You can see the three calls: GetCommandLine, MessageBox and ExitProcess.

Talich answered 17/11, 2013 at 15:54 Comment(3)
this uses an include and GUI how do we do it just to CMD with no includes at all?Lamprey
Tried reading the manual? flatassembler.net/docs.php?article=manual#2.4.2Talich
can you point me to a section that writes to console without any dlls?Lamprey
M
16

To get an .exe with NASM as the assembler and Visual Studio's linker this code works fine:

default rel         ; Use RIP-relative addressing like [rel msg] by default
global WinMain
extern ExitProcess  ; external functions in system libraries 
extern MessageBoxA

section .data 
title:  db 'Win64', 0
msg:    db 'Hello world!', 0

section .text
WinMain:
    sub rsp, 28h      ; reserve shadow space and make RSP%16 == 0
    mov rcx, 0       ; hWnd = HWND_DESKTOP
    lea rdx,[msg]    ; LPCSTR lpText
    lea r8,[title]   ; LPCSTR lpCaption
    mov r9d, 0       ; uType = MB_OK
    call MessageBoxA

    mov  ecx,eax        ; exit status = return value of MessageBoxA
    call ExitProcess

    add rsp, 28h       ; if you were going to ret, restore RSP

    hlt     ; privileged instruction that crashes if ever reached.

If this code is saved as test64.asm, then to assemble:

nasm -f win64 test64.asm

This produces test64.obj. Then, to link from the command prompt:

path_to_link\link.exe test64.obj /subsystem:windows /entry:WinMain  /libpath:path_to_libs /nodefaultlib kernel32.lib user32.lib /largeaddressaware:no

where path_to_link could be C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\bin or wherever link.exe is on your machine, and path_to_libs could be C:\Program Files (x86)\Windows Kits\8.1\Lib\winv6.3\um\x64 or wherever your libraries are (here kernel32.lib and user32.lib are in the same place; otherwise pass one /libpath option for each directory you need). The /largeaddressaware:no option is needed to stop the linker complaining about addresses being too long (for user32.lib in this case). Also, when Visual Studio's linker is invoked from the command prompt as it is here, the environment has to be set up first (run vcvarsall.bat once and/or see MS C++ 2010 and mspdb100.dll).

(Using default rel makes the lea instructions work from anywhere, including outside the low 2GiB of virtual address space. But the call MessageBoxA is still a direct call rel32 that can only reach instructions +-2GiB away from itself.)
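
One way to avoid the rel32 range limit on the calls themselves is to call through the import address table pointers that the import libraries expose as __imp_ symbols. This is a sketch under that assumption, not the answer's original code: the label caption replaces title only for illustration, and the program should link with the same command as above.

```nasm
default rel                     ; RIP-relative addressing by default
global WinMain
extern __imp_MessageBoxA        ; import-table slots holding the 64-bit addresses
extern __imp_ExitProcess

section .data
caption: db 'Win64', 0
msg:     db 'Hello world!', 0

section .text
WinMain:
    sub  rsp, 28h               ; shadow space + stack alignment
    xor  ecx, ecx               ; hWnd = HWND_DESKTOP
    lea  rdx, [msg]             ; LPCSTR lpText
    lea  r8,  [caption]         ; LPCSTR lpCaption
    xor  r9d, r9d               ; uType = MB_OK
    call [__imp_MessageBoxA]    ; indirect call: loads the target from the import table

    mov  ecx, eax               ; exit status = return value of MessageBoxA
    call [__imp_ExitProcess]

    hlt                         ; never reached
```

The memory operand is still reached RIP-relatively, but the call target it loads is a full 64-bit pointer, so the DLL can be mapped anywhere in the address space.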

Meri answered 16/1, 2015 at 16:53 Comment(2)
I highly recommend using default rel at the top of your file so those addressing modes ([msg] and [title]) use RIP-relative addressing instead of 32-bit absolute.Ahriman
Thank you for explaining how to link! You saved my mental health. I was starting to pull my hair out over 'error LNK2001: unresolved external symbol ExitProcess' and similar errors...Summand
L
6

Unless you call some function this is not at all trivial. (And, seriously, there's no real difference in complexity between calling printf and calling a win32 api function.)

Even DOS int 21h is really just a function call, even if it's a different API.

If you want to do it without help you need to talk to your video hardware directly, likely writing bitmaps of the letters of "Hello world" into a framebuffer. Even then the video card is doing the work of translating those memory values into DisplayPort/HDMI/DVI/VGA signals.

Note that, really, none of this stuff all the way down to the hardware is any more interesting in ASM than in C. A "hello world" program boils down to a function call. One nice thing about ASM is that you can use any ABI you want fairly easily; you just need to know what that ABI is.

Lorileelorilyn answered 22/6, 2009 at 20:34 Comment(3)
This is an excellent point --- ASM and C both rely on an OS provided function (_WriteFile in Windows). So where is the magic? It is in the device driver code for the video card.Eau
This is thoroughly besides the point. The poster asks an assembler program that runs "under Windows". That means that Windows facilities can be used (e.g. kernel32.dll), but not other facilities like libc under Cygwin. For crying out loud, the poster explicitly says no c-libraries.Demonolater
I don't see how kernel32.dll is not a C (or at least C++) library. There are reasonable interpretations of what this questioner (or others asking similar questions) really meant to ask. "... e.g. kernel32.dll" is a fairly good one. ("e.g. int 21h" was the one I implicitly took, which is obviously dated now, but in 2009 64 bit Windows was the exception.) Other answers here cover those effectively; the point of this answer is to point out that this isn't quite the right question.Lorileelorilyn
T
6

If you want to use NASM and Visual Studio's linker (link.exe) with anderstornvig's Hello World example, you will have to manually link with the C Runtime Library that contains the printf() function.

nasm -fwin32 helloworld.asm
link.exe helloworld.obj libcmt.lib

Hope this helps someone.

Toddle answered 26/2, 2011 at 21:54 Comment(1)
The poster of the questions wants to know, how someone would write printf based on the facilities Windows provides, so this is again totally besides the point.Demonolater
D
6

The best examples are those with fasm, because fasm doesn't use a linker, which hides the complexity of windows programming by another opaque layer of complexity. If you're content with a program that writes into a gui window, then there is an example for that in fasm's example directory.

If you want a console program that allows redirection of standard in and standard out, that is also possible. There is an (alas highly non-trivial) example program available that doesn't use a gui and works strictly with the console: fasm itself. It can be thinned out to the essentials. (I've written a forth compiler which is another non-gui example, but it is also non-trivial.)

Such a program has the following command to generate a proper header for 32-bit executable, normally done by a linker.

FORMAT PE CONSOLE 

A section called '.idata' contains a table that helps windows during startup to couple names of functions to their runtime addresses. It also contains a reference to KERNEL32.DLL, which is the Windows Operating System.

 section '.idata' import data readable writeable
    dd 0,0,0,rva kernel_name,rva kernel_table
    dd 0,0,0,0,0

  kernel_table:
    _ExitProcess@4    DD rva _ExitProcess
    CreateFile        DD rva _CreateFileA
        ...
        ...
    _GetStdHandle@4   DD rva _GetStdHandle
                      DD 0

The table format is imposed by windows and contains names that are looked up in system files when the program is started. FASM hides some of the complexity behind the rva keyword. So _ExitProcess@4 is a fasm label and _ExitProcess is a string that is looked up by Windows.

Your program is in section '.text'. If you declare that section readable writeable and executable, it is the only section you need to add.

    section '.text' code executable readable writable

You can call all the facilities you declared in the .idata section. For a console program you need _GetStdHandle to find the file descriptors for standard in and standard out (using symbolic names like STD_INPUT_HANDLE, which fasm finds in the include file win32a.inc). Once you have the file descriptors you can do WriteFile and ReadFile. All functions are described in the kernel32 documentation. You are probably aware of that or you wouldn't try assembler programming.

In summary: there is a table with ASCII names that couple to the windows OS. During startup this is transformed into a table of callable addresses, which you use in your program.

Demonolater answered 18/1, 2017 at 14:47 Comment(5)
FASM may not use a linker but it still has to assemble a PE file. Which means that it actually doesn't just assemble code but also takes upon itself a job normally a linker would perform, and as such it's, in my humble opinion, misleading to call absence of a linker "hiding complexity", quite on the contrary -- the job of an assembler is to assemble a program, but leave it to the linker to embed the program into a program image which may depend on a lot of things. As such, I find separation between a linker and an assembler a good thing, which it appears, you disagree on.Schlueter
@amn Think about it this way. If you use a linker to create above program, does it gives you more insight in what the program does, or what it consists of? If I look at the fasm source I know the complete structure of the program.Demonolater
Fair point. On the flip side, separating linking from everything else has its benefits too. You normally have access to an object file (which goes a long way towards letting one inspect the structure of a program too, independent of the program image file format), you can invoke a different linker of your preference, with different options. It's about reusability and composability. With that in mind, FASM doing everything because it's "convenient" breaks those principles. I am not principally against it -- I see their justification for it -- but I, for one, don't need it.Schlueter
I get an error for an illegal instruction on the top line in fasm on 64 bit windowsLamprey
@bluejayke Probably you didn't have the documentation for fasm at hand. FORMAT PE generates a 32 bits executable, which a 64 bit windows refuses to run. For a 64 bit program you want FORMAT PE64 . Also make sure you use proper 64 bit instructions in your program.Demonolater
D
1

For ARM Windows:

AREA    data, DATA

Text    DCB "Hello world(text)", 0x0
Caption DCB "Hello world(caption)", 0x0

    EXPORT  WinMainCRTStartup
    IMPORT  __imp_MessageBoxA
    IMPORT  __imp_ExitProcess

    AREA    text, CODE
WinMainCRTStartup   PROC
            movs        r3,#0
            ldr         r2,Caption_ptr
            ldr         r1,Text_ptr
            movs        r0,#0
            ldr         r4,MessageBoxA_ptr    @ nearby, reachable with PC-relative
            ldr         r4,[r4]
            blx         r4

            movs        r0,#0
            ldr         r4,ExitProcess_ptr
            ldr         r4,[r4]
            blx         r4

MessageBoxA_ptr DCD __imp_MessageBoxA       @ literal pool (constants near code)
ExitProcess_ptr DCD __imp_ExitProcess
Text_ptr    DCD Text
Caption_ptr DCD Caption

    ENDP
    END
Depredation answered 12/3, 2022 at 12:48 Comment(1)
This question is tagged [x86] [nasm], so this ARM answer isn't fully on-topic here. IDK how many future readers will find it, especially if you don't even mention ARM Windows in text outside your code (I edited to fix the code formatting and fix that). A self-answered Q&A might be a better place for it, but it's probably fine to leave this answer here even though the question is primarily about [x86].Ahriman
