Assembly: dynamic memory allocation without malloc and syscalls? [FreeDOS application]

My question is about the logic of dynamic memory allocation in assembly (specifically MASM). There are a lot of articles on this topic, and all of them rely on malloc or brk. However, as I understand it, malloc, as part of the C language, must be (or at least could be) written in assembly. The same goes for brk: it is part of the operating system, which is also written in C and could therefore be replaced one-to-one by assembly. A very long time ago I saw an article in PCMag about dynamic memory allocation in MS-DOS using pure asm. Unfortunately, I have lost all trace of that wonderful piece of writing. Now I'm working with FreeDOS (more precisely, a bootable FreeDOS flash card) and wondering how to proceed if someone decides to write their own memory allocator. What is the starting point, and what is the logic of memory allocation without relying on OS mechanisms?

Picnic answered 28/6, 2019 at 1:19 Comment(9)
Your "pure asm" examples likely relied on an interrupt or something. Either that, or they allocated their own large static blocks of memory and used those as a heap from which to allocate their own "dynamic" blocks. Assembly has a very static view of memory, so your custom malloc implementation would either rely on some sort of system call/interrupt or on a large static block of memory allocated as part of the object file (BSS/data segments). - Bellwort
@Simon Whitehead, since interrupts are all provided by BIOS, then it's possible to use pure asm without interacting with OS (at least with MS DOS). Question is: how to do it? - Picnic
Sure, in MS-DOS the BIOS was involved a lot. Modern operating systems register their own interrupt handlers and will service them at user level. Modern operating systems like to have complete control over the memory space for security and stability reasons. So it is my understanding that there is no "pure asm" way of allocating memory dynamically in modern systems, unless you allocate a large enough block of static memory in the object file itself and write a custom allocator to allocate blocks from it. I am happy to be proven wrong here ... - Bellwort
Isn't FreeDOS an MS-DOS clone? Doesn't that mean that it doesn't implement virtual memory spaces for programs (i.e. it is a single address space operating system)? If that's true, there is no need for brk/mmap/malloc, since you already have access to all memory anyway. - Midis
What you're missing here is that if you don't use FreeDOS's allocator, you won't know what memory FreeDOS (and other things) have already allocated, and FreeDOS won't know what memory you've allocated. What you can do is allocate a big chunk of memory using FreeDOS and then suballocate it with your own allocator. Note that you can't just write brk() yourself, since on Unix-type systems it's a system call that maps memory into the process, something that needs to be done in the kernel. In other words, to perform memory allocation at the lowest level you need to write your own OS. - Pichardo
@Midis : no, you don't necessarily have access to all memory, unless you want to clobber MS-DOS and other apps, rendering the system unusable. Usually DOS programs will request extra memory for heap operations beyond the minimum requirements of the program. Incidentally, I wrote a somewhat related SO answer recently about MS-DOS allocations at load time: https://mcmap.net/q/1403379/-how-can-i-get-an-extra-segment-in-dos . You can request available space for your heap and then write an allocator (malloc/free) that uses the chunk of memory you requested from MS-DOS. - Tarbox
"since interrupts are all provided by BIOS": no, the ABI for DOS system calls is int 21h with AH = call number. The BIOS uses a few different interrupt numbers, but it's not the only thing callable via a software interrupt. - Arias
In real mode, memory allocation is weird anyway. Unless writing a TSR, you always have the full memory available. - Nectareous
Of course programs in DOS are written knowing what memory they have been allocated. All programs can write anywhere, but in order to be functional they usually attempt to play nice with each other. That usually means not arbitrarily walking all over MS-DOS and other apps. If you want memory, you request it. If you have too much memory, you give it back. - Tarbox
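
The idea raised in several of the comments above, reserving one large block and handing out pieces of it yourself, can be sketched as a bump allocator. This is a minimal sketch and not code from the thread: the names heap_init, heap_alloc, heap_ptr and HEAP_SIZE are hypothetical, the pool is assumed to be a static 4 KB buffer inside the program's own segment, and blocks are never freed individually.

```asm
        .model  tiny
        .code
        org     0100h
HEAP_SIZE equ   4096                    ;assumed pool size in bytes
start:  call    heap_init
        mov     cx,100                  ;request 100 bytes
        call    heap_alloc
        jc      fail                    ;CF set: pool exhausted
        mov     di,ax                   ;ax = near pointer (offset in ds)
        mov     byte ptr [di],0         ;the block is ours to use
        mov     ax,4c00h                ;exit, return code 0
        int     21h
fail:   mov     ax,4c01h                ;exit, return code 1
        int     21h

;reset the allocator: next free byte = start of the pool
heap_init proc near
        mov     word ptr [heap_ptr],offset heap
        ret
heap_init endp

;in:  cx = size in bytes
;out: CF clear, ax = offset of the block; CF set if it does not fit
;     cx and dx are modified
heap_alloc proc near
        add     cx,1                    ;round the size up to an even number
        jc      ha_done                 ;size wrapped around: fail, CF is set
        and     cx,0fffeh
        mov     ax,word ptr [heap_ptr]  ;ax = candidate block
        mov     dx,offset heap + HEAP_SIZE
        sub     dx,ax                   ;dx = bytes still free
        cmp     dx,cx
        jb      ha_done                 ;jb is taken with CF set
        add     word ptr [heap_ptr],cx  ;bump the free pointer past the block
        clc
ha_done:ret
heap_alloc endp

heap_ptr dw     0                       ;offset of the next free byte
heap    db      HEAP_SIZE dup (0)       ;the pool itself
        end     start
```

Keeping the pool inside the .COM image avoids any DOS call, at the cost of a fixed size and a larger file. A free list in place of the single pointer would be needed to reuse released blocks.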

When DOS loads a .COM program, it allocates all of the available memory in the 640 KB conventional area (below 0a000h:0000h) to the program, and the program can manage that memory itself. To use the MS-DOS memory manager instead, the program first has to release the memory it does not need by shrinking its own block with INT 21H, AH=4AH, ES=segment of the block, BX=new size in paragraphs. It can then allocate memory with INT 21H, AH=48H, BX=number of paragraphs (a paragraph is 16 bytes). INT 21H, AH=49H, ES=segment frees a block obtained this way.

As noted in the comments, an .EXE program may or may not allocate all of the memory in the 640KB area.

Example .COM assembly code that releases and then allocates all available memory. MS-DOS generally consumes 16 bytes of overhead per allocated block. In this example, BX is first set to the end of the code and then rounded up to the first paragraph boundary that is at least 256 bytes past the end of the code; those 256 bytes serve as stack space. The end of this stack is the base of the memory released by the INT 21H, AH=4AH call. For example, if cdend were at offset 0123h, BX would become 0230h, that is, 23h paragraphs kept by the program.

        .286
        .model  tiny,c
        .code
        org     0100h
;       on entry: cs,ds,es,ss = program segment prefix, sp = 0fffeh
start:  mov     bx,offset cdend         ;bx = end of code
        add     bx,0010fh               ;+ 256 bytes of stack, + 15 to round up
        and     bx,0fff0h               ;bx = paragraph-aligned end of stack
        mov     sp,bx                   ;sp = new end of stack
        mov     cl,4                    ;convert bytes to paragraphs
        shr     bx,cl
        mov     ax,04a00h               ;shrink program's block to bx paragraphs (es = PSP)
        int     21h
        mov     ax,04800h               ;ask for 0ffffh paragraphs: the call fails,
        mov     bx,0ffffh               ; but returns available memory in bx
        int     21h
        mov     ax,04800h               ;allocate all of it
        int     21h                     ; returns segment in ax
exit:   mov     ax,04c00h               ;exit
        int     21h
cdend:
        end     start
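
One possible continuation, sketched under assumptions rather than taken from the answer: check the carry flag after the DOS calls (they set it on failure and return an error code in AX) and carve the block returned in AX into smaller pieces. The names pool_seg, pool_left and para_alloc are hypothetical, and blocks are handed out in whole paragraphs and never freed individually.

```asm
        .model  tiny
        .code
        org     0100h
start:  mov     bx,offset pgend         ;keep code + 256 bytes of stack
        add     bx,010fh
        and     bx,0fff0h
        mov     sp,bx
        mov     cl,4
        shr     bx,cl                   ;bytes -> paragraphs
        mov     ah,4ah                  ;shrink our own block (es = PSP)
        int     21h
        jc      fail                    ;CF set: ax = DOS error code
        mov     bx,0ffffh               ;impossible request, so DOS reports
        mov     ah,48h                  ; the largest free block in bx
        int     21h
        mov     pool_left,bx            ;remember the size before retrying
        mov     ah,48h                  ;allocate exactly that many paragraphs
        int     21h
        jc      fail
        mov     pool_seg,ax             ;first free paragraph of the pool

        mov     bx,4                    ;carve out 4 paragraphs = 64 bytes
        call    para_alloc
        jc      fail
        mov     es,ax                   ;block is es:0000 .. es:003f
        mov     byte ptr es:[0],0
        mov     ax,4c00h                ;exit, return code 0
        int     21h
fail:   mov     ax,4c01h                ;exit, return code 1
        int     21h

;in:  bx = paragraphs wanted
;out: CF clear, ax = segment of the block; CF set if the pool is too small
para_alloc proc near
        cmp     pool_left,bx
        jb      pa_done                 ;jb is taken with CF set
        sub     pool_left,bx
        mov     ax,pool_seg
        add     pool_seg,bx             ;next block starts bx paragraphs later
        clc
pa_done:ret
para_alloc endp

pool_seg  dw    0                       ;segment of the next free paragraph
pool_left dw    0                       ;paragraphs still free in the pool
pgend:
        end     start
```

Handing out whole paragraphs means every block can be addressed as segment:0000, which keeps the bookkeeping to two words. DOS releases the pool itself when the program exits.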
Axial answered 28/6, 2019 at 5:37 Comment(12)
There are exceptions with EXE programs that don't request all of the memory when they are loaded. That depends on the tools used to generate the executable or on the use of the EXEMOD program. It just so happens that many tools use a MAXALLOC value of 0xffff, which has the effect of allocating all memory. COM programs, though, are always allocated all the memory. The limit of memory also varies and may be higher or lower than A000 for various reasons. The segment of the first paragraph after the memory allocated to a program can be found in the PSP at offset 0x0002. - Tarbox
Some of these things are discussed in this SO answer: https://mcmap.net/q/1403379/-how-can-i-get-an-extra-segment-in-dos - Tarbox
@MichaelPetch: I think this question needs a kind of "stub" answer that addresses the apparent misconception in the question (that you can ignore DOS and not make any DOS system calls) and links to your answer for full details. It's not quite a duplicate. - Arias
MS-DOS also only allocates the largest free memory block available. Though normally unlikely, there could potentially be existing allocations fragmenting conventional memory. Much more likely is that memory at the end of conventional memory is in use by the EBDA. TSRs and device drivers also sometimes allocate memory at the end of conventional memory to avoid fragmentation. - Pichardo
@PeterCordes : I don't believe this is a duplicate of the other, which is why I haven't attempted to close it as such, but there is definite overlap in the discussion between it and this question. - Tarbox
@rossridge : another thing that DOS does: if a program has MINALLOC=MAXALLOC=0 in the MZ header of an EXE, DOS will attempt to load the program at the top of available memory rather than the bottom. - Tarbox
@MichaelPetch: Yes, that's what I said: not a duplicate, only similar. Much of the stuff in your existing answer would be relevant here, along with all the gotchas people are mentioning in the comments. So an answer that links to that for how to get a big block to carve up manually would be good, if anyone wants to write one. - Arias
Following your suggestion, I have found these interrupts: INT 21H (0x21) Function 48H (0x48 or 72) --> Allocate memory block; INT 21H (0x21) Function 49H (0x49 or 73) --> Release memory block. So, I need to put in BX the number of memory blocks that I want to allocate. Here comes my big question: how do I calculate the maximum size of this memory chunk? How can I know it will not kill some other processes? Where does the heap begin and where does it end? Physically, where is it: on the RAM chip or on the hard drive (because INT 21H, AH=48H is for hard drive management)? - Picnic
@Picnic - I added example code for a small .COM program. The program sets the stack pointer to the next paragraph boundary 256 bytes past the end of the code, then releases memory starting at the stack boundary using INT 21H, AH=4AH. It then attempts to allocate 0ffffh paragraphs, which will fail but return the number of available paragraphs in BX, which is used to do the actual allocation. - Axial
Uhh... Actually I don't know how to run .com programs. I use the masm32/ml or masm611/ml package to create .exe files. - Picnic
@Picnic - For MASM 6.11, the .model tiny directive will result in a .COM file (you may get a warning from the linker, but it works). - Axial
@Picnic you can use all versions of MASM prior to 6 to assemble .COM files using the older format boilerplate: #5767840 - Johannessen
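
The PSP field mentioned in the comments above gives a quick way to see how much memory DOS handed to the program at load time. A minimal sketch, assuming a .COM program (so CS and DS both hold the PSP segment) and that the word is read as DOS left it at load time; the variable name my_paras is hypothetical.

```asm
        .model  tiny
        .code
        org     0100h
start:  mov     ax,ds:[0002h]           ;PSP:0002 = segment just past our memory
        mov     bx,cs                   ;cs = PSP segment in a .COM program
        sub     ax,bx                   ;ax = size of our block in paragraphs
        mov     my_paras,ax             ;ax * 16 = bytes owned, PSP included
        mov     ax,4c00h                ;exit, return code 0
        int     21h

my_paras dw     0                       ;paragraphs owned by the program at load time
        end     start
```

For a .COM program this is normally the largest free block that existed at load time, which is why the program has to shrink its block before INT 21H, AH=48H can succeed.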
