How does interaction with computer hardware look in Lisp?
Asked Answered
C

3

6

In C, it is easy to manipulate memory and hardware registers, because concepts such as "address" and "volatile" are built into the language. Consequently, most OSs are written in the C family of languages. For example, I can copy an arbitrary function to an arbitrary location in memory, then call that location as a function (assuming the hardware doesn't stop me from executing data, of course; this would work on certain microcontrollers).

int hello_world()
{
    printf("Hello, world!");
    return 0;
}

int main()
{
    unsigned char buf[1000];
    memcpy(buf, (const void*)hello_world, sizeof buf);
    int (*x)() = (int(*)())buf;
    x();
}

However, I have been reading about the Open Genera operating system for certain dedicated Lisp machines. Wikipedia says:

Genera is written completely in Lisp; even all the low-level system code is written in Lisp (device drivers, garbage collection, process scheduler, network stacks, etc.)

I am completely new to Lisp, but this seems like a difficult thing to do: Common Lisp, from what I've seen, doesn't have good abstractions for the hardware it's running on. How would Common Lisp operating systems do something basic such as compile the following trivial function, write its machine code representation to memory, then call it?

(defun hello () (format t "Hello, World!"))

Of course, Lisp can be easily implemented in itself, but in the words of Sam Hughes, "somewhere down the line, abstraction runs out and a machine has to execute an instruction."

Comparison answered 28/7, 2015 at 19:2 Comment(8)
The compiler translates the Lisp code to machine code that deals with memory addresses.Gaily
How would the compiler emit the machine code? It is also written in Lisp.Comparison
It writes it to a file.Gaily
@thirtythreeforty machine code is just bytes. Any language that can produce bytes can write machine code.Kathyrnkati
@Kathyrnkati sure, but how is this accomplished in Lisp? Barmar mentions low-level internal functions, but I still don't see how Lisp represents the machine it's running on.Comparison
@thirtythreeforty Neither does C to any great extent, so you might build certain extension into the compiler which looks like normal functions, but might ultimately e.g. be written in assembly/machine code, or you create the ability to incorporate/link to code written in another language (such as assembly)Kathyrnkati
Every real world programming system has a low-level machine interface. So does every Lisp system. There are really a multitude of Lisp systems available as open source. Just download one them and read the sources for the compiler or the relevant documentation. Some parts of the internals may be written in assembler or C, those can be called by Lisp. For compiling functions, Common Lisp comes with the standard function COMPILE, which compiles a function, writes the machine code to memory and then one can call the machine code function from Lisp.Configurationism
> In C, it is easy to manipulate memory and hardware registers, because concepts such as "address" and "volatile" are built into the language. That is actually false. Hardware-specific C code almost always uses various compiler extensions. The Linux kernel isn't written in ISO C, for example. Things like volatile don't guarantee enough and many aspects of the machine (even access to memory-mapped things) are not exposed through C. Features like assembly inline code make up for this. C itself is improving in this area. For instance, now ISO C has atomic operations.Dowland
P
4

The Lisp machine was a computer hardware with a CPU just like modern machines today, only the CPU had some special instructions that mapped better to LISP. It still was a stack machine and it compiled it's source to CPU instructions just as modern Common Lisp implementations do today on more general CPUs.

In the Lisp machines wikipedia page you can see how a function gets compiled:

(defun example-count (predicate list)
  (let ((count 0))
    (dolist (i list count)
      (when (funcall predicate i)
        (incf count)))))

(disassemble (compile #'example-count))

  0  ENTRY: 2 REQUIRED, 0 OPTIONAL      ;Creating PREDICATE and LIST
  2  PUSH 0                             ;Creating COUNT
  3  PUSH FP|3                          ;LIST
  4  PUSH NIL                           ;Creating I
  5  BRANCH 15
  6  SET-TO-CDR-PUSH-CAR FP|5
  7  SET-SP-TO-ADDRESS-SAVE-TOS SP|-1
 10  START-CALL FP|2                    ;PREDICATE
 11  PUSH FP|6                          ;I
 12  FINISH-CALL-1-VALUE
 13  BRANCH-FALSE 15
 14  INCREMENT FP|4                     ;COUNT
 15  ENDP FP|5
 16  BRANCH-FALSE 6
 17  SET-SP-TO-ADDRESS SP|-2
 20  RETURN-SINGLE-STACK

This is then stored in some memory place and when running this function it just jumps or calls to this. As with any assembly code the CPU gets instructed to continue running some other code when it's done running this and it may be the Lisp main loop itself (REPL).

The same code compiled with SBCL:

; Size: 203 bytes
; 02CB9181:       48C745E800000000 MOV QWORD PTR [RBP-24], 0  ; no-arg-parsing entry point
;      189:       488B4DF0         MOV RCX, [RBP-16]
;      18D:       48894DE0         MOV [RBP-32], RCX
;      191:       660F1F840000000000 NOP
;      19A:       660F1F440000     NOP
;      1A0: L0:   488B4DE0         MOV RCX, [RBP-32]
;      1A4:       8D41F9           LEA EAX, [RCX-7]
;      1A7:       A80F             TEST AL, 15
;      1A9:       0F8598000000     JNE L2
;      1AF:       4881F917001020   CMP RCX, 537919511
;      1B6:       750A             JNE L1
;      1B8:       488B55E8         MOV RDX, [RBP-24]
;      1BC:       488BE5           MOV RSP, RBP
;      1BF:       F8               CLC
;      1C0:       5D               POP RBP
;      1C1:       C3               RET
;      1C2: L1:   488B45E0         MOV RAX, [RBP-32]
;      1C6:       488B40F9         MOV RAX, [RAX-7]
;      1CA:       488945D8         MOV [RBP-40], RAX
;      1CE:       488B45E0         MOV RAX, [RBP-32]
;      1D2:       488B4801         MOV RCX, [RAX+1]
;      1D6:       48894DE0         MOV [RBP-32], RCX
;      1DA:       488B55F8         MOV RDX, [RBP-8]
;      1DE:       4883EC18         SUB RSP, 24
;      1E2:       48896C2408       MOV [RSP+8], RBP
;      1E7:       488D6C2408       LEA RBP, [RSP+8]
;      1EC:       B902000000       MOV ECX, 2
;      1F1:       FF1425B80F1020   CALL QWORD PTR [#x20100FB8]  ; %COERCE-CALLABLE-TO-FUN
;      1F8:       488BC2           MOV RAX, RDX
;      1FB:       488D5C24F0       LEA RBX, [RSP-16]
;      200:       4883EC18         SUB RSP, 24
;      204:       488B55D8         MOV RDX, [RBP-40]
;      208:       B902000000       MOV ECX, 2
;      20D:       48892B           MOV [RBX], RBP
;      210:       488BEB           MOV RBP, RBX
;      213:       FF50FD           CALL QWORD PTR [RAX-3]
;      216:       480F42E3         CMOVB RSP, RBX
;      21A:       4881FA17001020   CMP RDX, 537919511
;      221:       0F8479FFFFFF     JEQ L0
;      227:       488B55E8         MOV RDX, [RBP-24]
;      22B:       BF02000000       MOV EDI, 2
;      230:       41BBF0010020     MOV R11D, 536871408        ; GENERIC-+
;      236:       41FFD3           CALL R11
;      239:       488955E8         MOV [RBP-24], RDX
;      23D:       E95EFFFFFF       JMP L0
;      242:       CC0A             BREAK 10                   ; error trap
;      244:       02               BYTE #X02
;      245:       19               BYTE #X19                  ; INVALID-ARG-COUNT-ERROR
;      246:       9A               BYTE #X9A                  ; RCX
;      247: L2:   CC0A             BREAK 10                   ; error trap
;      249:       02               BYTE #X02
;      24A:       02               BYTE #X02                  ; OBJECT-NOT-LIST-ERROR
;      24B:       9B               BYTE #X9B                  ; RCX
NIL

Not quite as few instructions is it. When running this function that is the machine code that gets control and it gives the control back to the system since the return address is perhaps the REPL or next instruction just like with compiled C.

A special thing about lisps in general is that lexical closures need to be handled. In C when a call is done the variables don't exist anymore, but in Lisps it may return or store a function that use those variables at a later time and that is no longer in scope. This means variables need to be handled almost as inefficient as in interpreted code in compiled code, especially with a old compiler that doesn't do much optimization.

A C compiler does it fare translating as well or else what would be the reason for programming C than in assembly? The Intel x86 processors doesn't have support for arguments in procedure calls. It is emulated by the C compiler. The caller sets values on the stack and it has a cleanup where it undoes it afterward. looping constructs such as for and while doesn't exist. Only branch/jmp. Yes, in C you get a more feel for the underlying hardware but it really isn't the same as machine code. It only leaks more.

A Lisp implementation as OS can have features such as low level assembly instructions as lisp opcodes. Compilation would then be to translate everything to low level lisp, then it's a 1:1 from those to machince bytes.

An operating system with a c library and a c compiler together does the very same thing. It runs translation to machine code and can then run the code in itself. This is how Lisp systems are meant to work too so the only thing you need is the API to hardware that can be as low level as memory mapping I/O etc.

Polonaise answered 28/7, 2015 at 21:43 Comment(3)
The MIT Lisp Machine was a stack machine. Relatively few hardware systems are/were stack machines. The x86 is a register machine. Most machine instructions work using registers. You can see that in the widely different machine code instructions you showed. In stack machines the arguments for most operations are on a stack. As you see in the x86 code, there are a lot register operations, where the Symbolics Lisp Machine machine code listing has none.Configurationism
@RainerJoswig I remember I hated x86 with their special purpose and very few registers. Compared to the m68k the 386 almost is a stack machine :-) From the output SBCL only uses the original registers while amd64 has 8 more today. I think some scheme compilers inline and make use of all.Polonaise
This makes sense: the Lisp compiler might implement some extra functions as native instructions. This can be an implementation detail, and user Lisp never needs to see these extensions.Comparison
S
2

Even without abstraction lisp can emit assembler. See

But it can also be used to create a thin but powerful abstraction over machine code. See Henry Bakers's Comfy Compiler

Finally check SBCL VOP's (example), they allow you to control what assembly code. Altough with virtual registers, as this happens before register allocation.

You may find this post interesting, as it deals with how to emit assembly from SBCL.

Btw, even though you can write drivers and such in lisp, it is not a good idea to needlessly duplicate the effort, so even Lisp implementations in Lisp, like SBCL, have some C parts to allow interfacing with the OS.

These C header files, along with the C source and assembly files, are then used (figure 2) to produce the sbcl executable itself. The executable is as yet not useful; while it provides an interface to the operating system services, and a garbage collector

Taken from seccion 3.2 from SBCL: a Sanely-Bootstrappable Common Lisp

I haven't checked out how Mezzano works, feel free to dig into it.

Selection answered 7/8, 2015 at 23:53 Comment(0)
G
1

Lisp Machines have a few low-level internal functions that allow them to access memory and hardware registers directly. These are used in the guts of the operating system.

Gaily answered 28/7, 2015 at 19:10 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.