Can a compiled language be homoiconic?

10

29

By definition the word homoiconic means:

Same representation of code and data

In LISP this means that you can take a quoted list and evaluate it: (car list) is the function and (cdr list) holds the arguments. This can happen at compile time or at run time; either way, it requires an interpreter.

Is it possible that compiled languages without a compile-time interpreter can be homoiconic as well? Or is the concept of homoiconicity limited to interpreters?

Notus answered 6/8, 2009 at 13:23 Comment(4)
As a question, do you consider Perl to be homoiconic? It can represent its own code as a string, and has an eval() function.Lashay
Is it my imagination, or did we have a drive-by downvote of every answer on this page?Showcase
David: unless you store all Perl data in strings, and call all Perl functions with eval(), then no, I would not consider Perl homoiconic. :-)Trihedral
Assembly is homoiconic.Advection
A
37

'Homoiconic' is kind of a vague construct. 'code is data' is a bit clearer.

Anyway, the first sentence of the Wikipedia article on homoiconicity is not that bad. It says that the language has to have a source representation using its own data structures. If we set aside strings as a source representation (that is trivial, and it does not lead to a useful concept of 'homoiconic'), then Lisp has lists, symbols, numbers, strings etc., which are used to represent the source code.

The interface of the EVAL function determines what kind of source representation the language works on. In the case of Lisp, it is not strings. EVAL expects the usual variety of data structures, and the evaluation rules of Lisp determine that a string evaluates to itself (and thus will not be interpreted as a program expression, but just as string data). A number also evaluates to itself. A list (sin 3.0) is a list of a symbol and a number. The evaluation rules say that such a list, with a symbol denoting a function as its first element, will be evaluated as a function application. There are a few evaluation rules like this for data, special operators, macro applications and function applications. That's it.

To make it clear: in Lisp the function EVAL is defined over Lisp data structures. It expects a data structure, evaluates it according to its evaluation rules and returns a result - again using its data structures.

This matches the definition of homoiconic: source code has a native representation using the data types of Lisp.
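To make the evaluation rules concrete, here is a minimal sketch in Python rather than Lisp. It is not how any real Lisp implements EVAL; the names `Symbol`, `ENV` and `evaluate` are made up for illustration, and only the rules for self-evaluating data and function application are covered (no special operators or macros).

```python
import math


class Symbol(str):
    """A name, kept distinct from ordinary string data (hypothetical helper)."""


# Hypothetical global environment: symbols mapped to the functions they denote.
ENV = {
    Symbol("sin"): math.sin,
    Symbol("+"): lambda *numbers: sum(numbers),
}


def evaluate(form):
    """Evaluate a form that is given as plain data, never as source text."""
    if isinstance(form, Symbol):
        # A symbol is looked up in the environment.
        return ENV[form]
    if isinstance(form, (int, float, str)):
        # Numbers and strings evaluate to themselves.
        return form
    if isinstance(form, list):
        # A list whose first element denotes a function is a function application.
        function = evaluate(form[0])
        arguments = [evaluate(argument) for argument in form[1:]]
        return function(*arguments)
    raise TypeError(f"cannot evaluate {form!r}")


program = [Symbol("sin"), 3.0]   # the data structure for (sin 3.0)
result = evaluate(program)       # same value as math.sin(3.0)
```

The point of the sketch is the signature: `evaluate` receives a list, a symbol, a number or a string, and returns a value of the same kinds. No text is parsed at any point.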

Now, the interesting part is this: it does not matter how EVAL is implemented. All that matters is that it accepts the source code using the Lisp data structures, that it executes the code and that it returns a result.

So it is perfectly legal that EVAL uses a compiler.

(EVAL code)  =  (run (compile-expression code))

That's how several Lisp systems work; some don't even have an interpreter.
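The equation above can be illustrated outside Lisp as well. The following Python sketch is only an analogy: it translates a nested-list form into a Python syntax tree with the standard `ast` module, compiles that to a code object with the built-in `compile`, and then runs it. The names `Symbol`, `to_ast`, `compile_expression`, `run` and `eval_form` are invented for this example, and only function calls, names and constants are handled.

```python
import ast
import math


class Symbol(str):
    """A name, kept distinct from ordinary string data (hypothetical helper)."""


def to_ast(form):
    """Translate a nested-list form into a Python AST node."""
    if isinstance(form, list):
        function, *arguments = form
        return ast.Call(
            func=to_ast(function),
            args=[to_ast(argument) for argument in arguments],
            keywords=[],
        )
    if isinstance(form, Symbol):
        return ast.Name(id=str(form), ctx=ast.Load())
    # Numbers and plain strings become constants.
    return ast.Constant(value=form)


def compile_expression(form):
    """Compile a form (data) into a Python code object."""
    tree = ast.Expression(body=to_ast(form))
    ast.fix_missing_locations(tree)
    return compile(tree, filename="<form>", mode="eval")


def run(code, environment):
    """Execute a compiled code object with the given name bindings."""
    return eval(code, dict(environment))


def eval_form(form, environment):
    """EVAL expressed as run(compile_expression(form))."""
    return run(compile_expression(form), environment)


value = eval_form([Symbol("sin"), 3.0], {"sin": math.sin})
```

From the caller's side, `eval_form` still takes data and returns a result; whether a compiler or an interpreter sits behind it is invisible.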

So, 'Homoiconic' says that the SOURCE code has a data representation. It does NOT say that at runtime this source code has to be interpreted or that the execution is based on this source code.

If the code is compiled, neither the compiler nor an interpreter is needed at runtime. Those would only be needed if the program wants to eval or compile code at runtime - something that is often not needed.

Lisp also provides a primitive function READ, which translates an external representation (S-Expressions) of data into an internal representation of data (Lisp data). Thus it also can be used to translate an external representation of source code into an internal representation of source code. Lisp does not use a special parser for source code - since code is data, there is only READ.
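As a rough illustration of what READ does, here is a toy reader in Python. It is a simplification and not the Lisp reader: it handles only parentheses, integers, floats and symbols (no string literals, quote characters or comments), and the names `Symbol`, `tokenize`, `atom`, `read_from` and `read` are assumptions of this sketch.

```python
class Symbol(str):
    """A name, kept distinct from ordinary string data (hypothetical helper)."""


def tokenize(text):
    """Split the external representation into parentheses and atoms."""
    return text.replace("(", " ( ").replace(")", " ) ").split()


def atom(token):
    """Turn one token into a number if possible, otherwise into a symbol."""
    for convert in (int, float):
        try:
            return convert(token)
        except ValueError:
            pass
    return Symbol(token)


def read_from(tokens):
    """Consume tokens for exactly one expression and return it as data."""
    if not tokens:
        raise SyntaxError("unexpected end of input")
    token = tokens.pop(0)
    if token == "(":
        items = []
        while tokens and tokens[0] != ")":
            items.append(read_from(tokens))
        if not tokens:
            raise SyntaxError("missing closing parenthesis")
        tokens.pop(0)  # discard the ")"
        return items
    if token == ")":
        raise SyntaxError("unexpected closing parenthesis")
    return atom(token)


def read(text):
    """External representation (text) in, internal representation (data) out."""
    return read_from(tokenize(text))


data = read("(1 2 3)")       # a list of numbers
code = read("(sin 3.0)")     # a list of a symbol and a number
```

The same `read` produces both values. Nothing in the reader knows whether its result will later be treated as a program or as a plain list.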

Ammeter answered 6/8, 2009 at 14:19 Comment(9)
What's vague about "homoiconic/ity"? The exact Wikipedia quote says: "In computer programming, homoiconicity is a property of some programming languages, in which the primary representation of programs is also a data structure in a primitive type of the language itself, from homo meaning the same and icon meaning representation". That being said, you are right in pointing out that this is not a concept that properly applies to the discussion of compilers and interpreters. What one "does" with a language is different from how that language is structured.Omnidirectional
Why do you exclude Strings? If you exclude Strings, you exclude the very language for which the word "homoiconic" was invented in the first place!Parallelepiped
Excluding strings also disqualifies a number of languages that have a string-based evaluation function, like JavaScript. It just happens that S-expressions are more convenient and less error-prone.Nucleolus
The vague part already starts with the word 'representation'. What's that? Text on paper? Bits in a computer? How are representations connected? For me 'Homoiconic' (the name) suggests that the representation is something visual like text on paper/screen/... But then we cross the line between internal and external representations.Ammeter
'String' is meaningless, since all text-based programming languages have a string representation. In Java you can build a string, write it out, compile it, and load the result with the class loader. Does that make Java 'homoiconic'? If yes, the whole concept of 'homoiconicity' is useless. The idea that 'homoiconic' may want to capture is that code has an internal representation that is structured data (-> not plain strings).Ammeter
and that these internal data structures themselves (not the code, just a data format) have an external representation that is also used to write programs with. Lisp: the internal representation of Lisp code is Lisp data. Lisp data has an external representation as S-expressions. S-expressions are used for program code, too. -> homoiconicAmmeter
@Rainer Joswig - Thanks for that clarification. So, really, you're unhappy with the ambiguity of the word 'representation' combined with a strongly 'visual' take on the meaning of 'homoiconic'. Would you be happier if a definition of 'homoiconic' specified 'internal representation', or is it that 'homoiconic' imposes the visual experience metaphor too strongly for you? I'm asking the question because I have often found that discussions of technical/scientific subjects are clouded by unwanted meaning 'bleeding' out of words, so I like to collect examples of it for pondering the problem.Omnidirectional
@Pinochle, right: if you look at 'iconicity' in other contexts, it connects form and meaning. This is all vague in the context of programming languages; the specific definition of 'homoiconicity' is also vague, and the term 'homoiconicity' suggests a meaning that does not really match the description. 'icon' is related to an external form, a sign or a visual analogy. The interesting part in Lisp is that Lisp data has an internal and an external representation, and that both representations are primary means to encode source code (externally as s-expressions, internally as Lisp data structures).Ammeter
@Rainer Joswig - I just read your response to Jimmy Miller on c.l.l. from Aug. 4. That, coupled with what you said here, has convinced me: "code is data" is a better way of describing the phenomenon than using "homoiconicity", which is far too imagistic and potentially misleading.Omnidirectional
O
8

Yes. Lisp can be compiled to a native binary.

Outroar answered 6/8, 2009 at 13:26 Comment(9)
I didn't deny this, but the binary still contains an interpreter.Notus
@ott and a CPU contains microcode that is an interpreter of sorts ... everything is at some level, an interpreter.Treasonous
I don't care about the underlying hardware. Let's pretend that the instruction set is hard-wired in silicon.Notus
@ott, it doesn't matter ... homoiconicity is an attribute at some defined level of abstraction: You can have a homoiconic language compiled to an arbitrary level ... and that level may/may not express homoiconic semantics and so on ...Treasonous
the binary does not need an interpreter if the code is compiled. Some Lisps don't even have an interpreter.Ammeter
@Rainer, not even mentioning the LISPM / Lisp-on-a-chip stuff scribd.com/doc/938809/Design-of-a-LISPBased-MicroprocessorTreasonous
@ott - in that case how about Machine LanguageOutroar
@ott: no, a microprogrammed CPU is not the only possible design; in some CPUs instruction decoding is hardwired for all instructions.Dissentient
@dsm: "machine language" is the lowest programming level accessible externally; in a microprogrammed CPU there is another programming level (microcode) only accessible to a CPU maker.Dissentient
T
3

Seems to me to be an odd question:

Firstly, the homoiconic portion is the interface presented to the programmer. The point of languages is that they abstract lower-level functionality that preserves the same semantics as the higher-level presentation (though by different means).

dsm's machine-code point is a good one, but provided that:

  1. The syntax and semantics presented are homoiconic
  2. The translation to a lower-level form (machine code, interpreted, or otherwise) doesn't remove any of the original semantics

why does the lower-level implementation matter here?

Also:

compiled languages without a compile-time interpreter

Without some program interpreting it, the code would have to be native to the CPU, so the CPU's native language (or that of the VM running the code) would itself have to be homoiconic.

Languages without compile-time interpretation ... would be fairly constrained ... as they wouldn't be compiled at all.

But I am no expert, and may be missing the point.

Treasonous answered 6/8, 2009 at 13:30 Comment(2)
Procedural macros are an example of compile-time interpretation.Notus
Yes, procedural macros are an example of compile-time evaluation (it's irrelevant whether this is done through interpretation or compilation) -- but such macros can be done completely outside of the object language. For example, you can use any preprocessor language as a first step in compiling code in any other language.Nucleolus
D
2

In the most literal form, C is homoiconic. You can get access to the representation of a function using &functionName and execute data using somePtrCastToFnPtr(SomeArgs). However this is at the machine code level and without some kind of library support you will find it very hard to work with. Some kind of embeddable compiler (I seem to remember that LLVM can do this) would make it more practical.

Dipterous answered 6/8, 2009 at 16:25 Comment(0)
L
2

Lisp is normally compiled. There have been implementations with JIT compilers instead of interpreters.

Hence, it is not necessary to have an interpreter (in the sense of "not a compiler") for code-is-data languages.

Lashay answered 7/8, 2009 at 16:44 Comment(0)
G
2

Machine code itself is homoiconic, so yes.

Data or instructions are just a matter of semantics (and perhaps the segment of memory in which they lie).

Glasgo answered 9/9, 2009 at 22:48 Comment(0)
M
1

The problem is that a lot of processors separate instruction and data areas, and actively prevent programs from modifying their own code. This kind of code used to be called "degenerate code", and considered a very Bad Thing.

Interpreters (and VMs) don't have that problem, as they can treat the whole program as data, with the only "code" being the interpreter.

Mothy answered 6/8, 2009 at 14:47 Comment(0)
S
1

Yes; you just have to stick a copy of the compiler into the language runtime. Chez Scheme is one of the many fine compilers which do just that.

Showcase answered 7/8, 2009 at 16:29 Comment(0)
A
1

Compiling is just optimized interpretation. An interpreter takes a piece of data representing code and then "does" that code: the code's meaning turns into pathways of execution and data flow through the guts of the interpreter. A compiler takes the same data, translates it into another form and then passes it to another interpreter: one implemented in silicon (CPU) or perhaps a fake one (virtual machine).

This is why some Lisp implementations can do without interpreters. The EVAL function can compile the code and then branch to it. EVAL and COMPILE do not have to have distinct modes of operation. (Clozure, Corman Lisp, SBCL are examples of "compiler only" Lisps.)

The data part in the beginning is the key to the language being homoiconic, not whether or not the execution of code is optimized by compiling. "Code is data" means "source code is data" not "executable code is data". (Of course executable code is data, but by data we mean the overwhelmingly preferred representation of the code that we want to manipulate.)
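A small sketch of what "source code is data" buys you, written in Python with nested lists standing in for Lisp forms. The function `expand_unless` and the `unless`/`if`/`not` form names are invented for this example, and symbols are plain Python strings here for brevity. The rewrite runs on the source representation before anything is interpreted or compiled, so it works the same way with either execution strategy.

```python
def expand_unless(form):
    """Rewrite every (unless c body) into (if (not c) body), recursively."""
    if not isinstance(form, list):
        # Atoms (numbers, names) are left untouched.
        return form
    expanded = [expand_unless(part) for part in form]
    if len(expanded) == 3 and expanded[0] == "unless":
        _, condition, body = expanded
        return ["if", ["not", condition], body]
    return expanded


source = ["unless", ["ready"], ["launch"]]
rewritten = expand_unless(source)
```

Here `rewritten` is again an ordinary list that could be handed to an interpreter or to a compiler; the transformation itself never needed to know which.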

Autogenesis answered 3/5, 2012 at 21:59 Comment(0)
H
0

Languages built on top of VMs (.NET CLR, JRE, etc.) can use advanced techniques that allow on-the-fly code generation. One of them is IL weaving. It is not as clear as the eval of ECMAScript/Lisp/Scheme etc., but it can emulate such behavior to some degree.

For examples, check Castle DynamicProxy; for more interactive examples, check LINQPad, F# Interactive, and Scala interactive.

Hanschen answered 6/8, 2009 at 13:41 Comment(1)
I don't get what this had to do with my question.Notus
