Creating a fast interpreted language

I'm currently creating a programming language written in Nim.

The front end of the compiler is done, and I'm now sitting in front of a well-built Abstract Syntax Tree (AST). I tried implementing a simple interpreter that calls an evaluate() method on each tree node. It works, hell yes, and I even built environments for functions and the like. BUT it turns out to be roughly 15-20 times slower than Python. Python translates the source program into bytecode and runs it on a virtual machine; other languages use JIT compilation. Neither of those is easy to implement, and what's really frustrating is that I can't find a single book that teaches how to merge those two worlds: every book is either about building a standalone VM or about building a compiled language.
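
To give an idea of its shape, the current evaluator boils down to something like this (a heavily simplified sketch in Nim, not my actual code; the node kinds are just for illustration):

import std/tables

type
  NodeKind = enum nkNum, nkVar, nkAdd
  Node = ref object
    case kind: NodeKind
    of nkNum: value: float
    of nkVar: name: string
    of nkAdd: left, right: Node

  Env = Table[string, float]

# One recursive dispatch per node, every single time a node is reached.
proc evaluate(n: Node, env: Env): float =
  case n.kind
  of nkNum: result = n.value
  of nkVar: result = env[n.name]
  of nkAdd: result = evaluate(n.left, env) + evaluate(n.right, env)

# x + 2 with x = 40
let env = {"x": 40.0}.toTable
echo evaluate(Node(kind: nkAdd,
                   left: Node(kind: nkVar, name: "x"),
                   right: Node(kind: nkNum, value: 2.0)), env)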

Tools like LLVM and GraalVM could help, but again, I can't see how to connect my AST to them.

What should my next step be? JIT / VM?

If VM: Any recommendations on how to transform the AST into bytecode and create a VM for it?
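
For the VM route, this is roughly what I picture, a toy stack-based bytecode VM sketched in Nim (the opcodes and names are invented just to make the question concrete):

type
  OpCode = enum
    opPush,  # push a constant onto the stack
    opAdd,   # pop two values, push their sum
    opMul,   # pop two values, push their product
    opHalt   # stop and return the top of the stack

  Instr = object
    op: OpCode
    arg: float   # only used by opPush

proc run(code: seq[Instr]): float =
  var stack: seq[float] = @[]
  var pc = 0
  while true:
    let ins = code[pc]
    case ins.op
    of opPush:
      stack.add(ins.arg)
    of opAdd:
      let b = stack.pop()
      let a = stack.pop()
      stack.add(a + b)
    of opMul:
      let b = stack.pop()
      let a = stack.pop()
      stack.add(a * b)
    of opHalt:
      return stack.pop()
    inc pc

# (2 + 3) * 4
echo run(@[
  Instr(op: opPush, arg: 2.0),
  Instr(op: opPush, arg: 3.0),
  Instr(op: opAdd),
  Instr(op: opPush, arg: 4.0),
  Instr(op: opMul),
  Instr(op: opHalt)
])

What I can't see is the middle step: how to walk the AST and systematically emit an instruction list like that.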

If JIT: how do you even compile things in a dynamic language? For example:

fun add(a, b) {
    return a + b;
}

The interpreter only knows the types of a and b at run time, so it can't compile the function until they are known. But if you compile it to machine code for those argument types, what happens when the next call passes arguments of different types? Do you recompile?

I'm kinda confused on these matters; any light you can shed would be appreciated.

Thanks in advance!

Allotropy answered 19/6, 2018 at 22:44 Comment(3)
If your goal is to be on par with Python, then being within 15-20 percent is close enough that you should profile your interpreter and look for simple optimizations that you can make before you take the plunge and move to a more complex model.Armour
I made a mistake. I meant 15-20 TIMES slower... My bad there!Allotropy
Oh, well that's different :)Armour

You're hoping for a single book that describes how to build ultra-high-performance interpreters. To get that kind of speed, you essentially blur "interpreter" into "compiler", and the short answer is: use every compiler trick in the book(s, emphatically plural). You've got a lot of reading to do.

However, the core of what you want to know can be found in the papers about SELF, a fast runtime "interpreter" that more or less defined how JIT compilers should work, especially in the face of dynamic typing:

An Efficient Implementation of SELF, a Dynamically-Typed Object-Oriented Language Based on Prototypes (Chambers & Ungar), ACM SIGPLAN Notices. A PDF is available here: https://www.researchgate.net/profile/David_Ungar2/publication/234781317_An_Efficient_Implementation_of_Self_a_Dynamically-Typed_Object-Oriented_Language_Based_on_Prototypes/links/540f8fbe0cf2f2b29a3de0a6.pdf

You can find more technical papers on this topic by going to scholar.google.com and searching for "JIT compilers" and for anything by "Craig Chambers".
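
To make the SELF-style idea concrete: you specialize on the argument types actually observed at run time, guard the fast path with a cheap type check, and fall back (or recompile) when the guard fails. Here is a toy sketch in Nim, since that's what your interpreter is written in; the names are invented, and a real JIT would emit machine code rather than hand out closures:

type
  ValueKind = enum vkInt, vkFloat
  Value = object
    case kind: ValueKind
    of vkInt:   i: int
    of vkFloat: f: float

  AddFn = proc (a, b: Value): Value

# Generic slow path: what a plain tree-walking interpreter does.
proc genericAdd(a, b: Value): Value =
  if a.kind == vkInt and b.kind == vkInt:
    result = Value(kind: vkInt, i: a.i + b.i)
  else:
    let x = (if a.kind == vkInt: a.i.float else: a.f)
    let y = (if b.kind == vkInt: b.i.float else: b.f)
    result = Value(kind: vkFloat, f: x + y)

# Pretend "JIT": return a fast path specialized for the observed tags.
proc compileSpecialized(ka, kb: ValueKind): AddFn =
  if ka == vkInt and kb == vkInt:
    result = proc (a, b: Value): Value = Value(kind: vkInt, i: a.i + b.i)
  elif ka == vkFloat and kb == vkFloat:
    result = proc (a, b: Value): Value = Value(kind: vkFloat, f: a.f + b.f)
  else:
    result = genericAdd

# One cache slot per pair of type tags: the guard / inline-cache idea.
var cache: array[ValueKind, array[ValueKind, AddFn]]

proc add(a, b: Value): Value =
  if cache[a.kind][b.kind].isNil:       # guard miss: compile a specialization once
    cache[a.kind][b.kind] = compileSpecialized(a.kind, b.kind)
  result = cache[a.kind][b.kind](a, b)  # guarded fast path from then on

echo add(Value(kind: vkInt, i: 2), Value(kind: vkInt, i: 40)).i     # 42
echo add(Value(kind: vkFloat, f: 1.5), Value(kind: vkInt, i: 1)).f  # 2.5

The SELF papers push this much further (polymorphic inline caches, type feedback, deoptimization), but guard-then-specialize is the core of it.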

Csc answered 20/6, 2018 at 8:4 Comment(0)

While not directly related, you may be interested in investigating how RPython ("reduced" Python) handles initial code analysis. After generating a flow graph, it strictly validates that each variable has a single type (even though Python itself is fully dynamic, PyPy, a Python interpreter written in RPython, still supports that full dynamism via boxing). RPython also naturally integrates a flow-graph-level JIT (using "guard" instructions so that specialized code can safely fall back when its expectations are violated) without much in the way of explicit annotation, "gifting" that feature to any language interpreter written with it. Plus, it provides multiple memory-management schemes (e.g. multi-generational mark/sweep).

To summarize: even though variables are "dynamic", the code generally isn't, so static analysis can provide answers as to the type of every variable.
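
As a toy illustration of that point (sketched in Nim, with made-up node kinds, nothing RPython-specific): a single pass over the AST can usually pin one concrete type on each expression, and only where it can't would a guard or a boxed representation be needed.

import std/tables

type
  ExprKind = enum ekIntLit, ekStrLit, ekVar, ekAdd
  Expr = ref object
    case kind: ExprKind
    of ekIntLit: ival: int
    of ekStrLit: sval: string
    of ekVar:    name: string
    of ekAdd:    lhs, rhs: Expr

  Ty = enum tyInt, tyStr, tyUnknown

# Literals fix types; types then flow upward through the tree.
proc infer(e: Expr, env: Table[string, Ty]): Ty =
  case e.kind
  of ekIntLit: result = tyInt
  of ekStrLit: result = tyStr
  of ekVar:    result = env.getOrDefault(e.name, tyUnknown)
  of ekAdd:
    let l = infer(e.lhs, env)
    let r = infer(e.rhs, env)
    # RPython's analysis would reject or guard the mismatched case.
    result = if l == r: l else: tyUnknown

# x = 1; x + 2  ==>  the analysis concludes this add is an integer add.
let env = {"x": tyInt}.toTable
echo infer(Expr(kind: ekAdd,
                lhs: Expr(kind: ekVar, name: "x"),
                rhs: Expr(kind: ekIntLit, ival: 2)), env)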

Demythologize answered 28/6, 2018 at 18:48 Comment(2)
The problem with RPython is how little documentation there is. I feel lost in it. I've read that they give you a JIT compiler "for free" with your interpreter, but I don't know if it's worth it...Allotropy
@ErikCampobadal It can be, depending on the performance characteristics of your solution, noting that you benefit from it even with ordinary Python code running under Pypy itself. For example, my template engine goes from 38,775 generations/second on CPython 2.7, to 45,680/second under Pypy. Of course, modern CPythons are getting better and better all the time, e.g. the same test under CPython 3.7 nets 47,938/second. (Rather important and related note: this is not a parser/interpreter for template code; it's a text transformation pipeline at import time.)Demythologize

If you're interested in building an interpreter for your language on GraalVM, please take a look at SimpleLanguage, a sample programming language built for GraalVM to illustrate how to do it.

In a nutshell, one needs to describe the AST of the program using Truffle, the framework GraalVM uses for implementing programming languages. It offers an API for describing the nodes in your AST and what they are supposed to do.

You can find more examples of languages implemented on Truffle here. If you have more detailed questions about it, perhaps the Gitter chat is the easiest place to ask them.

Ingrained answered 3/7, 2018 at 12:7 Comment(4)
You're restricted to Java, you can't compile a native implementation, and then what? :CAllotropy
Sorry, I don't understand what you mean by that, can you please elaborate? But just in case: you absolutely can run an implementation of your language without a dependency on the JVM. TruffleRuby, for example, does that; you can check the "truffleruby-1.0.0-rc3-linux-amd64.tar.gz" release from github.com/oracle/truffleruby/releases. It runs proper TruffleRuby with the JIT compiler and everything, but doesn't depend on the JVM.Transitive
I will check it out then. I just thought you were restricted to GraalVM, which is only available on Linux for free.Allotropy
Windows support is in the works (in JVM mode it works, but native image compilation isn't there yet). GraalVM CE builds for macOS aren't there because of the lack of an open-source JDK 8 with JVMCI that we could have used; they should become available at least with the JDK 11-based builds.Transitive
