Linking and Loading in interpreted languages

In compiled languages, the source code is turned into object code by the compiler, the different object files (if there are multiple files) are linked by the linker, and the result is loaded into memory by the loader for execution.

If I have an application written in an interpreted language (e.g. Ruby or Python) and the source code is split across files, when exactly are the files brought together? To put it in other words, when is the linking done? Do interpreted languages have linkers and loaders in the first place, or does the interpreter do everything?

I am really confused about this and cannot get my head around it. Can anyone shed some light on this?

Matless answered 5/11, 2013 at 14:30 Comment(1)
It's important to distinguish Python from CPython. Other flavors of Python can and will be compiled all the way down to native machine code. - Liripipe

An interpreted language is more or less a large configuration for an executable that is called the interpreter. That executable (e.g. /usr/bin/python) is the program which actually runs. It reads the script it is supposed to execute (e.g. /home/alfe/bin/factorial.py) and executes it, in the simplest form line by line.

During that process it can encounter references to other files (other modules, e.g. /usr/python/lib/math.py), and it will then read and interpret those as well.
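To make the idea concrete, here is a minimal sketch of such an interpreter. The mini-language (with its `include`, `set` and `print` commands), the function `run_script` and the file names are all made up for illustration; no real interpreter works exactly like this.

```python
import os
import tempfile


def run_script(path, variables=None, output=None):
    """Interpret a toy script one line at a time (hypothetical mini-language)."""
    variables = {} if variables is None else variables
    output = [] if output is None else output
    with open(path) as script:
        for line in script:
            words = line.split()
            if not words:
                continue  # skip blank lines
            command, args = words[0], words[1:]
            if command == "include":
                # A reference to another file: read and interpret it right now,
                # sharing the same variables. No separate link step is involved.
                other = os.path.join(os.path.dirname(path), args[0])
                run_script(other, variables, output)
            elif command == "set":
                variables[args[0]] = int(args[1])
            elif command == "print":
                output.append(variables[args[0]])
            else:
                raise ValueError(f"unknown command: {command}")
    return output


# Two small scripts written to a temporary directory for the demonstration.
workdir = tempfile.mkdtemp()
with open(os.path.join(workdir, "constants.txt"), "w") as f:
    f.write("set answer 42\n")
with open(os.path.join(workdir, "main.txt"), "w") as f:
    f.write("include constants.txt\nprint answer\n")

result = run_script(os.path.join(workdir, "main.txt"))
print(result)  # [42]
```

The second file is only opened at the moment the `include` line is reached, which is the point of the paragraph above: the files are "brought together" while the program runs.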

Many such languages have built-in mechanisms to reduce the overhead of this process by creating byte-code versions of the scripts they have interpreted. So there might well be a file /usr/python/lib/math.pyc, for instance, which the interpreter put there after first processing the source and which it can read and interpret faster than the original /usr/python/lib/math.py. But this is not really part of the concept of interpreted languages¹.
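As a rough illustration of that byte-code step in CPython, the built-in `compile()` turns source text into a code object that the interpreter can then execute. The source string and the file name `factorial.py` below are just example values.

```python
import dis

source = "def factorial(n):\n    return 1 if n < 2 else n * factorial(n - 1)\n"

# compile() translates the source text into a code object holding bytecode.
# The file name is only used for error messages; nothing is read from disk.
code = compile(source, "factorial.py", "exec")
print(type(code).__name__)          # code
print(type(code.co_code).__name__)  # bytes

# The interpreter executes that bytecode; no machine code file is produced.
namespace = {}
exec(code, namespace)
print(namespace["factorial"](5))    # 120

# Show the bytecode instructions (the exact listing differs between versions).
dis.dis(namespace["factorial"])
```

A .pyc file is essentially such a code object written to disk so that the translation does not have to be repeated.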

Sometimes, a binary library is part of an interpreted language; depending on the sophistication of the interpreter it can link that library at runtime and then use it. This is most typical for the system modules and stuff which needs to be highly optimized.
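One way to see this in CPython is to ask the import machinery how it would load a given module. The helper `module_kind` below is a hypothetical name, and the result for modules such as `math` depends on how your interpreter was built, so treat it as a sketch.

```python
import importlib.machinery
import importlib.util


def module_kind(name):
    """Report how the running interpreter would load the named module."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        return "not found"
    if spec.origin in ("built-in", "frozen"):
        # Compiled into (or bundled with) the interpreter executable itself.
        return spec.origin
    if isinstance(spec.loader, importlib.machinery.ExtensionFileLoader):
        # A binary shared library that is dynamically loaded at runtime.
        return "extension"
    if isinstance(spec.loader, importlib.machinery.SourceFileLoader):
        # An ordinary .py file that is read and interpreted.
        return "source"
    return "other"


for name in ("sys", "math", "json"):
    print(name, "->", module_kind(name))
```

On many installations `math` shows up as a binary module rather than a .py file, which matches the "highly optimized system modules" case described above.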

But in general one can say that no binary machine code gets generated at all, and nothing is linked at compile time. Actually, there is no real compile time, even though one could call that first processing of the input scripts a compile step.

Footnotes:

¹) The concept of interpreting scripts encompasses neither that "compiling" (pre-translating the source into a faster-to-interpret form) nor that "caching" of this form by storing files like the .pyc files. With respect to your question about linking and splitting programs into several files or modules, these aspects of precompiling and caching are just technical details to speed things up. The concept itself is: read one line of the input script and execute it. Then read the next line, and so on.

Fleurdelis answered 5/11, 2013 at 14:40 Comment(1)
Clear explanation, but I don't understand what your last statement means: "But this is not really part of the concept of interpreted languages." Can you elaborate? - Matless

Well, in Python, modules are loaded, parsed and executed when the interpreter reaches a statement that asks for them (typically an import). There's no separate linking step, but there is of course loading (when the file is requested in the code).

Python does something clever to improve its performance: it compiles a module to bytecode (.pyc files) the first time the module is imported. This substantially speeds up loading the next time the module is imported, because the source does not have to be parsed and compiled again.
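The caching can be triggered by hand with the standard `py_compile` module, which is a convenient way to see where the .pyc file ends up. The module name `helper.py` and the temporary directory are assumptions made for this sketch.

```python
import importlib.util
import os
import py_compile
import tempfile

# Write a small source file to a temporary directory.
workdir = tempfile.mkdtemp()
source_path = os.path.join(workdir, "helper.py")
with open(source_path, "w") as f:
    f.write("def greet():\n    return 'hello'\n")

# Where the interpreter would cache the bytecode for this source file.
expected_cache = importlib.util.cache_from_source(source_path)

# Compile explicitly; importing helper.py would normally trigger the same step.
cache_path = py_compile.compile(source_path)

print(cache_path)
print(os.path.exists(cache_path))  # True
```

In current CPython versions the cached file normally lives in a `__pycache__` directory next to the source and carries the interpreter version in its name.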

So the behavior is more or less:

  1. A file is executed
  2. Inside the file, the interpreter finds a reference to another file
  3. It parses that file and executes it. This means that every class, variable or function definition in it becomes available at runtime.
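The three steps above can be observed directly. In this sketch the module `shapes_demo` and its contents are invented for the demonstration; the file is created on the fly so the example is self-contained.

```python
import importlib
import os
import sys
import tempfile

# Create a small module file in a temporary directory.
workdir = tempfile.mkdtemp()
with open(os.path.join(workdir, "shapes_demo.py"), "w") as f:
    f.write(
        "def area(width, height):\n"
        "    return width * height\n"
        "\n"
        "# Top-level code runs when the module is imported.\n"
        "computed_at_import = area(2, 3)\n"
    )

sys.path.insert(0, workdir)    # tell the interpreter where to look
importlib.invalidate_caches()  # make sure the new file is noticed

was_loaded_before = "shapes_demo" in sys.modules

# The file is located, parsed and executed when this statement runs.
import shapes_demo

print(was_loaded_before)                           # False
print(shapes_demo.area(4, 5))                      # 20
print(shapes_demo.computed_at_import)              # 6
print(sys.modules["shapes_demo"] is shapes_demo)   # True
```

After the first import the module object is kept in `sys.modules`, so a later `import shapes_demo` elsewhere in the program reuses it instead of executing the file again.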

That is the process in very general terms. Of course, there are optimizations and caches to improve performance.

Hope this helps!

Nevile answered 5/11, 2013 at 14:41 Comment(3)
Great! Another question: how is linking and loading performed in JIT-compiled languages (e.g. Java, C#)? Do they have linkers and loaders? - Matless
Hah, that's an even tougher question :D. I know just the basics about JIT and IronPython. I suppose its implementation rests on the DLR (Dynamic Language Runtime): at the end of the day, IronPython is a Python interpreter written in .NET that builds its object model on top of the DLR in order to be accessible from the .NET runtime. I'm answering very generically here because that is a specific question of its own! Hope this sheds some light, or you can ask about it here at SO :) - Nevile
@PauloBu So does the compilation-to-bytecode stage already resolve which module a function call references, so that all the PVM has to do is load these modules and jump to the correct line? - Elissa
