What do 'statically linked' and 'dynamically linked' mean?

5

300

I often hear the terms 'statically linked' and 'dynamically linked', often in reference to code written in C, C++ or C#. What are they, what exactly are they talking about, and what are they linking?

Boondoggle answered 22/11, 2008 at 23:9 Comment(0)
546

There are (in most cases, discounting interpreted code) two stages in getting from source code (what you write) to executable code (what you run).

The first is compilation which turns source code into object modules.

The second, linking, is what combines object modules together to form an executable.

The distinction is made for, among other things, allowing third party libraries to be included in your executable without you seeing their source code (such as libraries for database access, network communications and graphical user interfaces), or for compiling code in different languages (C and assembly code for example) and then linking them all together.

When you statically link a file into an executable, the contents of that file are included at link time. In other words, the contents of the file are physically inserted into the executable that you will run.

When you link dynamically, a pointer to the file being linked in (the file name of the file, for example) is included in the executable and the contents of said file are not included at link time. It's only when you later run the executable that these dynamically linked files are brought in and they're only brought into the in-memory copy of the executable, not the one on disk.

It's basically a method of deferred linking. There's an even more deferred method (called late binding on some systems) that won't bring in the dynamically linked file until you actually try to call a function within it.

Statically-linked files are 'locked' to the executable at link time so they never change. A dynamically linked file referenced by an executable can change just by replacing the file on the disk.

This allows updates to functionality without having to re-link the code; the loader re-links every time you run it.

This is both good and bad. On one hand, it allows easier updates and bug fixes; on the other, it can lead to programs ceasing to work if the updates are incompatible. This is sometimes responsible for the dreaded "DLL hell" that some people mention: applications can break if you replace a dynamically linked library with one that's not compatible (developers who do this should expect to be hunted down and punished severely, by the way).


As an example, let's look at the case of a user compiling their main.c file for static and dynamic linking.

Phase     Static                    Dynamic
--------  ----------------------    ------------------------
          +---------+               +---------+
          | main.c  |               | main.c  |
          +---------+               +---------+
Compile........|.........................|...................
          +---------+ +---------+   +---------+ +--------+
          | main.o  | | crtlib  |   | main.o  | | crtimp |
          +---------+ +---------+   +---------+ +--------+
Link...........|..........|..............|...........|.......
               |          |              +-----------+
               |          |              |
          +---------+     |         +---------+ +--------+
          |  main   |-----+         |  main   | | crtdll |
          +---------+               +---------+ +--------+
Load/Run.......|.........................|..........|........
          +---------+               +---------+     |
          | main in |               | main in |-----+
          | memory  |               | memory  |
          +---------+               +---------+

You can see in the static case that the main program and C runtime library are linked together at link time (by the developers). Since the user typically cannot re-link the executable, they're stuck with the behaviour of the library.

In the dynamic case, the main program is linked with the C runtime import library (something which declares what's in the dynamic library but doesn't actually define it). This allows the linker to link even though the actual code is missing.

Then, at runtime, the operating system loader does a late linking of the main program with the C runtime DLL (dynamic link library or shared library or other nomenclature).

The owner of the C runtime can drop in a new DLL at any time to provide updates or bug fixes. As stated earlier, this has both advantages and disadvantages.

Mahratta answered 22/11, 2008 at 23:14 Comment(22)
Please correct me if I'm wrong, but on Windows, software tends to include its own libraries with the install, even if they're dynamically linked. On many Linux systems with a package manager, many dynamically linked libraries ("shared objects") are actually shared between software.Nocturnal
@PaulF: things like the Windows common controls, DirectX, .NET and so on ship a lot with the applications whereas on Linux, you tend to use apt or yum or something like that to manage dependencies - so you're right in that sense. Win Apps that ship their own code as DLLs tend not to share them.Mahratta
It is true that you can update just the dynamic library if ONLY the functionality is updated. But what if new exposed functions or modified signatures are present in the DLL? I think you need to link again.Adherence
There's a special place reserved in the ninth circle of hell for those that update their DLLs and break backward compatibility. Yes, if interfaces disappear or are modified, then the dynamic linking will fall in a heap. That's why it shouldn't be done. By all means add a function2() to your DLL but don't change function() if people are using it. Best way to handle that is to recode function() in such a way that it calls function2(), but don't change the signature of function().Mahratta
@Paul Fisher, I know this is late but... the library that ships with a Windows DLL isn't the full library, it's just a bunch of stubs that tell the linker what the DLL contains. The linker can then automatically put the information into the .exe for loading the DLL, and the symbols don't show up as undefined.Infect
One other difference worth noting is that with a static library, the linker can generate a direct call to a function in the library. With dynamic libraries, the function call is always via a function address, so the function call is always an indirect call. This is the very slight performance penalty that causes people to ask if dynamic library code is slower.Malia
There's also JIT-ing. And link-time-optimization. Just to make it a bit more diverse.Killarney
NOOB here: I don't understand the flowchart. Does "crt" in "crtimp", "crtdll" and "crtlib" mean "C RunTime"? Does lib stand for library, and is "crtdll" a generic example name for a DLL to be linked, while "crtlib" is just a generic name for an object file to be linked to main.o? Why don't "crtlib" and "crtdll" have a ".o" suffix?Andre
@Santropedro, you're correct on all counts re the meaning of the lib, import and DLL names. The suffix is convention only so don't read too much into that (for example, the DLL may have a .dll or .so extension) - think of the answer as explaining the concepts rather than being an exact description. And, as per the text, this is an example showing static and dynamic linking for just the C runtime files so, yes, that's what crt indicates in all of them.Mahratta
dope diagram paxdiabloFulks
Fantastic answer. "it can lead to programs ceasing to work if the updates are incompatible". So from what I know, ld validates whether the dynamic library has signatures that satisfy the needs of the main executable. If it is incompatible then things will fail at compile time, not run time. So I'm confused about why it would fail at runtime. Do you mean that occasionally, even though the function signature is kept the same, the dynamic library can introduce incompatible updates because e.g. something synchronous was changed to asynchronous while the signature was kept the same, or some function now takes 5x the time?Grot
Also in your diagram, you're showing a single entity in memory for dynamically loaded libraries. But aren't dynamic libraries their own process? Shouldn't your diagram have two entities for the section of what's loaded in memory? Or perhaps that's what you meant, but I'm just misinterpreting it. FWIW I'm speaking from the mindset of an iOS developer...but don't think things are that different...Grot
@mfaani, they tend not to be separate processes. Rather a shared library is a piece of code that's brought into the memory of the process that requested it.Mahratta
Gotcha. Can you also answer my previous comment above it? I think you just missed that...Grot
Sure, @mfaani. I'm not sure ld (at compile time) does check parameter types. Specifically, I'm not sure the symbol table carries enough information for this. Things like C++ name mangling may impose a method to allow it but that's sort of an artificiality on top of the linker. But, even if it did check, there's nothing to stop somebody building a new incompatible shared library and dropping it over the previous one (that's one of the strengths of using them).Mahratta
The loader itself (responsible for loading shared objects and fixing up symbol references as needed) doesn't, from my knowledge, do any other checking.Mahratta
I think I get it. For a third party iOS app, I can link with foo (v1) dylib. Submit to app store. Then 2 months later, link with foo (v2) dylib. Submit to app store again. I don't have any way to download the dylib into the app after I've submitted to app store. Nor can I submit to app store without compiling with the dylib. So anything incompatible will be caught. The exception to this is: Apple's OS dylibs. They're used to update the OS itself. Apple does these OS version bumps outside the scope of third-party apps compilation. As a result you can have incompatible dylibs. Is that correct?Grot
Unsure, @mfaani, iOS probably acts differently to other systems. I would think that, even though you dynalink, your app would hopefully get its own copy of the library (in its own area). Otherwise, installing app1 may totally break app2. I can't see Apple allowing that possibility, though I've been wrong before :-)Mahratta
I'm not sure I follow what you meant by "installing app1 may totally break app2". What I meant was: Example: FooApp is compiled with UIKit (iOS16). Then Apple introduced UIKit (iOS17). FooApp doesn't need to get recompiled with UIKit (iOS17) i.e. you don't need another app store submission. Like you can submit to app store 3 yrs ago and have your app still work 3 new iOS versions later, all because Apple's UIKit updates weren't breaking. If they did break though then yeah, folks would go and hunt Apple Engineers down...because the (OS) dylib from Apple broke their app...Grot
Let us continue this discussion in chat.Mahratta
Thinking of non Apple things. Let's say two apps both use a btree library, not included with Apple standard stuff. If they shared it, there could possibly be breakage when one app updates it. If each got their own, that couldn't happen. Still shared but only within a single app.Mahratta
That's known as DLL Hell, by the way, and there are specific ways to avoid it, using version numbers and symbolic links. Looks like this has also been explained in the chat, which I'll start using in preference to here.Mahratta
258

I think a good answer to this question ought to explain what linking is.

When you compile some C code (for instance), it is translated to machine language. Just a sequence of bytes which, when run, causes the processor to add, subtract, compare, "goto", read memory, write memory, that sort of thing. This stuff is stored in object (.o) files.

Now, a long time ago, computer scientists invented this "subroutine" thing. Execute-this-chunk-of-code-and-return-here. It wasn't too long before they realised that the most useful subroutines could be stored in a special place that allowed them to be used by any program that needed them.

Now in the early days programmers would have to punch in the memory address that these subroutines were located at. Something like CALL 0x5A62. This was tedious and problematic should those memory addresses ever need to be changed.

So, the process was automated. You write a program that calls printf(), and the compiler doesn't know the memory address of printf. So the compiler just writes CALL 0x0000, and adds a note to the object file saying "must replace this 0x0000 with the memory location of printf".

Static linkage means that the linker program (the GNU one is called ld) adds printf's machine code directly to your executable file, and changes the 0x0000 to the address of printf. This happens when your executable is created.

Dynamic linkage means that the above step doesn't happen. The executable file still has a note that says "must replace 0x0000 with the memory location of printf". The operating system's loader needs to find the printf code, load it into memory, and correct the CALL address, each time the program is run.

It's common for programs to call some functions which will be statically linked and other functions which are dynamically linked (which group standard library functions like printf fall into depends on the platform and build options; many systems link the C library dynamically by default). The static ones "become part" of the executable and the dynamic ones "join in" when the executable is run.

There are advantages and disadvantages to both methods, and there are differences between operating systems. But since you didn't ask, I'll end this here.

Gynandrous answered 23/11, 2008 at 0:2 Comment(3)
Artelius, I am looking for something more in-depth on your explanation of how these crazy low-level things work. Please reply with which books we should read to get in-depth knowledge about the above. Thank you.Occupy
Sorry, I can't suggest any books. You should learn assembly language first. Then Wikipedia can give a decent overview of such topics. You may want to look at the GNU ld documentation.Gynandrous
This should be the top answer. Very clear and concise. Thank you @Gynandrous and Peter.Scrope
43

Statically linked libraries are linked in at build time (by the linker, after compilation). Dynamically linked libraries are loaded at run time. Static linking bakes the library's bits into your executable. Dynamic linking only bakes in a reference to the library; the bits for the dynamic library exist elsewhere and could be swapped out later.

Subservience answered 22/11, 2008 at 23:13 Comment(0)
26

None of the above posts actually shows how to statically link something and check that you did it correctly, so I will address that here:

A simple C program

#include <stdio.h>

int main(void)
{
    printf("This is a string\n");
    return 0;
}

Dynamically link the C program

gcc simpleprog.c -o simpleprog

And run file on the binary:

file simpleprog 

That will show it is dynamically linked, with output something along the lines of:

simpleprog: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked (uses shared libs), for GNU/Linux 2.6.26, BuildID[sha1]=0xf715572611a8b04f686809d90d1c0d75c6028f0f, not stripped

Instead let us statically link the program this time:

gcc simpleprog.c -static -o simpleprog

Running file on this statically linked binary will show:

file simpleprog 

Now the result will be

simpleprog: ELF 64-bit LSB executable, x86-64, version 1 (GNU/Linux), statically linked, for GNU/Linux 2.6.26, BuildID[sha1]=0x8c0b12250801c5a7c7434647b7dc65a644d6132b, not stripped

And you can see it is happily statically linked. Sadly however not all libraries are simple to statically link this way and may require extended effort using libtool or linking the object code and C libraries by hand.

Luckily many embedded C libraries like musl offer static linking options for nearly all if not all of their libraries.

Now strace the binary you have created and you can see that there are no libraries accessed before the program begins:

strace ./simpleprog

Now compare with the output of strace on the dynamically linked program and you will see that the statically linked version's strace is much shorter!

Sawmill answered 6/10, 2015 at 3:30 Comment(0)
2

(I don't know C# but it is interesting to have a static linking concept for a VM language)

Dynamic linking involves knowing how to find required functionality for which your program holds only a reference. Your language runtime or OS searches for a piece of code matching the reference on the filesystem, on the network, or in a compiled code cache, and then takes several measures, such as relocation, to integrate it into your program image in memory. All of this is done at run time. It can be done either manually or by the compiler. This gives you the ability to update, with the risk of messing up (namely, DLL hell).

Static linking is done at build time: you tell the toolchain where all the functional parts are and instruct it to integrate them. There is no searching, no ambiguity, and no ability to update without a rebuild. All your dependencies are physically one with your program image.

Setscrew answered 22/11, 2008 at 23:49 Comment(0)
