Why is the symbol defined in the same .a file but a different .o file not used as a fallback for a weak reference?

I have the following tree:

.
├── func1.c
├── func2.c
├── main.c
├── Makefile
├── override.c
└── weak.h
  • main.c invokes func1().
  • func1() invokes func2().
  • weak.h declares func2() as weak.
  • override.c provides an override version of func2().

func1.c

#include <stdio.h>

void func2(void);

void func1 (void)
{
    func2();
}

func2.c

#include <stdio.h>

void func2 (void)
{
    printf("in original func2()\n");
}

main.c

#include <stdio.h>

void func1();

void func2();

int main(void)
{
    func1();
}

override.c

#include <stdio.h>

void func2 (void)
{
    printf("in override func2()\n");
}

weak.h

__attribute__((weak))
void func2 (void); // <==== weak attribute on declaration

Makefile

ALL:
    rm -f *.a *.o
    gcc -c override.c -o override.o
    gcc -c func1.c -o func1.o -include weak.h # weak.h is used to tell func1.c that func2() is weak
    gcc -c func2.c -o func2.o
    ar cr all_weak.a func1.o func2.o
    gcc main.c all_weak.a override.o -o main

All of this builds and runs fine, printing:

in override func2()

But if I remove the override version of func2() from override.c as below:

#include <stdio.h>

// void func2 (void)
// {
//     printf("in override func2()\n");
// }

The build passes, but the final binary fails at runtime with:

Segmentation fault (core dumped)

In the symbol table of ./main, func2 is an unresolved weak symbol:

000000000000065b T func1
                 w func2 <=== func2 is a weak symbol with no default implementation

Why didn't it fall back to the func2() in the original func2.c? After all, all_weak.a already contains an implementation in func2.o:

func1.o:
0000000000000000 T func1
                 w func2 <=== func2 is [w]eak with no implementation
                 U _GLOBAL_OFFSET_TABLE_

func2.o:
0000000000000000 T func2   <=========== HERE! a strong symbol!
                 U _GLOBAL_OFFSET_TABLE_
                 U puts

ADD 1

It seems the arrangement of translation units also affects whether the fallback to the original function works.

If I put the func2() implementation into the same file/translation unit as func1(), as below, the fallback to the original func2() works.

func1.c

#include <stdio.h>

void func2 (void)
{
    printf("in original func2()\n");
}

void func1 (void)
{
    func2();
}

The symbols of all_weak.a are:

func1.o:
0000000000000013 T func1
0000000000000000 W func2 <==== func2 is still [W]eak but has a default implementation
                 U _GLOBAL_OFFSET_TABLE_
                 U puts

The code can fall back to the original func2() correctly if no override is provided.

This link also mentions that, to work with the GCC alias attribute, the translation unit arrangement must be considered as well.

alias ("target")

The alias attribute causes the declaration to be emitted as an alias for another symbol, which must be specified. For instance,

void __f () { /* Do something. */; }
void f () __attribute__ ((weak, alias ("__f")));

defines f to be a weak alias for __f. In C++, the mangled name for the target must be used. It is an error if __f is not defined in the same translation unit.
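As a minimal sketch of the weak alias pattern described in that quotation, assuming GCC or Clang on an ELF target such as Linux: the names default_answer and answer are hypothetical and are not part of my project. The alias target is defined in the same translation unit, as the documentation requires.

```c
/* Hypothetical names, for illustration only. */

/* The alias target. It must be defined (not merely declared)
   in this same translation unit. */
int default_answer(void)
{
    return 42;
}

/* answer() is emitted as a weak alias for default_answer().
   A strong definition of answer() in another object that is part
   of the link would take precedence over this alias. */
int answer(void) __attribute__((weak, alias("default_answer")));
```

If nothing else in the link defines answer(), a call to answer() ends up running default_answer().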

According to Wikipedia:

The nm command identifies weak symbols in object files, libraries, and executables. On Linux a weak function symbol is marked with "W" if a weak default definition is available, and with "w" if it is not.

ADD 2 - 7:54 PM 8/7/2021

(Huge thanks to @n. 1.8e9-where's-my-share m. )

I tried these:

  • Add the __attribute__((weak)) to the func2() definition in func2.c.

  • Remove the -include weak.h from the Makefile.

Now these files look like this:

func2.c

#include <stdio.h>

__attribute__((weak))
void func2 (void)
{
    printf("in original func2()\n");
}

Makefile:

ALL:
    rm -f *.a *.o
    gcc -c override.c -o override.o
    gcc -c func1.c -o func1.o
    gcc -c func2.c -o func2.o
    ar cr all_weak.a func1.o func2.o
    gcc main.c all_weak.a -o main_original   # <=== no override.o
    gcc main.c all_weak.a override.o -o main_override # <=== override.o

The output is this:

xxx@xxx-host:~/weak_fallback$ ./main_original 
in original func2() <===== successful fall back

xxx@xxx-host:~/weak_fallback$ ./main_override
in override func2() <===== successful override

So, the conclusion is:

  • If you put weak on a function declaration (as I did in weak.h), it essentially tells the linker not to resolve it.

  • If you put weak on a function definition (as I did in func2.c), it essentially tells the linker to use it as a fallback if no strong version is found.

  • If you put weak on a function declaration, you had better provide an override version in a .o file to the linker (as I did with override.o). It seems the linker is still willing to resolve the symbol from a .o file in this situation. This is the case when you cannot modify the source but still want to override some function.

And some quotation from here:

The linker will only search through libraries to resolve a reference if it cannot resolve that reference after searching all input objects. If required, the libraries are searched from left to right according to their position on the linker command line. Objects within the library will be searched by the order in which they were archived. As soon as armlink finds a symbol match for the reference, the searching is finished, even if it matches a weak definition. The ELF ABI section 4.6.1.2 says: "A weak definition does not change the rules by which object files are selected from libraries. However, if a link set contains both a weak definition and a non-weak definition, the non-weak definition will always be used." The "link set" is the set of objects that have been loaded by the linker. It does not include objects from libraries that are not required. Therefore archiving two objects where one contains the weak definition of a given symbol and the other contains the non-weak definition of that symbol, into a library or separate libraries, is not recommended.

ADD 3 - 8:47 AM 8/8/2021

As @n.1.8e9-where's-my-sharem commented:

Comment 1:

"weak" on a symbol which is not a definition means "do not resolve this symbol at link time". The linker happily obeys.

Comment 2:

"on a symbol which is not a definition" is wrong, should read "on an undefined symbol".

I think by "on an undefined symbol", he means "an undefined symbol within current translation unit". In my case, when I:

  • defined the func2() in a separated func2.c file
  • and compiled func1.c with weak.h

Together these essentially tell the linker not to resolve the func2() used in the translation unit func1.c. But it seems this "do not" only applies to .a files. If I link another .o file alongside the .a file, the linker is still willing to resolve func2(). And if func2() is also defined in func1.c, the linker resolves it as well. Subtle!

(So far, all these conclusions are based on my own experiments, and they are hard to summarize. If anyone can find an authoritative source, please feel free to comment or reply. Thanks!)

(Thanks to n. 1.8e9-where's-my-share m.'s comment.)

And a related thread:

Override a function call in C

Some afterthought - 9:55 PM 8/8/2021

There's no rocket science behind these subtle behaviors. It just depends on how the linker is implemented. Sometimes the documentation is vague, and you have to try things out and deal with what you find. (If there is some big idea behind all this, please correct me and I will be more than grateful.)

Ensample answered 7/8, 2021 at 3:48 Comment(5)
"weak" on a symbol which is not a definition means "do not resolve this symbol at link time". The linker happily obeys. - Indicative
@n.1.8e9-where's-my-sharem. I didn't realize the difference between weak on declaration and weak on definition. Thanks for bringing this up. Could you tell me where it is specified? - Ensample
@n.1.8e9-where's-my-sharem. If you can change your comment to an answer, I will be more than happy to mark it as the answer. - Ensample
Sorry, "on a symbol which is not a definition" is wrong, should read "on an undefined symbol". I cannot say where it is specified. GNU documentation is vague as usual. I think there are duplicates, e.g. here. - Indicative
@n.1.8e9-where's-my-sharem. It seems the try-and-see approach is sometimes inevitable when using the toolchain. Btw, I updated my post and quoted your comments. If you feel it inappropriate, I can remove them. - Ensample

these subtle behaviors

There isn't really anything subtle here.

  1. A weak definition means: use this symbol unless another strong definition is also present, in which case use the other symbol.

    Normally two same-named symbols result in a multiply-defined link error, but when all but one of the definitions are weak, no multiply-defined error is produced.

  2. A weak (unresolved) reference means: don't consider this symbol when deciding whether to pull an object which defines this symbol out of an archive library or not (an object may still be pulled in if it satisfies a different strong undefined symbol).

    Normally if the symbol is unresolved after all objects are selected, the linker will report an unresolved symbol error. But if the unresolved symbol is weak, the error is suppressed.

That's really all there is to it.
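To make the two cases concrete, here is a minimal sketch, assuming GCC or Clang on an ELF target such as Linux. The names optional_hook, default_name and call_hook_if_present are hypothetical and do not come from the question. On such targets an unresolved weak function reference ends up with address zero, which is why an unguarded call crashes at run time instead of failing at link time, and why code that uses weak references usually tests the address first.

```c
/* Hypothetical names, for illustration only. */

/* Case 2: a weak reference. It is declared weak and, in this sketch,
   defined nowhere. The link still succeeds; the symbol is left
   unresolved and its address is null at run time. */
__attribute__((weak)) void optional_hook(void);

/* Case 1: a weak definition. It is used unless a strong definition
   with the same name is also part of the link. */
__attribute__((weak)) const char *default_name(void)
{
    return "original";
}

/* Calls the hook only if some object in the link defined it.
   Returns 1 if the hook was called, 0 if it was absent. */
int call_hook_if_present(void)
{
    if (optional_hook) {        /* null when the reference is unresolved */
        optional_hook();
        return 1;
    }
    return 0;
}
```

With no other object defining optional_hook, call_hook_if_present() skips the call. Calling optional_hook() without the check would jump to address zero, which matches the segmentation fault described in the question.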

Update:

You are repeating an incorrect understanding in the comments.

"What makes me feel subtle is, for a weak reference, the linker doesn't pull an object from an archive library, but still check a standalone object file."

This is entirely consistent with the answer above. When a linker deals with an archive library, it has to make a decision: whether or not to select the contained foo.o into the link. It is that decision that is affected by the type of reference.

When bar.o is given on the link line as a "standalone object file", the linker makes no decisions about it -- bar.o will be selected into the link.

"And if that object happens to contain a definition for the weak reference, will the weak reference be also resolved by the way?"

Yes.

"Even the weak attribute tells the linker not to."

This is the apparent root of the misunderstanding: the weak attribute doesn't tell the linker not to resolve the reference; it only tells the linker (pardon the repetition) "don't consider this symbol when deciding whether to pull an object which defines this symbol out of archive library".

"I think it's all about whether or not an object containing a definition for that weak reference is pulled in for linking."

Correct.

"Be it a standalone object or from an archive lib."

Wrong: a standalone object is always selected into the link.

Ingoing answered 10/8, 2021 at 6:48 Comment(8)
Thanks. What feels subtle to me is that, for a weak reference, the linker doesn't pull an object from an archive library, but still checks a standalone object file. I don't know why it is designed this way. - Ensample
And if an object (from a lib archive) satisfies a different strong undefined symbol, it is still pulled in. And if that object happens to contain a definition for the weak reference, will the weak reference also be resolved along the way, even though the weak attribute tells the linker not to? I don't know yet but I will try it out. It may make things even more subtle. - Ensample
I think it's all about whether or not an object containing a definition for that weak reference is pulled in for linking. Be it a standalone object or from an archive lib. - Ensample
Thanks for the update. For a weak attribute on a declaration, "Don't resolve a symbol" and "Don't consider this symbol when deciding whether to pull an object from an archive" do differ. - Ensample
@Ensample Yes: they are different. "Don't resolve" is plain wrong. I don't know where that came from. - Ingoing
Please see the 1st comment to my question... I guess I need to search for some authoritative reference, or directly check the linker's code... After all, all secrets lie in the implementation... - Ensample
@Ensample That comment is wrong, as you have trivially observed yourself. What n...m probably meant is "if this symbol remains unresolved, that's not an error". - Ingoing
That explains why there is no link-time error but a run-time error. I can't believe a linker allows this to happen. I always believed a linker should resolve everything. - Ensample
