How to catch unintentional function interpositioning?

Reading through my book Expert C Programming, I came across the chapter on function interpositioning and how it can lead to some serious, hard-to-find bugs if done unintentionally.

The example given in the book is the following:

my_source.c

mktemp() { ... }

main() {
  mktemp();
  getwd();
}

libc

mktemp(){ ... }
getwd(){ ...; mktemp(); ... }

According to the book, what happens in main() is that mktemp() (a standard C library function) is interposed by the implementation in my_source.c. Although having main() call my implementation of mktemp() is intended behavior, having getwd() (another C library function) also call my implementation of mktemp() is not.

Apparently, this example was a real-life bug that existed in SunOS 4.0.3's version of lpr. The book goes on to explain that the fix was to add the keyword static to the definition of mktemp() in my_source.c, although changing the name altogether should have fixed this problem as well.

This chapter leaves me with some unresolved questions that I hope you guys could answer:

  1. Does GCC have a way to warn about function interposition? We certainly don't ever intend on this happening and I'd like to know about it if it does.
  2. Should our software group adopt the practice of putting the keyword static in front of all functions that we don't want to be exposed?
  3. Can interposition happen with functions introduced by static libraries?

Thanks for the help.

EDIT

I should note that my question is not just aimed at interposing over standard C library functions, but also functions contained in other libraries, perhaps 3rd party, perhaps ones created in-house. Essentially, I want to catch any instance of interpositioning regardless of where the interposed function resides.

Tetraspore answered 6/5, 2010 at 16:53 Comment(0)
L
1

It sounds like what you want is for the tools to detect that there are name conflicts in functions, i.e. you don't want your externally accessible function names to accidentally have the same name as, and therefore 'override' or hide, functions in a library.

There was a recent SO question related to this problem: Linking Libraries with Duplicate Class Names using GCC

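Using the linker's --whole-archive option on all the libraries you link against may help, since it pulls in every object from each archive and so gives the linker a chance to complain about duplicate definitions (but as I mentioned in the answer over there, I really don't know how well this works or how easy it is to convince builds to apply the option to all libraries).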

Lascivious answered 6/5, 2010 at 18:37 Comment(1)
Thanks for the answer, looks like you got to double dip on that one! =) Tetraspore
T
3

This is really a linker issue.

When you compile a bunch of C source files, the compiler will create an object file for each one. Each .o file will contain a list of the public functions in this module, plus a list of functions that are called by code in the module but are not actually defined there, i.e. functions that this module is expecting some library to provide.

When you link a bunch of .o files together to make an executable, the linker must resolve all of these missing references. This is the point where interposing can happen. If there are unresolved references to a function called "mktemp" and several libraries provide a public function with that name, which version should it use? There's no easy answer to this, and yes, odd things can happen if the wrong one is chosen.

So yes, it's a good idea in C to "static" everything unless you really do need to use it from other source files. In fact in many other languages this is the default behavior and you have to mark things "public" if you want them accessible from outside.
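
As a minimal sketch of that advice (the file name and the names spool_mktemp and make_spool_name are hypothetical, not taken from lpr or the book), a helper marked static gets internal linkage, so no library can end up bound to it:

```c
/* spool.c: hypothetical file illustrating internal vs. external linkage. */

/* static: internal linkage. The name is local to this translation unit,
   so it cannot interpose on a library function of the same name. */
static char *spool_mktemp(char *tmpl)
{
    char *end = tmpl;

    while (*end != '\0')                 /* find the end of the template */
        end++;
    while (end > tmpl && end[-1] == 'X')
        *--end = 'a';                    /* toy stand-in for real name generation */
    return tmpl;
}

/* No static: external linkage. This is the only name that other files
   (and the linker) can see. */
char *make_spool_name(char *tmpl)
{
    return spool_mktemp(tmpl);
}
```

In an unoptimized build, running nm on the resulting object file should show make_spool_name as an external symbol (uppercase T with GNU nm) and spool_mktemp as a local one (lowercase t).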

Tropic answered 6/5, 2010 at 17:24 Comment(1)
I wish I could choose two answers because this is an excellent one. I chose the other because it actually provided a solution. I hope this +1 will suffice. Tetraspore
W
1

Purely formally, the interpositioning you describe is a straightforward violation of the C language definition rules (the One Definition Rule, in C++ parlance). Any decent compiler must either detect these situations or provide options for detecting them. It is simply illegal to define more than one externally visible function with the same name in the C language, regardless of where these functions are defined (standard library, other user library, etc.).

I understand that many platforms provide means to customize the [standard] library behavior by defining some standard functions as weak symbols. While this is indeed a useful feature, I believe the compilers must still provide the user with means to enforce the standard diagnostics (on per-function or per-library basis preferably).

So, again, you should not worry about interpositioning if you have no weak symbols in your libraries. If you do (or if you suspect that you do), you have to consult your compiler documentation to find out whether it offers a means to inspect the weak symbol resolution.

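In GCC, for example, you can disable the weak symbol functionality by using -fno-weak, but this basically kills everything related to weak symbols, which is not always desirable. (Note that the GCC manual lists -fno-weak among the C++ options and describes it as existing only for testing, so it is not suited to routine builds.)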
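
For readers unfamiliar with weak symbols, here is a minimal sketch using the __attribute__((weak)) extension of GCC and Clang, assuming an ELF platform. The function names are hypothetical.

```c
/* A weak definition: the linker uses it only if no strong (ordinary)
   definition of log_level is linked in. GCC/Clang extension, not ISO C. */
__attribute__((weak)) int log_level(void)
{
    return 1;   /* library default */
}

/* Callers go through the symbol, so they pick up whichever
   definition the linker selected. */
int effective_log_level(void)
{
    return log_level();
}
```

If another object file in the link defines a plain int log_level(void), the linker prefers that strong definition without reporting a duplicate symbol. Built on its own, effective_log_level() returns the weak default.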

Wakerife answered 6/5, 2010 at 17:19 Comment(5)
Since this is C, every function you call must be declared in some header file. So yes, in principle the compiler should be able to detect functions of yours that have the same name as (and thus would interpose on) ones in any of the libraries you're using, and this will catch most of these problems. However, because all symbols in a C source file are public by default, the potential for interposing is always there in C if the writers of the libraries were lax about using "static". In other words, there can very easily be public functions in a library that do not appear in its header file. Tropic
As long as the prototype declared in the header file matches, the compiler would have no reason to generate an error. The only time I can imagine an error being generated is if the linker notices multiple symbols with the same name. Prospero
@joefis: I don't see the connection to header files. You can define a function that "conflicts" with another function not declared in your translation unit at all. C does not scope linking space. All external symbols are piled up in one heap to be resolved. And again, the problem must exist for weak symbols only. Redefinition of normal (strong) symbols must be reported by the compiler (linker). If the compiler doesn't report, it is a serious problem with the compiler. Wakerife
A helpful system should include within the standard headers something to inform the linker that it should use a standard-library function and squawk if any duplicate exists. If code uses an identifier whose name later gets used for a standard-library function defined in a header that code does not #include, the linker should generally silently ignore the definition in the standard library (it may be desirable to have a "lint link" report it, but it should also be possible to get clean builds without having to rename the function in the source). Bowie
Alternatively (and IMHO this would in many ways be a better approach) a system could give any new standard-library functions reserved names, but then define macros that chain to them. So if a later version of the Standard library adds int foo(int), defined in <foo.h> it could include int __foo(int){...} in the library and #define foo(x) __foo(x) int foo(int) in <foo.h>. This would have the bonus that if code tries to use foo() without including <foo.h> it would generate a link error. Bowie
F
0

If the function does not need to be accessed outside of the C file it lives in then yes, I would recommend making the function static.

One thing you can do to help catch this is to use an editor that has configurable syntax highlighting. I personally use SciTE, and I have configured it to display all standard library function names in red. That way, it's easy to spot if I am re-using a name I shouldn't be using (nothing is enforced by the compiler, though).

Fermentation answered 6/5, 2010 at 17:3 Comment(0)
D
0

It's relatively easy to write a script that runs nm -o on all your .o files and your libraries and checks to see if an external name is defined both in your program and in a library. It's just one of the many sane, sensible services that the Unix linker doesn't provide because it's stuck in 1974, looking at one file at a time. (Try putting libraries in the wrong order and see if you get a useful error message!)
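
The core of such a checker might look like the sketch below, written in C rather than as a shell script. It assumes the default GNU/BSD nm line layout of "file:value type name"; the function name external_definition is hypothetical. It also relies on the nm convention that uppercase type letters mark external symbols, which has a few exceptions, so treat the result as a first filter.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Parse one line of `nm -o` output, assumed to look like
 *   "my_source.o:0000000000000000 T mktemp"
 * Returns 1 and copies the symbol name into `name` if the line describes
 * an externally visible definition (weak ones such as W included);
 * returns 0 for undefined (U) or local (lowercase) symbols and for
 * lines that do not match the assumed layout. */
int external_definition(const char *line, char *name, size_t name_size)
{
    const char *sp = strrchr(line, ' ');    /* space before the symbol name */
    size_t len;
    char type;

    if (sp == NULL || sp - line < 2 || sp[-2] != ' ')
        return 0;                           /* not "<value> <type> <name>" */

    type = sp[-1];
    if (!isupper((unsigned char)type) || type == 'U')
        return 0;

    len = strcspn(sp + 1, "\r\n");          /* name runs to the end of line */
    if (len == 0 || len >= name_size)
        return 0;
    memcpy(name, sp + 1, len);
    name[len] = '\0';
    return 1;
}
```

Feed it each line of nm -o output for your object files and for the libraries, collect the names for each side, and report any name that appears as a definition on both.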

Deaton answered 7/5, 2010 at 2:49 Comment(1)
I was actually thinking about scripting nm and I think I may end up still doing that. But to your last question, look at ld's -( option to group libraries that can be repeatedly searched until all undefined symbols are found. I found this gem today whilst searching ld's manpage for an answer to my question. Tetraspore
A
0

Interpositioning occurs when the linker is trying to link separate modules. It cannot occur within a module: if there are duplicate symbols in a module, the linker will report this as an error.

For *nix linkers, unintended interpositioning is a problem, and it is difficult for the linker to guard against it. For the purposes of this answer, consider the two linking stages:

  1. The linker links translation units into modules (basically applications or libraries).
  2. The linker links any remaining unfound symbols by searching in modules.

Consider the scenario described in 'Expert C Programming' and in SiegeX's question. The linker first tries to build the application module. It sees that the symbol mktemp() is an external and tries to find a function definition for the symbol. The linker finds the definition for the function in the object code of the application module and marks the symbol as found. At this stage the symbol mktemp() is completely resolved. It is not considered in any way tentative so as to allow for the possibility that another module might define the symbol. In many ways this makes sense, since the linker should first try to resolve external symbols within the module it is currently linking. It is only unfound symbols that it searches for when linking in other modules. Furthermore, since the symbol has been marked as resolved, the linker will use the application's mktemp() in any other cases where it needs to resolve this symbol. Thus the application's version of mktemp() will be used by the library.

A simple way to guard against the problem is to try to make all external symbols in your application or library unique. For modules that are only going to be shared on a limited basis, this can fairly easily be done by appending a unique identifier to every external symbol in your module.

For modules that are widely shared, making up unique names is a problem.
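
One common way to apply a unique identifier is as a prefix generated by a macro, so the convention is hard to forget. This is only a sketch: the acme_spool_ prefix and the function are hypothetical.

```c
/* Hypothetical project prefix: every external name in this module gets it. */
#define ACME_SPOOL(name) acme_spool_##name

/* Expands to: int acme_spool_count_pending(const int *job_done, int n) */
int ACME_SPOOL(count_pending)(const int *job_done, int n)
{
    int pending = 0;
    int i;

    for (i = 0; i < n; i++) {
        if (!job_done[i])       /* zero means the job has not finished */
            pending++;
    }
    return pending;
}
```

Callers can use either ACME_SPOOL(count_pending) or the expanded name acme_spool_count_pending. The linker only ever sees the prefixed symbol, which is unlikely to collide with a library function.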

Advocaat answered 14/7, 2016 at 11:53 Comment(0)
