What is the Linux utility to mangle a C++ symbol name?

I have the c++filt command to demangle a symbol. What is the tool to do the opposite and mangle a symbol name?

This would be useful if I wanted to call dlsym() on a mangled C++ function name. I'd rather not hard-code the mangled name, since it could change over time with new compiler versions or different compiler brands, or even today when compiling for multiple platforms.

Is there a programmatic way to get the string that represents a C++ function at runtime, so that the code is compiler independent? One possible way would be to call a utility at compile time that performs the name mangling for the compiler being used and inserts the appropriate mangled C++ symbol name into a string for dlsym() to use.

The closest thing to a solution I've found on this site uses a fixed C-style name to indirect to the C++ symbols defined in the library you wish to dlsym(). But if you do not control what that library provides, this is not an option.
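For reference, the C-style indirection described above can be sketched as follows. This is only a minimal illustration and it applies only when you control the library. The names YourSpace::YourFunction and yourspace_your_function are hypothetical, not part of any real library.

```cpp
// Library side: a C++ function whose exported symbol name would be mangled.
namespace YourSpace {
int YourFunction(int x, char y) { return x + y; }
}  // namespace YourSpace

// extern "C" switches off C++ name mangling for this wrapper, so its symbol
// name is the plain string "yourspace_your_function" (hypothetical name).
extern "C" int yourspace_your_function(int x, char y) {
    return YourSpace::YourFunction(x, y);
}
```

A client could then pass the fixed string "yourspace_your_function" to dlsym() and cast the result to int (*)(int, char), without knowing how the compiler mangles YourSpace::YourFunction.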

Intolerable answered 4/7, 2012 at 21:28 Comment(9)
Everybody uses a C++ compiler to mangle names. - Hollow
@ShafikYaghmour Yes, I would like a better answer, please do give one if you have it. Using the compiler seems like a pretty large hammer for this job, so a simple thing like c++filt in reverse would be great! - Intolerable
@Intolerable Can you explain your use case? - Elliotelliott
I don't think you will find a nice tool, but you may find these helpful: int0x80.gr/papers/name_mangling.pdf and https://mcmap.net/q/339548/-c-name-mangling-by-hand. The tool mentioned at the bottom of this page sounds like almost what you want, though it is not straightforward: llvm.1065342.n5.nabble.com/C-Name-mangling-td57564.html - Elliotelliott
@ShafikYaghmour I've added more to the question to try and explain the use case as best as I can recall from last year. - Intolerable
+1 Now that I understand why, your problem makes a lot more sense. - Elliotelliott
@HansPassant How should I use a C++ compiler to mangle a name for use with dlsym? - Heeltap
@BlueRaja-DannyPflughoeft: If you believe the compiler solves this problem, I suggest you explain how. The C++ compiler mangles the names upon compilation, but there does not seem to be a function to call to obtain the mangled name at runtime from a user program. The actual mangled name is necessary in order to dynamically load a C++ symbol with dlsym! For the original question: After searching far and wide, it seems that after 30 years of C++, there is still no solution to the problem of dynamically loading symbols. The "solution" is to use a C interface in front of the C++ one. - Hippocras
This became an "unanswered question" due to recent edits. Here is an answer for this question: https://mcmap.net/q/339549/-getting-mangled-name-from-demangled-name. The questions aren't duplicates, or I'd mark it as a dupe. - Lheureux

You may be able to get what you want by looking at the symbol table of the .so you are interested in. Someone else has already answered this in "Returning a shared library symbol table".

However, if there are too many symbols ... that may not work.
So here's a crazy idea. Caveat emptor!

A potential solution is to:

  1. create a file with a stub with exactly one name: the name you want: void myfunction() { }

  2. compile that file (with -fPIC and -shared so it's a dynamic library)

  3. call dlopen/dlsym on that particular file

  4. Iterate through the symbols (there should be only the one you want, plus other regular junk you can filter out). Iterating through the symbols is clumsy, but you can do it: Returning a shared library symbol table

  5. dlclose() to free it up (lose the stub out of your symbols)

  6. Open the file you want with dlopen

Basically, you would invoke the compiler from your code, it would create a .so you could look at, get the only value out, then unload that .so so you could load in the one you want.

It's crazy.
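A rough sketch of the compile-a-stub idea, under stated assumptions: instead of iterating the symbols through dlopen as the steps describe, it swaps in the external nm tool to list them, which is simpler to show. The file names stub.cpp and stub.so and both function names are hypothetical, and g++ and GNU nm are assumed to be on PATH.

```cpp
#include <stdio.h>   // popen, pclose (POSIX)

#include <fstream>
#include <string>
#include <vector>

// Step 1: the stub source holds exactly one definition, e.g.
// "void myfunction()" becomes "void myfunction() { }".
std::string stub_source(const std::string& signature) {
    return signature + " { }\n";
}

// Steps 2 to 4, using nm instead of dlopen: compile the stub into a shared
// object and return the names of its defined dynamic symbols.
std::vector<std::string> stub_symbols(const std::string& signature) {
    {
        std::ofstream out("stub.cpp");  // hypothetical file name
        out << stub_source(signature);
    }  // file is closed here, before the compiler runs

    const char* cmd =
        "g++ -fPIC -shared stub.cpp -o stub.so && "
        "nm -D --defined-only stub.so";

    std::vector<std::string> symbols;
    FILE* pipe = popen(cmd, "r");
    if (pipe == nullptr) return symbols;

    char buf[1024];
    while (fgets(buf, sizeof buf, pipe) != nullptr) {
        std::string line(buf);
        while (!line.empty() && (line.back() == '\n' || line.back() == ' '))
            line.pop_back();
        // nm prints "address type name"; keep the last field.
        std::size_t pos = line.rfind(' ');
        if (pos != std::string::npos) symbols.push_back(line.substr(pos + 1));
    }
    pclose(pipe);
    return symbols;
}
```

The returned list may still contain toolchain-provided symbols, so the caller would have to filter it, for example by keeping only names that start with _Z.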

Rutaceous answered 21/8, 2015 at 21:29 Comment(0)

That's how g++ mangles names. You could implement those mangling rules in your program.

Another (crazy) solution would be to list all of the symbols in the library you want to use (it's not so difficult if you understand the format), demangle them all, and search for your function's name in that list. The advantage of this method is that demangling is easier, as there is a function call to do it: abi::__cxa_demangle, from the cxxabi.h header.
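The demangle-and-search idea might look roughly like this. It is a sketch that assumes the list of mangled symbol names has already been obtained somehow (for example from nm output); the helper names demangle and find_mangled are made up for illustration. Only abi::__cxa_demangle comes from the compiler's runtime.

```cpp
#include <cxxabi.h>

#include <cstdlib>
#include <string>
#include <vector>

// Demangle one symbol with abi::__cxa_demangle. Returns an empty string
// when the runtime reports a failure.
std::string demangle(const std::string& mangled) {
    int status = 0;
    char* out = abi::__cxa_demangle(mangled.c_str(), nullptr, nullptr, &status);
    if (status != 0 || out == nullptr) return "";
    std::string result(out);
    std::free(out);  // the result buffer is allocated with malloc
    return result;
}

// Return the first mangled name whose demangled form equals `wanted`,
// or an empty string if nothing matches.
std::string find_mangled(const std::vector<std::string>& symbols,
                         const std::string& wanted) {
    for (const std::string& sym : symbols) {
        if (demangle(sym) == wanted) return sym;
    }
    return "";
}
```

Note that `wanted` must be spelled exactly as the demangler prints it, for example "doit(char const*, bool)" rather than "doit(const char*, bool)".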

Einberger answered 23/8, 2015 at 3:37 Comment(2)
Broken link! I found this reference: refspecs.linuxbase.org/cxxabi-1.83.html#mangling - Sycophancy
Given an unmangled name, and if you know in which shared library the symbol of interest is defined, you can try to get the address of the unmangled name from `nm -C filename.so` and then look for the same address in the output of `nm filename.so` to get the mangled name. - Simoneaux

Name mangling is implementation specific.

There is no standard for name mangling so your best bet is to find a compiler to do it for you.

Name mangling

There is a table here that may help you if you wish to do this manually

Buffer answered 19/8, 2015 at 16:24 Comment(0)

If you're using g++ on x86 or ARM then you can try this one(ish)-liner:

echo "<your-type> <your-name>(<your-parameters>) {}" \
| g++ -x c++ - -o - -S -w \
| grep '^_' \
| sed 's/:$//'

g++ is the driver that invokes the cc1plus compiler proper.
g++ -x c++ says to interpret the input language as C++.
g++ -x c++ - says to get the input from the stdin (the piped echo).
g++ -x c++ - -o - says to output to the stdout (your display).
g++ -x c++ - -o - -S says to output assembler/assembly language.
g++ -x c++ - -o - -S -w says to silence all warnings from cc1plus.

This gives us the raw assembly code output.

For x86(_64) or ARM(v7/v8) machines, the mangled name in the assembly output will start at the beginning of a line, prefixed by an underscore (_) (typically _Z).

In this output, normally no other lines begin this way (local labels start with a dot and directives are indented), so a line beginning with an underscore should be a code object name.

grep '^_' says to filter the output down to only lines beginning with an underscore (_).

Now we have the mangled names (one on each line--depending on how many you echoed into g++).

However, all the names in the assembly are suffixed by a colon (:) character. We can remove it with the Stream-EDitor, sed.

sed 's/:$//' says to remove the colon (:) character at the end of each line.

Lastly, a couple of concrete examples, showing mangling and then demangling for you to use as reference (output from an x86 machine):

Example 1:

echo "int MyFunction(int x, char y) {}" \
| g++ -x c++ - -o - -S -w \
| grep '^_' \
| sed 's/:$//'
_Z10MyFunctionic       # This is the output from the command pipeline

c++filt _Z10MyFunctionic
MyFunction(int, char)  # This is the output from c++filt

Example 2:

echo \
"\
namespace YourSpace { int YourFunction(int, char); }
int YourSpace::YourFunction(int x, char y) {}
"\
| g++ -x c++ - -o - -S -w \
| grep '^_' \
| sed 's/:$//'
_ZN9YourSpace12YourFunctionEic      # This is the output from the command pipeline

c++filt _ZN9YourSpace12YourFunctionEic
YourSpace::YourFunction(int, char)  # This is the output from c++filt
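If you would rather do the filtering inside a program than with grep and sed, the same two steps can be sketched in C++. This is only an illustration: the function name mangled_labels is made up, and it assumes the assembly text has already been captured into a string.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Collect the lines of assembler text that start with '_' and strip a
// trailing ':' from each, i.e. the same job as grep '^_' | sed 's/:$//'.
std::vector<std::string> mangled_labels(const std::string& assembly) {
    std::vector<std::string> names;
    std::istringstream in(assembly);
    std::string line;
    while (std::getline(in, line)) {
        if (line.empty() || line[0] != '_') continue;  // grep '^_'
        if (line.back() == ':') line.pop_back();       // sed 's/:$//'
        names.push_back(line);
    }
    return names;
}
```

Like the shell pipeline, this relies on labels starting in the first column, so it carries the same limitations.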

I originally saw how to apply g++ to stdin in Romain Picard's article:
How To Mangle And Demangle A C++ Method Name
I think it's a good read.

Hope this helped you.

Additional Info:
Primary source: GNU <libstdc++> Manual: Chapter 28 Part 3: Demangling

Gamic answered 23/6, 2022 at 8:53 Comment(1)
That works only for free functions, and only as long as your parameters and return type are built-in types. Try a function which takes a std::ostream, and it will quickly fall down. - Majordomo

An easier method than the first posted. Write a little C++ program like:

#include <stdlib.h>

extern int doit(const char *toto, bool is);

int main(int argc, char *argv[])
{
  exit(doit (argv[0], true));
}

Build it with

# g++ -S test.cpp

And extract symbol name from assembler source

# cat test.s | grep call | grep doit | awk '{print $2}'

You get:

rcoscali@srjlx0001:/tmp/TestC++$ cat test.s | grep call | grep doit | awk '{print $2}'
_Z4doitPKcb
rcoscali@srjlx0001:/tmp/TestC++$ 

The mangled doit symbol is _Z4doitPKcb. Use the compiler you plan to build with, because each compiler has its own name mangling rules (as has been said before, these rules may change from one compiler to another).

Have fun !

Prepotency answered 25/6, 2016 at 12:51 Comment(3)
Because compilers can vary, the question is asking for a way to compute this dynamically at runtime. - Intolerable
This entire logic should be constexpr'ed yesterday. - Twain
@Aviv: Have you had any recent experience with a name mangling implementation which doesn't need you to actually invoke the compiler...? - Cupidity
