abi::__cxa_demangle -- why does the buffer need to be `malloc`-ed?
The documentation of abi::__cxa_demangle (such as https://gcc.gnu.org/onlinedocs/libstdc++/libstdc++-html-USERS-4.3/a01696.html) specifies that the second argument, char * output_buffer, needs to be malloc-ed.

That means that a character buffer allocated on the stack such as the following is not allowed.

  enum {N = 256};
  char output_buffer[N];
  size_t output_length = 0;
  int status = -4;
  char * const result = abi::__cxa_demangle(mangled_name,
                        output_buffer, &output_length, &status);

Two questions:

  1. Why is an output_buffer on stack not allowed?

  2. Why is a different pointer returned when an output buffer was already passed?

Influenced by the example of backtrace(), I would have imagined an API like the following

// Demangle the symbol in 'mangled_name' and store the output
// in 'output_buffer' where 'output_buffer' is a caller supplied
// buffer of length 'output_buffer_length'. The API returns the 
// number of bytes written to 'output_buffer' which is not
// greater than 'output_buffer_length'; if it is
// equal to 'output_buffer_length', then output may have been
// truncated.
size_t mydemangle(char const * const mangled_name,
                  char * output_buffer,
                  size_t const output_buffer_length);
Sjambok answered 10/7, 2017 at 22:17 Comment(5)
If something needs to be malloc'd, it's usually because something else is going to call free or realloc on it. – Mvd
"Why is an output_buffer on stack not allowed?" - "If output_buffer is not long enough, it is expanded using realloc." – Stunsail
Exactly. From the link you provided: "If output_buffer is not long enough, it is expanded using realloc." – Winger
@DanielKamilKozar The answer section is below. – Tamatamable
Thank you all. My apologies, I did not read carefully and overlooked the realloc part. – Sjambok
W
8

1) Why is an output_buffer on stack not allowed?

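From the link you provided: "If output_buffer is not long enough, it is expanded using realloc." realloc only accepts a null pointer or a pointer previously returned by malloc, calloc, or realloc, so passing it the address of a stack array is undefined behavior. Stack data cannot be resized in any case, since a stack frame is generally of fixed size (alloca being a special case).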

2) Why is a different pointer returned when an output buffer was already passed?

When realloc is used, there's no reason to think you will get back the same pointer. For example, if there is not enough contiguous free memory at that location, the allocator has to obtain a block somewhere else, copy the contents over, and free the old block. After the call, only the returned pointer is safe to use.
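To make the pointer-may-move behavior concrete, here is a minimal sketch of a growing buffer built on realloc. The helper name `append_growing` and its signature are made up for illustration; this is not how `__cxa_demangle` is implemented, only the same general pattern of handing back whatever pointer realloc returned.

```cpp
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Hypothetical helper (illustration only): append `text` to a malloc-ed,
// NUL-terminated buffer whose capacity is *capacity, growing it with realloc
// when it is too small. Returns the pointer the caller must use from now on
// (it may differ from `buffer`), or nullptr if realloc fails, in which case
// the original buffer is left untouched and still owned by the caller.
char* append_growing(char* buffer, std::size_t* capacity, const char* text)
{
    const std::size_t used = std::strlen(buffer);
    const std::size_t extra = std::strlen(text);
    const std::size_t needed = used + extra + 1;  // +1 for the terminator

    if (needed > *capacity) {
        // realloc may extend the block in place or move it elsewhere.
        char* moved = static_cast<char*>(std::realloc(buffer, needed));
        if (moved == nullptr) {
            return nullptr;
        }
        buffer = moved;
        *capacity = needed;
    }

    std::memcpy(buffer + used, text, extra + 1);  // copies the terminator too
    return buffer;
}
```

A caller would write `buf = append_growing(buf, &cap, "more");` and never touch the old value of `buf` again, which is the same reason the result of `abi::__cxa_demangle` has to be used instead of the pointer that was passed in.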

If I had to guess why the API was designed this way, it would be that it's usually considered good practice not to allocate memory inside a function and then return a pointer to it. Instead, the caller is made responsible for both allocation and deallocation. This helps avoid unexpected memory leaks and lets users of the API apply their own memory management schemes, for example to limit fragmentation. The potential use of realloc undermines this idea somewhat, but you can probably work around it by passing a malloc-ed block large enough (with its size in the length argument) that realloc is never needed.
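If you do want the caller-supplied, possibly stack-allocated buffer from the question, one option is a thin wrapper that lets `__cxa_demangle` allocate the result itself (a null output buffer is documented to do that) and then copies it out. The sketch below is an assumption-laden illustration: the name `demangle_into` and its truncation policy are invented here and are not part of `cxxabi.h`.

```cpp
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <cxxabi.h>

// Hypothetical wrapper (not part of cxxabi.h): demangle `mangled_name` into a
// caller-supplied buffer of any storage class, truncating if it does not fit.
// Returns the number of characters written, not counting the terminator, or 0
// if demangling fails or the buffer cannot hold anything.
std::size_t demangle_into(const char* mangled_name, char* out, std::size_t out_len)
{
    if (mangled_name == nullptr || out == nullptr || out_len == 0) {
        return 0;
    }
    out[0] = '\0';

    int status = 0;
    // With a null output buffer, __cxa_demangle mallocs the result itself.
    char* demangled = abi::__cxa_demangle(mangled_name, nullptr, nullptr, &status);
    if (status != 0 || demangled == nullptr) {
        std::free(demangled);  // free(nullptr) is a no-op
        return 0;
    }

    std::size_t n = std::strlen(demangled);
    if (n >= out_len) {
        n = out_len - 1;  // truncate, leaving room for the terminator
    }
    std::memcpy(out, demangled, n);
    out[n] = '\0';

    std::free(demangled);  // the malloc-ed result is ours to release
    return n;
}
```

For example, with `char buf[64];`, calling `demangle_into("_Z3foov", buf, sizeof buf)` should leave `foo()` in `buf`. The cost of this design is one heap allocation and one copy per call, which is the price of hiding the realloc behavior from the caller.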

Winger answered 10/7, 2017 at 22:24 Comment(0)
M
4
  1. Why is an output_buffer on stack not allowed?
  2. Why is a different pointer returned when an output buffer was already passed?

Because C++ class names can be arbitrarily long, so the demangler may have to grow the buffer (and possibly move it) to fit the result.

Try this:

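#include <cstdlib>   // std::malloc, std::free
#include <iostream>
#include <typeinfo>  // typeid
#include <utility>   // std::make_index_sequence
#include <cxxabi.h>  // abi::__cxa_demangle

using foo = std::make_index_sequence<10000>;

int main()
{
    std::size_t buff_size = 128;
    // The buffer must come from malloc because __cxa_demangle may realloc it.
    auto buff = static_cast<char*>(std::malloc(buff_size));
    std::cout << "buffer before: " << static_cast<void*>(buff) << std::endl;
    int stat = 0;
    // Keep the returned pointer: the block may have moved during the call.
    buff = abi::__cxa_demangle(typeid(foo).name(), buff, &buff_size, &stat);
    std::cout << "buffer after: " << static_cast<void*>(buff) << std::endl;
    std::cout << "class name: " << buff << std::endl;
    std::free(buff);
}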

Sample Output:

buffer before: 0x7f813d402850
buffer after: 0x7f813e000000
class name: std::__1::integer_sequence<unsigned long, 0ul, 1ul, 2ul, 3ul, 4ul, 5ul, 6ul, 7ul, 8ul, 9ul, 10ul, 11ul, 12ul, 13ul, 14ul, 15ul, 16ul, 17ul, ... and so on...
Mostly answered 10/7, 2017 at 23:5 Comment(0)
