How to control memory allocation strategy in third party library code?

Previous header: "Must I replace global operators new and delete to change memory allocation strategy in third party code?"

Short story: We need to replace the memory allocation technique in a third-party library without changing its source code.

Long story:

Consider a memory-bound application that makes huge dynamic allocations (perhaps almost all available system memory). We use specialized allocators, and use them everywhere (shared_ptrs, containers, etc.). We have total control and power over every single byte of memory allocated in our application.

We also need to link against a third-party helper library. That nasty guy makes allocations in some standard way, using the default operators new, new[], delete and delete[], or malloc, or something else non-standard (let's generalize and say that we don't know how this library manages its heap allocation).

If this helper library makes allocations that are big enough, we can get HDD thrashing, memory fragmentation and alignment issues, out-of-memory bad_allocs and all sorts of problems.

We cannot (or do not want to) change the library source code.

First attempt:

We never had such unholy "hacks" in release builds before. First test with overriding operator new works fine, except that:

  • we do not know what gotchas await us in the future (and this is awful)
  • our users (and even our allocators) now have to allocate the same way that we do

Questions:

  1. Are there ways to hook these allocations without overloading global operators? (local lib-only hooks?)
  2. ...and if we don't know what exactly it uses: malloc or new?
  3. Is this list of signatures complete? (and there are no other things that we must implement):

    void* operator new (std::size_t size) throw (std::bad_alloc);
    void* operator new (std::size_t size, const std::nothrow_t& nothrow_value) throw();
    void* operator new (std::size_t size, void* ptr) throw();
    void* operator new[] (std::size_t size) throw (std::bad_alloc);
    void* operator new[] (std::size_t size, const std::nothrow_t& nothrow_value) throw();
    void* operator new[] (std::size_t size, void* ptr) throw();
    
    void operator delete (void* ptr) throw();
    void operator delete (void* ptr, const std::nothrow_t& nothrow_constant) throw();
    void operator delete (void* ptr, void* voidptr2) throw();
    void operator delete[] (void* ptr) throw();
    void operator delete[] (void* ptr, const std::nothrow_t& nothrow_constant) throw();
    void operator delete[] (void* ptr, void* voidptr2) throw();
    
  4. Something different if that library is dynamic?

Edit #1

A cross-platform solution is preferable if possible (it looks like it is not very possible). If not, our major platforms are:

  • Windows x86/x64 (msvc 10)
  • Linux x86/x64 (gcc 4.6)

Edit #2

Almost 2 years have passed and a few OS and compiler versions have evolved, so I am curious whether there is something new and unexplored in this area. Any standard proposals? OS specifics? Hacks? How do you write memory-thirsty applications today? Please share your experience.

Caloric answered 4/5, 2013 at 19:14 Comment(10)
Answer depends on which compiler / OS you are using.Thionate
look at jemalloc: github.com/jemalloc/jemalloc/wiki/Getting-Started this is a custom allocator library. the right thing to do is as they do overload malloc and let the linker link correctly to their implementation. note however for this to work they need to use malloc or new...Tidings
Whoever owns main gets the privilege of overloading operator new/delete and nobody else. Otherwise you're just building in run time errors for when somebody accidentally transfers memory ownership across a border they shouldn't have. Sorry to be the bearer of bad news.Gunilla
At least on Windows for your own programs, there are a lot of tools for example: msdn.microsoft.com/en-us/library/974tc9t1.aspx but for 3rd party programs, it's another story. Starting programs under a debugger uses a Debug Heap that helps finding bugs with a debugger such as WinDbg: msdn.microsoft.com/en-us/library/974tc9t1.aspx. Otherwise there are some tools provided by Microsoft: msdn.microsoft.com/en-us/library/windows/hardware/…Umont
Do you need to replace the allocators or only track the allocated memory size? As for your list - it will not replace new and delete declared by classes. That's a common technique to solve dll boundary issue.Magdamagdaia
@Magdamagdaia Neither. All we need is to gain total control over heap, while linking against some library that does not provide memory allocation interface. I edit the question to make it more clear. Overriding new was just a first thing that comes in mind and I would be glad to avoid it. Maybe there is a technique to cut the "nasty" allocations somewhere else? Maybe we can sandbox library's heap somehow? We are looking for broad area of solutions.Caloric
@SimonMourier I'm not sure how debug heap applies. Debugging is not an issue for us. Can something like this work in Production? How do we isolate allocations in foreign library's code? Please add more details if you have something in mind. We are desperately looking for any hacks.Caloric
In "production" (whatever that be) there is no general way to do it as everything is optimized. If I ship a component to you to be included in your program, I don't want you to mess with my component in production (and possibly complain it does not work as fast as I promise it for example). But you can validate your program prior to production deployment, including my component, and I'd be ok with this. As an example, that's the same approach MS uses for driver verification: msdn.microsoft.com/en-us/library/windows/hardware/ff545448.aspxUmont
Not sure what kind of control in addition to replacing allocators is possible. Do you think of failing the library allocations if they are too large? It's interesting to know what the total control means.Magdamagdaia
Related: How to control memory allocation of a third party library? (NUMA-awareness)Caloric

Ugh, my sympathy. This is going to depend a lot on your compiler, your libc, etc. Some rubber-meets-road strategies that have "worked" to varying degrees for us in the past (/me braces for downvotes) are:

  • The operator new / operator delete overloads you suggested -- although note that some compilers are picky about not having throw() specs, some really want them, some want them for new but not for delete, etc (I have a giant platform-specific #if/#elif block for all of the 4+ platforms we're working on now).
  • Also worth noting: you can generally ignore the placement versions, they don't allocate.
  • Look at __malloc_hook and friends -- note that these are deprecated and have thread race conditions -- but they're nice in that new/delete tend to be implemented in terms of malloc (but not always).
  • Providing a replacement malloc, calloc, realloc, and free and getting your linker args in the right order so that the overrides take place (this is what gcc recommends these days, although I've had situations where it was impossible to do, and I had to use deprecated __malloc_hook) -- again, new and delete tend to be implemented in terms of these, but not always.
  • Avoiding all the standard allocation methods (operator new, malloc, etc) in "our code" and using custom functions instead -- not very easy with existing codebase.
  • Tracking down the library author and delivering a polite request or patch (rather than a savage beating) to change their library to allow you to specify a different allocator (it may be faster than doing this yourself) -- I think this has led to a cardinal rule of "client always specifies the allocator or does the allocation" with any libraries I write.
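
A minimal sketch of the first bullet, assuming a C++11 or later compiler (so `noexcept` instead of the `throw()` specs from the question). `my_alloc`, `my_free` and `g_live_allocs` are hypothetical names; here they just forward to `malloc`/`free` and keep a count, where a real build would call the specialized allocator. The placement forms are left alone since they do not allocate, and the sketch skips the `std::new_handler` retry loop that a conforming replacement would normally run.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

// Hypothetical counter, only here to show that the hooks are live.
static std::atomic<std::size_t> g_live_allocs{0};

// Hypothetical stand-ins for "our" allocator; they simply forward to malloc/free.
static void* my_alloc(std::size_t n) noexcept { return std::malloc(n ? n : 1); }
static void  my_free(void* p) noexcept { std::free(p); }

// Throwing forms.
void* operator new(std::size_t n) {
    if (void* p = my_alloc(n)) { ++g_live_allocs; return p; }
    throw std::bad_alloc();
}
void* operator new[](std::size_t n) { return ::operator new(n); }

// Nothrow forms.
void* operator new(std::size_t n, const std::nothrow_t&) noexcept {
    void* p = my_alloc(n);
    if (p) ++g_live_allocs;
    return p;
}
void* operator new[](std::size_t n, const std::nothrow_t& t) noexcept {
    return ::operator new(n, t);
}

// Every delete form funnels into this one so the count stays consistent.
void operator delete(void* p) noexcept {
    if (p) { --g_live_allocs; my_free(p); }
}
void operator delete[](void* p) noexcept { ::operator delete(p); }
void operator delete(void* p, const std::nothrow_t&) noexcept { ::operator delete(p); }
void operator delete[](void* p, const std::nothrow_t&) noexcept { ::operator delete(p); }

// Sized forms (C++14); compilers may call these instead of the unsized ones.
void operator delete(void* p, std::size_t) noexcept { ::operator delete(p); }
void operator delete[](void* p, std::size_t) noexcept { ::operator delete(p); }
```

Because these are replacements of the global operators, they affect every translation unit linked into the program, including statically linked third-party code that uses plain `new`. Classes that declare their own `operator new`, and code that calls `malloc` directly, are not affected.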

Please note that this is not an answer in terms of what the standards say should happen, just my experience. I've worked with more than a few buggy/broken compilers and libc implementations in the past, so YMMV. I also have the luxury of working on fairly "sealed systems", and not being all that worried about portability for any specific application.

Regarding dynamic libraries: I'm currently in a bit of a pinch in this regard myself; our "app" gets loaded as a dynamic .so and we have to be pretty careful to pass any delete/free requests back to the default allocator if they didn't come from us. The current solution is to just cordon off our allocations to a specific area: if we get a delete/free from within that address range, we dispatch to our handler, otherwise back to the default... I've even toyed with (horrors) the idea of checking the caller address to see if it's in our address space. (The probability of going boom increases with such hacks, though.)

This may be a useful strategy even if you are the process lead and you're using an outside library: tag or restrict or otherwise identify your own allocs somehow (even going so far as to keep a list of allocs you know about), and then pass on any unknowns. All of this has ugly side-effects and limitations, though.
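
The address-range dispatch described above could look roughly like this. It is only a sketch under heavy assumptions: a single fixed-size static arena with bump allocation, no thread safety, and no reuse of freed arena memory. The names `g_arena`, `arena_alloc`, `arena_owns` and `dispatch_free` are made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>

namespace sketch {

constexpr std::size_t kArenaSize = 1 << 16;

// "Our" memory lives in one known address range.
alignas(std::max_align_t) unsigned char g_arena[kArenaSize];
std::size_t g_used = 0;
std::size_t g_arena_frees = 0;  // counts frees that were routed to the arena

// Bump allocation: hands out consecutive chunks and never reuses them.
void* arena_alloc(std::size_t n) {
    constexpr std::size_t a = alignof(std::max_align_t);
    if (n > kArenaSize) return nullptr;            // also avoids overflow below
    std::size_t rounded = (n + a - 1) / a * a;     // keep every chunk max-aligned
    if (rounded == 0) rounded = a;
    if (rounded > kArenaSize - g_used) return nullptr;
    void* p = g_arena + g_used;
    g_used += rounded;
    return p;
}

// True if the pointer falls inside the arena's address range.
bool arena_owns(const void* p) {
    const auto addr = reinterpret_cast<std::uintptr_t>(p);
    const auto lo   = reinterpret_cast<std::uintptr_t>(g_arena);
    return addr >= lo && addr < lo + kArenaSize;
}

// Route a free to our handler if the pointer is ours, else to the default.
void dispatch_free(void* p) {
    if (!p) return;
    if (arena_owns(p)) {
        ++g_arena_frees;   // a bump arena has nothing to release per block
        return;
    }
    std::free(p);          // unknown pointer: pass it on to the default allocator
}

}  // namespace sketch
```

The range check is cheap, which is the appeal of this approach; the cost is that all of "our" allocations have to come from address ranges known in advance.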

(Looking forward to other answers!)

Digitiform answered 4/5, 2013 at 19:42 Comment(3)
(I know this is not a very good answer, and doesn't even directly address your particular case or all of the 4 questions you raised, but hopefully it will be at least a little helpful.)Digitiform
A big piece of food for thought. Thanks for sharing!Caloric
I agree so much on the last bullet point. Libraries should not make assumptions about how to allocate memory. (And I wish C++ provided stronger tools to manage this). +1Templetempler

Without being able to modify the library's source code - or, better, being able to influence the author of the library to modify it - I'd say you're out of luck.

There are some things the library can potentially do (even unintentionally) that make it immune to any strategy you might employ or, in the worst cases, that make the library or your program unstable once you apply your strategy. Examples include using its own custom allocators, providing its own versions of global operator new() and operator delete(), overriding those operators in individual classes, etc.

A strategy which would probably work is to work with the library vendor and make some modifications. The modifications (from your end) would amount to being able to initialise the library by specifying allocators it uses. For the library the effort is potentially significant (having to touch all functions that dynamically allocate memory, that use standard containers, etc) but not intractable - use the supplied allocators (or sensible defaults) throughout their code.
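
To illustrate what "initialise the library by specifying allocators" might mean in practice, here is a rough sketch. Everything in it is hypothetical: `helperlib`, `AllocHooks`, `init`, `make_buffer` and `release_buffer` do not belong to any real library, and the client-side `Stats`, `counting_alloc` and `counting_free` only forward to `malloc`/`free` while recording what the library asked for.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

namespace helperlib {  // hypothetical third-party library

// Callbacks the client may supply; `user` is passed back untouched.
struct AllocHooks {
    void* (*alloc)(std::size_t size, void* user);
    void  (*dealloc)(void* ptr, void* user);
    void* user;
};

namespace detail {
inline void* default_alloc(std::size_t n, void*) { return std::malloc(n); }
inline void  default_dealloc(void* p, void*) { std::free(p); }

// Sensible defaults, used when the client never calls init().
inline AllocHooks& hooks() {
    static AllocHooks h{default_alloc, default_dealloc, nullptr};
    return h;
}
}  // namespace detail

// Must be called before the library allocates anything.
inline void init(const AllocHooks& h) { detail::hooks() = h; }

// Every internal allocation goes through the hooks.
inline char* make_buffer(std::size_t n) {
    AllocHooks& h = detail::hooks();
    return static_cast<char*>(h.alloc(n, h.user));
}
inline void release_buffer(char* p) {
    AllocHooks& h = detail::hooks();
    h.dealloc(p, h.user);
}

}  // namespace helperlib

// Client side: hooks that record what the library requested.
struct Stats { std::size_t bytes = 0; int frees = 0; };

inline void* counting_alloc(std::size_t n, void* user) {
    static_cast<Stats*>(user)->bytes += n;
    return std::malloc(n);
}
inline void counting_free(void* p, void* user) {
    ++static_cast<Stats*>(user)->frees;
    std::free(p);
}
```

A client would call `helperlib::init({counting_alloc, counting_free, &stats})` once at startup. Plain function pointers plus a `user` pointer are used here because they also work across a C ABI or DLL boundary; a library that is C++ only could take an allocator template parameter instead.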

Unfortunately, that is at odds with your requirement to not modify the library - I am skeptical of the chances of satisfying that, particularly within constraints you have outlined (memory-thirsty, hosted on windows/linux, etc).

Richmound answered 3/1, 2016 at 12:47 Comment(0)

This can't be done for allocations made inside the library itself, but you can use placement new to construct objects of the library's classes: you allocate the memory yourself and have the constructors of those classes run on it. That way, even if the class has its own operator new, it won't get called. However, allocations the class makes internally, for unexposed internal classes or primitives, will still use the library's allocation scheme; that can't be changed unless the library lets you specify an allocator, as STL containers do.
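
A small sketch of the placement-new idea, with a made-up `thirdparty::Widget` standing in for a library class that declares its own `operator new`. The helper names `make_in_place` and `destroy_in_place` and the buffer `g_storage` are assumptions for illustration. Note the `::new`: when a class declares any `operator new` of its own, an unqualified placement `new` looks in class scope first and fails to compile unless the class also declares a placement form, so the global placement operator is requested explicitly.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

namespace thirdparty {  // hypothetical library class with its own allocation scheme

struct Widget {
    int value;
    explicit Widget(int v) : value(v) {}

    static int class_new_calls;  // lets us observe whether the class operator ran

    static void* operator new(std::size_t n) {
        ++class_new_calls;
        void* p = std::malloc(n);
        if (!p) throw std::bad_alloc();
        return p;
    }
    static void operator delete(void* p) { std::free(p); }
};
int Widget::class_new_calls = 0;

}  // namespace thirdparty

// Storage that we control; in real code this would come from our allocator.
alignas(thirdparty::Widget) unsigned char g_storage[sizeof(thirdparty::Widget)];

// Construct a Widget in caller-provided memory, bypassing Widget::operator new.
thirdparty::Widget* make_in_place(void* where, int v) {
    return ::new (where) thirdparty::Widget(v);
}

// Objects built with placement new are destroyed by calling the destructor
// directly; `delete w` would hand our memory to Widget::operator delete.
void destroy_in_place(thirdparty::Widget* w) {
    w->~Widget();
}
```

This only controls where the object itself lives. Anything `Widget` allocates inside its constructor or member functions still goes through whatever the library uses.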

Bayles answered 10/3, 2017 at 16:15 Comment(0)
