How to reduce default C++ memory consumption?

I have a server application written in C++. After startup, it uses about 480 KB of memory on x86 Linux (Ubuntu 8.04, GCC 4.2.4). I think 480 KB is an excessive amount of memory: the server isn't even doing anything yet, and no clients have connected to it. (See also my comment below in which I explain why I think 480 KB is a lot of memory.) The only things the server does during initialization are spawning one or two threads, setting up a few sockets, and other simple things that aren't very memory-intensive.

Note that I'm talking about real memory usage, not VM size. I measured it by starting 100 instances of my server on an idle laptop and measuring the system memory usage with 'free' before and after starting the server instances. I've already taken filesystem cache and things like that into account.

After some testing it would appear that something in the C++ runtime is causing my server to use this much memory even if the server itself doesn't do anything. For example, if I insert

getchar(); return 0;

right after

int main(int argc, char *argv[]) {

then the memory usage is still 410 KB per instance!

My application depends only on Curl and Boost. I have a fair amount of experience with C programming and I know C libraries don't tend to increase memory consumption until I use them.

Other things that I've found:

  • A simple hello world C app consumes about 50 KB of memory.
  • A simple hello world C app linked to Curl, but otherwise not using Curl, consumes about 50 KB of memory as well.
  • A simple hello world C++ app (no Boost) consumes about 100 KB of memory.
  • A simple hello world C++ app that includes some Boost headers, but does not actually use Boost, consumes about 100 KB of memory. No Boost symbols when inspecting the executable with 'nm'.

My conclusion is therefore as follows:

  1. Gcc throws away unused Boost symbols.
  2. If my app uses Boost, then something in the C++ runtime (probably the dynamic linker) causes it to use a lot of memory. But what? How do I find out what these things are, and what can I do about them?
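One way to approach the "how do I find out" part on Linux is to read /proc/<pid>/smaps, which reports a Private_Dirty figure for every mapping. The following is only a minimal sketch, assuming the usual smaps layout (a header line per mapping followed by "Field: value kB" lines); the function name private_dirty_by_mapping is made up for this illustration.

```cpp
#include <fstream>
#include <map>
#include <sstream>
#include <string>

// Sums the Private_Dirty field (in KB) per mapping name from a smaps-style
// file. Linux-specific; by default it inspects the calling process.
// Hypothetical helper, not part of any library.
std::map<std::string, long> private_dirty_by_mapping(
        const std::string &path = "/proc/self/smaps") {
    std::map<std::string, long> totals;
    std::ifstream in(path.c_str());
    std::string line;
    std::string current = "[anonymous]";
    while (std::getline(in, line)) {
        std::istringstream fields(line);
        std::string first;
        fields >> first;
        if (first.empty()) {
            continue;  // blank line
        }
        if (first[first.size() - 1] != ':') {
            // Mapping header: "start-end perms offset dev inode [pathname]"
            std::string perms, offset, dev, inode, name;
            fields >> perms >> offset >> dev >> inode;
            std::getline(fields >> std::ws, name);
            current = name.empty() ? "[anonymous]" : name;
        } else if (first == "Private_Dirty:") {
            long kb = 0;
            fields >> kb;
            totals[current] += kb;  // accumulate per library / region
        }
    }
    return totals;
}
```

Calling this near the top of main() and printing the map should show which shared objects or anonymous regions account for the private dirty pages; passing "/proc/<pid>/smaps" for another process works the same way, given sufficient permissions.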

I remember some KDE discussions several years ago about C++ dynamic linker issues. The Linux C++ dynamic linker back then caused slow startup time in KDE C++ apps and large memory consumption. As far as I know those issues have since been fixed in C++ runtimes. But could something similar be the cause of the excessive memory consumption I'm seeing?

Answers from gcc/dynamic linking experts are greatly appreciated.

For those who are curious, the server in question is Phusion Passenger's logging agent: https://github.com/FooBarWidget/passenger/blob/master/ext/common/LoggingAgent/Main.cpp

Weeping answered 14/11, 2010 at 16:53 Comment(15)
"boost" isn't very specific, it's only "a library" in the sense that you can get it in one zip file, really it's a huge bundle of libraries. If all you use is boost::shared_ptr you might get away with less than 350K extra, so any attempt to cut down your overhead has to be specific to what you're using.Abate
I'm not really seeing the problem in 480kb/instance. Perhaps you should worry about your memory usage in actual use cases, rather than when it's doing nothing.Calculator
I'm not sure what you're seeing is a real problem. It's true, 480kB is a lot of RAM for a "hello world" application to use, but modern systems aren't optimized to run "hello world" as efficiently as possible. They're optimized to run useful applications. So the more relevant question shouldn't be "how do I make 'hello world' smaller", but rather "is my actual application using too much memory, and if so, how can I reduce that"?Harpsichord
I'd also file this into the 'premature optimization' folder...Shote
@Steve: It's mostly just shared_ptr, thread, function and bind.Weeping
@DeadMG, @Jeremy: True, 480 KB is not much, but my software has to run on memory-limited VPSes where every megabyte may count. I also want to have bragging rights for lowest footprint of all software in this class. And the thing is, I already know how to optimize normal memory usage, but now it seems the runtime is using a constant amount of additional memory over which I have little control.Weeping
@Weeping IIRC boost::bind is a real hog. Are you linking to boost and the C++ stdlib dynamically? Also, look at the asm listing and see how much space exception handling is using. It could be bloating your stack use. I'm not sure for Linux, but exception handling on Windows can use a lot of stack.Amorete
@JimR: I only link to boost-thread, everything else I use is header-only. But how can Boost use stack space even before I call any Boost functions?Weeping
@Hongli: as long as those 480KB are constant overhead, what exactly is the problem? Anyway, try linking statically to the libraries you use (and to the C++ runtime, if possible). But mainly, I think this is premature optimization, and not just because 500kb don't matter, but also because there might be no difference in a bigger application. The memory being used up front might save the app from having to allocate an additional 480KB later. You don't know that this is overhead in a real application at all.Tomboy
@Hongli: But DOES your application use too much memory in that scenario? You haven't tested the actual use cases.Calculator
@DeadMG: That depends on the number of clients (memory usage goes up linearly with the number of clients). However actual scenario memory usage is irrelevant in this question; in this question I just want to know how to minimize the constant amount of dirty memory imposed by the C++ runtime, I already know how to deal with the rest without consulting StackOverflow.Weeping
@Hongli: How can you demonstrate that the existing memory usage is not just pre-allocation? The amount of memory consumed by hello, world is a truly insignificant and irrelevant discussion.Calculator
@DeadMG: I can't demonstrate that it's not pre-allocation, that's why I'm asking. The amount of memory consumed by hello world is relevant for me, otherwise I wouldn't be asking!Weeping
@Hongli: Is it? Why? If that IS pre-allocation, then it's completely and utterly meaningless, and if you don't know that it isn't, then how exactly is it relevant?Calculator
@DeadMG: It is relevant exactly because I don't know. If I cannot get that 400 KB back then at the very least I want to know where it comes from. If something is being preallocated, then what is being preallocated and why? If I don't want to understand the low-level stuff then I wouldn't be using C++.Weeping
H
6

The C runtime allocates more memory than your process actually uses as part of normal operation. This is because requesting memory from the kernel is comparatively slow, and can only be done in page-sized blocks (the page size is typically 4 KB on both x86 and x86-64 Linux, though other architectures and huge-page configurations use larger pages).

Furthermore, when the C runtime receives an allocation request it cannot satisfy, it will often allocate more than is necessary, again, to remove the expense of going to the kernel most of the time.
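To see both effects on a given machine, something like the following can be used. It is a rough sketch, assuming glibc (malloc_usable_size is a glibc extension declared in <malloc.h>); the helper names are invented for this example.

```cpp
#include <cstdlib>
#include <malloc.h>   // malloc_usable_size (glibc extension)
#include <unistd.h>   // sysconf

// Page size the kernel uses for this process, in bytes.
long page_size_bytes() {
    return sysconf(_SC_PAGESIZE);
}

// Bytes the allocator actually made usable for a request of `requested`
// bytes; this is at least the requested size and often more.
std::size_t usable_bytes_for(std::size_t requested) {
    void *p = std::malloc(requested);
    if (p == NULL) {
        return 0;
    }
    std::size_t usable = malloc_usable_size(p);
    std::free(p);
    return usable;
}
```

Printing page_size_bytes() and usable_bytes_for(1) shows the granularity the kernel hands out and how far the allocator rounds up a tiny request. It does not show how much the heap grew as a whole; for that, the per-mapping numbers in /proc/self/smaps are more telling.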

Finally, if you're using boost goodies, they probably depend on some STL components, such as std::vector. These components allocate space for elements using std::allocator<T>, which in some instances will again allocate more space than is actually used. (In particular, some implementations pool the nodes of node-based structures like std::map, std::set, and std::list so that the nodes of the list or tree sit together on the same memory page.)

Long story short: Don't worry about this. Half a meg of memory isn't a lot by any stretch of the imagination (At least nowadays), and most of that is probably just amortizing the use of dynamic allocation functions. Write up your actual server, and if it's using too much memory, THEN look at ways of reducing memory usage.

EDIT: If the component of boost you're using happens to be asio, and you're using sockets, you should also know there's some memory consumed in order to maintain buffers for sockets too.

Hinkley answered 14/11, 2010 at 18:10 Comment(11)
Those buffers shouldn't be allocated unless sockets are actually constructed. His int main() { return getchar(); } program shouldn't include any socket buffers (unless he has global variables).Hildahildagard
I know that malloc() preallocates memory. But as I've already demonstrated, memory goes up even before my app calls malloc() at all.Weeping
@Ben: I have no global variables that cause heap allocation.Weeping
@Ben: His int main() { return getchar(); } isn't taking 400Kb of memory either. @Hongli: Are you sure? You've gone through all the boost calls you've used and there are no calls to any sort of dynamic allocation functions?Hinkley
I've just hand-inspected all my source files again. There are about 3 global variables with constructor: an empty std::list, a boost::mutex and an std::list::iterator. None of them should allocate any heap memory, or at the very least should allocate very little. I don't use asio.Weeping
@Hongli: My guess would be the std::list. You should probably be using vector instead anyway.Hinkley
std::list would allocate even less than std::vector initially, I think.Hildahildagard
@Billy: According to the OP, his int main() { return getchar(); } is taking 410KB, not quite the 480KB used by the complete code but a very substantial chunk.Hildahildagard
@Ben: Usually std::vector allocates nothing upon empty construction. OTOH, std::list usually has one of the custom allocators of which I spoke to cluster the list nodes on a single memory page. And I believe you misread the question -- he says specifically that a simple main as specified is taking 100kb. Only after he A. puts in the globals and B. links to boost does the memory use of the program grow. In any case, it really doesn't matter.Hinkley
@Ben: the list is empty all the time in my test. Anyway, I tried removing the list global; it only saved me 1 KB of dirty private.Weeping
@Billy: It sounds like it is the act of loading the boost library (.so) that bumps up the memory usage. It ought to be an extra few hundred KB shared between all processes using the library, not 300KB of private commit per-process.Hildahildagard
W
2

One way to reduce the memory consumption is to reduce thread stack size.
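With plain pthreads, the stack size is set through a thread attribute before the thread is created. The sketch below assumes POSIX threads are available and uses made-up names (worker, run_with_stack_size); whether a smaller stack lowers resident memory or only the reserved address space depends on how the system commits stack pages.

```cpp
#include <pthread.h>
#include <cstddef>

// Trivial thread body: records that it ran.
static void *worker(void *arg) {
    *static_cast<int *>(arg) = 1;
    return NULL;
}

// Starts one thread with the requested stack size and waits for it.
// Returns true if the thread was created and ran to completion.
bool run_with_stack_size(std::size_t stack_bytes) {
    pthread_attr_t attr;
    if (pthread_attr_init(&attr) != 0) {
        return false;
    }
    bool ok = false;
    int ran = 0;
    pthread_t tid;
    // pthread_attr_setstacksize rejects sizes below PTHREAD_STACK_MIN.
    if (pthread_attr_setstacksize(&attr, stack_bytes) == 0 &&
        pthread_create(&tid, &attr, worker, &ran) == 0) {
        pthread_join(tid, NULL);
        ok = (ran == 1);
    }
    pthread_attr_destroy(&attr);
    return ok;
}
```

For example, run_with_stack_size(256 * 1024) starts the worker on a 256 KB stack instead of the platform default. The size has to be large enough for the deepest call chain the thread will ever execute, so it is worth measuring before shrinking it aggressively.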

Regarding boost, as Steve Jessop commented, you have to be a bit more specific than "boost".

Worsham answered 14/11, 2010 at 18:1 Comment(2)
That will affect virtual address space utilization, it shouldn't affect private commit.Hildahildagard
@Ben Voigt That depends on how much stack you told the system to commit initially. Reduce the stack size and the size of the initial commit.Amorete
H
2

Sounds like you have a problem with a base address conflict on some of your dynamic load libraries. If they require relocation during load, they will be mapped in as private fixed-up copies.

Re-run prelink across your entire system. If the library loads at its preferred address, it will be mapped as shared memory and only cost one copy of the code no matter how many processes are using it.

BTW, prelink was also the fix for KDE.
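To check from inside the process which shared objects were loaded and at what base address, glibc provides dl_iterate_phdr. This is only a sketch assuming glibc on Linux; the LoadedObjects typedef and the function names are invented here, and the reported base address by itself does not prove whether a library's pages are shared or private.

```cpp
#include <link.h>     // dl_iterate_phdr (glibc extension)
#include <string>
#include <utility>
#include <vector>

// (object name, base address) for each loaded ELF object.
typedef std::vector<std::pair<std::string, unsigned long> > LoadedObjects;

// Callback invoked by dl_iterate_phdr once per loaded object.
static int collect_object(struct dl_phdr_info *info, size_t, void *data) {
    LoadedObjects *out = static_cast<LoadedObjects *>(data);
    // The main executable typically reports an empty name.
    std::string name = (info->dlpi_name != NULL && info->dlpi_name[0] != '\0')
                           ? info->dlpi_name
                           : "[unnamed]";
    out->push_back(
        std::make_pair(name, static_cast<unsigned long>(info->dlpi_addr)));
    return 0;  // returning 0 continues the iteration
}

// Returns every object the dynamic loader has mapped into this process.
LoadedObjects loaded_objects() {
    LoadedObjects result;
    dl_iterate_phdr(collect_object, &result);
    return result;
}
```

Comparing this list between the hello world build and the full server shows which extra libraries get pulled in at startup; those names can then be matched against the mappings in /proc/self/smaps to see how much private dirty memory each one costs.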

Hildahildagard answered 14/11, 2010 at 18:25 Comment(7)
Where can I learn more about base address conflicts? And don't you mean prelink instead of preload?Weeping
The Windows equivalent is BIND.Vasily
The cygwin equivalent is rebase, and for native Windows apps it is built in to the SDK tools, you can specify the base address at link time when creating the DLL or later using editbin. Windows doesn't actually prelink, but getting the preferred base address or not is still the difference between a shared section and a private copy of the post-fixup image.Hildahildagard
I thought only Windows makes memory dirty upon relocating dynamic libraries in favor of CPU speed, and that ELF systems made an explicit choice to save memory over CPU by using PIC code. I guess I need to take another look at this topic.Weeping
@Hongli: The hit may be less with PIC (-fPIE), but some fixup will still be needed. Perhaps in the code linking to the dynamic library if not the .so itself. Anyway, ELF doesn't require PIC code, are you sure that -fPIE was used during compilation (of your application and shared library)?Hildahildagard
Is -fPIE different from -fPIC? My app isn't compiled with -fPIC/-fPIE but all shared libraries are. I thought only shared libraries need to be compiled with PIC.Weeping
Shared libraries don't need to be compiled with -fPIC, but doing so prevents relocations inside code pages (I assume there's still a few variables somewhere that need to be set to point to the library base address, but only one page instead of the whole code section). See sourceware.org/ml/glibc-linux/2000-q2/msg00067.html and especially the command objdump --dynamic-reloc foo.soHildahildagard
