GCC: Empty program == 23202 bytes?
N

10

18
test.c:

int main()
{
    return 0;
}

I haven't used any flags (I am a newb to gcc), just the command:

gcc test.c

I have used the latest TDM build of GCC on win32. The resulting executable is almost 23KB, way too big for an empty program.

How can I reduce the size of the executable?

Neomaneomah answered 22/8, 2009 at 12:57 Comment(10)
One suggestion: Do you get the same results using the minGW build of GCC? I'm not sure if that size is unusual or not, as I'm not very used to C++ either.Examinee
UPX? upx.sourceforge.netStony
Yeah, I know UPX, but the problem here is this: the compiler shouldn't generate ~23KB of junk for an empty program.Neomaneomah
Try "strip" on the executable.Safeconduct
Oh, I love C++ too :), too bad my company forces me to use C#.Neomaneomah
@Richard: use -O2 or -O3 to activate optimizations.Deberadeberry
How exactly is 23KB "way too big"? Do you need to print out the executable or something? How exactly is this a problem? Since the program is basically empty, these 23KB are effectively a one-time cost. It doesn't mean that a slightly larger program will take up 46KB. Assuming your program grows to, say, 5MB, why would you even care about reducing the size by 23KB?Tumer
1. Because I am interested in fine-tuning the compilation process, 2. Because I knew the normal overhead should not be ~23KB, 3. Because I can.Neomaneomah
Eh - if the answer is "Because I can", why are you asking the question how? Shouldn't it be "Because I want to" ?Truthvalue
I don't understand all the negative backlash to this question. Even if it's not relevant to your applications, it might be relevant to others, especially embedded systems developers. And even if you decide to do nothing about this padded code, it can still be instructive to understand why it's there. While it's for a different environment/toolset, I recommend this article: msdn.microsoft.com/en-us/magazine/cc301696.aspxGurl
Y
39

Don't follow its suggestions, but for amusement's sake, read this 'story' about making the smallest possible ELF binary.

Yucca answered 22/8, 2009 at 13:48 Comment(4)
Shit, this wasn't supposed to be taken seriously. It's now the most up-voted answer I've given!Yucca
The article linked in #553529 is also interesting.Gt
@Novelocrat Yeah, I upvoted because the link you posted was very interesting, not because I think the OP should do anything like this. I hope most of the other upvotes were for the same reason.Educable
Finally, another answer of mine has outpaced this. Now I needn't be so ashamed.Yucca
C
21

How can I reduce its size?

  • Don't do it. You are just wasting your time.
  • Use the -s flag to strip symbols (gcc -s)
Cornwell answered 22/8, 2009 at 13:8 Comment(0)
C
12

By default, some standard libraries (e.g. the C runtime) are linked with your executable. Check out the options -nostdlib, -nostartfiles and -nodefaultlibs for details. The link options are described here.

For a real program, a second option is to try the optimization options, e.g. -Os (optimize for size).
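
As a rough illustration of the first option, here is a minimal sketch of a program that supplies its own entry point instead of relying on the C runtime's. It assumes a MinGW-style toolchain whose linker uses mainCRTStartup as the default console entry symbol. The file name tiny.c and the build line are assumptions as well. Other targets use a different entry symbol (commonly _start) and may need an explicit exit system call, so treat this as a starting point, not a recipe.

```c
/* tiny.c (hypothetical file name)
 *
 * Assumed build line for a MinGW-style GCC:
 *     gcc -Os -s -nostdlib -nostartfiles -o tiny.exe tiny.c
 *
 * -nostartfiles drops the startup objects that normally call main().
 * -nostdlib drops the startup objects and the default libraries
 * (both flags are spelled out here for clarity).
 */

/* With no C runtime linked in, nothing calls main(), so the program
 * provides the entry point itself. The name below is an assumption
 * about the linker's default entry symbol on MinGW targets. */
int mainCRTStartup(void)
{
    /* No runtime setup has happened at this point: argc/argv are not
     * available and no library function may be called. */
    return 0;
}
```

Comparing the size of this executable with the plain gcc test.c build should give an idea of how much of the file is startup code and library support.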

Cysteine answered 22/8, 2009 at 13:8 Comment(4)
That's right. These keys I've used only for embedded systems.Cysteine
What do you recommend to start with? (I am new to GCC, but I have used C a lot in VisualCpp before)Neomaneomah
If you're familiar with C, it is appropriate to start by learning the differences between gcc and VisualCpp.Cysteine
Exactly, Kristof. It's rather pointless. Learning how to make empty programs as small as possible doesn't necessarily translate into knowledge of how to make non-trivial programs small. All you're left with is a bunch of empty programs. Focus on getting something worth fine-tuning, first.Carpospore
Q
12

Give up. On x86 Linux, gcc 4.3.2 produces a 5K binary. But wait! That's with dynamic linking! The statically linked binary is over half a meg: 516K. Relax and learn to live with the bloat.

And they said Modula-3 would never go anywhere because of a 200K hello world binary!


In case you wonder what's going on, the GNU C library is structured so as to include certain features whether your program depends on them or not. These features include such trivia as malloc and free, dlopen, some string processing, and a whole bucketload of stuff that appears to have to do with locales and internationalization, although I can't find any relevant man pages.

Creating small executables for programs that require minimum services is not a design goal for glibc. To be fair, it has also not been a design goal for any run-time system I've ever worked with (about half a dozen).

Quiteris answered 22/8, 2009 at 18:59 Comment(0)
R
9

Actually, if your code does nothing, is it even fair that the compiler still creates an executable? ;-)

Well, on Windows any executable would still have a size, although it can be reasonably small. With the old MS-DOS system, a complete do-nothing application would just be a couple of bytes. (I think four bytes to use the 21h interrupt to close the program.) Then again, those applications were loaded straight into memory. When the EXE format became more popular, things changed a bit. Now executables had additional information about the process itself, like the relocation of code and data segments plus some checksums and version information. The introduction of Windows added another header to the format, to tell MS-DOS that it couldn't execute the executable since it needed to run under Windows. And Windows would recognize it without problems. Of course, the executable format was also extended with resource information, like bitmaps, icons and dialog forms and much, much more.

A do-nothing executable would nowadays be between 4 and 8 kilobytes in size, depending on your compiler and every method you've used to reduce its size. It would be at a size where UPX would actually result in bigger executables! Additional bytes in your executable might be added because you added certain libraries to your code. Especially libraries with initialized data or resources will add a considerable amount of bytes. Adding debug information also increases the size of the executable.

But while this all makes a nice exercise in reducing size, you could wonder if it's practical to just continue to worry about bloatedness of applications. Modern hard disks will divide files up in segments and for really large disks, the difference would be very small. However, the amount of trouble it would take to keep the size as small as possible will slow down development speed, unless you're an expert developer who is used to these optimizations. These kinds of optimizations don't tend to improve performance and considering the average disk space of most systems, I don't see why it would be practical. (Still, I do optimize my own code in similar ways but then again, I am experienced with these optimizations.)


Interested in the EXE header? It starts with the letters MZ, for "Mark Zbikowski". The first part is the old-style MS-DOS header for executables and is used as a stub that tells MS-DOS the program is not an MS-DOS executable. (In the binary, you can find the text 'This program cannot be run in DOS mode.', and displaying that message is basically all the stub does.) Next is the PE header, which Windows will recognise and use instead of the MS-DOS header. It starts with the letters PE, for Portable Executable. After this second header comes the executable itself, divided into several blocks of code and data. The header contains special relocation tables which tell the OS where to load a specific block. And if you can keep this to a limit, the final executable can be smaller than 4 KB, but 90% would then be header information and no functionality.
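
To make the two-header layout a bit more concrete, here is a small sketch that checks whether a buffer looks like a PE image. It relies on the documented layout in which the MS-DOS header stores the file offset of the PE signature as a 4-byte little-endian value at offset 0x3C. The function name looks_like_pe and the macro name are made up for this illustration, and the check is deliberately shallow: it validates the two signatures and nothing else.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Offset of the 4-byte little-endian field in the MS-DOS header that
 * holds the file offset of the PE signature. */
#define PE_OFFSET_FIELD 0x3C

/* Returns 1 if buf looks like a PE image: "MZ" at offset 0 and
 * "PE\0\0" at the offset stored at PE_OFFSET_FIELD. Returns 0 otherwise.
 * (Hypothetical helper, not part of any library.) */
int looks_like_pe(const unsigned char *buf, size_t len)
{
    uint32_t pe_off;

    if (len < PE_OFFSET_FIELD + 4)
        return 0;
    if (buf[0] != 'M' || buf[1] != 'Z')
        return 0;

    /* Assemble the little-endian offset byte by byte so the code does
     * not depend on host endianness or alignment. */
    pe_off = (uint32_t)buf[PE_OFFSET_FIELD]
           | ((uint32_t)buf[PE_OFFSET_FIELD + 1] << 8)
           | ((uint32_t)buf[PE_OFFSET_FIELD + 2] << 16)
           | ((uint32_t)buf[PE_OFFSET_FIELD + 3] << 24);

    /* The 4-byte signature must lie entirely inside the buffer. */
    if (pe_off > len || len - pe_off < 4)
        return 0;

    return memcmp(buf + pe_off, "PE\0\0", 4) == 0;
}
```

To try it on a real file, read the first few hundred bytes of the executable into a buffer with fread and pass the buffer and the number of bytes read to the function.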
Rickrickard answered 22/8, 2009 at 13:44 Comment(4)
As for a DOS application, a simple ret will do. That is, 1 byte.Castillo
A ret would do, but the official rule was that you had to call the "Exit" interrupt.Rickrickard
I've built real Windows executables (PE format) that do useful things in <4KB, using VS2005. So a do-nothing executable certainly doesn't have to be 8KB. (Why? Autorun checker for a CD, don't start a large installer EXE if app is already installed)Truthvalue
The code does not do nothing - it returns zero to the environment.Halftone
N
3

I like the way the DJGPP FAQ addressed this many many years ago:

In general, judging code sizes by looking at the size of "Hello" programs is meaningless, because such programs consist mostly of the startup code. ... Most of the power of all these features goes wasted in "Hello" programs. There is no point in running all that code just to print a 15-byte string and exit.

Napoleonnapoleonic answered 22/8, 2009 at 13:48 Comment(2)
The whole point of the empty program is to see the overhead. I am simply interested in how the compilation works, what ends up in a compiled binary aside from the code I put there.Neomaneomah
Richard, that's not at all what you asked in your question. You asked how to get rid of the overhead. You didn't ask what the overhead consisted of.Carpospore
K
2

What is the purpose of this exercise?

Even with as low a level language as C, there's still a lot of setup that has to happen before main can be called. Some of that setup is handled by the loader (which needs certain information), some is handled by the code that calls main. And then there's probably a little bit of library code that any normal program would have to have. At the least, there are probably references to the standard libraries, if they are in DLLs.

Examining the binary size of the empty program is a worthless exercise in and of itself. It tells you nothing. If you want to learn something about code size, try writing non-empty (and preferably non-trivial) programs. Compare programs that use standard libraries with programs that do everything themselves.

If you really want to know what's going on in that binary (and why it's so big), then find out the executable format, get a binary dump tool, and take the thing apart.

Kosher answered 22/8, 2009 at 13:46 Comment(4)
Given that you don't know the OP's motivations, that's simply not true. He might be interested in getting into embedded development, where code size matters a lot, for instance.Yucca
Code size of the empty program is still completely irrelevant. And if he's into embedded programming where size of program matters, then anything he does fooling with a windows compiler is irrelevant.Kosher
Code size of an empty program is not irrelevant when 1. you code demos, 2. you are interested in how the compilation works, what ends up in the final executable, 3. and finally when you know that an empty program should not be ~23KB. There might be no obvious uses of something like this, but it doesn't make learning about the compiler flags irrelevant.Neomaneomah
Richard, why do you code empty programs as demos? And if you don't know what's in the final executable, then how do you know it shouldn't be 23 K? And if you haven't learned why it was 23 K, then perhaps it's because you never asked.Carpospore
H
2

What does 'size a.out' tell you about the size of the code, data, and bss segments? The majority of the code is likely to be the start up code (classically crt0.o on Unix machines) which is invoked by the o/s and does set up work (like sorting out command line arguments into argc, argv) before invoking main().
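
As a small illustration of what ends up in each column of the size output, the sketch below shows one object of each kind. The placements in the comments describe typical GCC behaviour and are assumptions, not guarantees: the exact layout depends on the compiler version, the flags and the target. All names are invented for the example.

```c
/* Initialized, writable global: its bytes are stored in the file and
 * are usually counted under "data". */
int initialized_counter = 42;

/* Zero-initialized global: usually counted under "bss", which takes up
 * address space at run time but almost no room in the file. */
char zeroed_buffer[4096];

/* Function body: machine code, usually counted under "text". */
int read_counter(void)
{
    return initialized_counter + zeroed_buffer[0];
}
```

Changing the array size and running size on the object file again should move the bss figure while leaving the file size nearly unchanged, which is a quick way to see that the bulk of an empty program is not coming from its own code or data.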

Halftone answered 24/8, 2009 at 22:28 Comment(0)
I
1

Run strip on the binary to get rid of the symbols. With gcc version 3.4.4 (cygming special) I drop from 10K to 4K.

You can try linking a custom runtime (the part that calls main) to set up your runtime environment. All programs use the same one that comes with gcc to set up the runtime environment, but for your executable you don't need data or zeroed memory. This means you could get rid of unused library functions like memset/memcpy and reduce the CRT0 size. When looking for info on this, look at GCC in embedded environments. Embedded developers are generally the only people who use custom runtime environments.

The rest is overhead for the OS that loads the executable. You are not going to save much there unless you tune that by hand.

Islas answered 24/8, 2009 at 8:2 Comment(0)
Y
0

Using GCC, compile your program using -Os rather than one of the other optimization flags (-O2 or -O3). This tells it to optimize for size rather than speed. Incidentally, it can sometimes make programs run faster than the speed optimizations would have, if some critical segment happens to fit more nicely. On the other hand, -O3 can actually induce code-size increases.

There might also be some linker flags telling it to leave out unused code from the final binary.
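
One combination that is commonly used for this with the GNU toolchain is -ffunction-sections and -fdata-sections at compile time together with -Wl,--gc-sections at link time. The sketch below only sets up something for those flags to act on. The function names are invented, and whether the unused function really disappears depends on the target and linker version, so check the result instead of assuming it.

```c
/* Assumed flags (GNU toolchain):
 *     -Os -ffunction-sections -fdata-sections -Wl,--gc-sections
 *
 * -ffunction-sections / -fdata-sections place each function and data
 * item in its own section; -Wl,--gc-sections asks the linker to discard
 * sections that nothing references.
 */

/* Never referenced by the rest of the program: a candidate for removal
 * by section garbage collection. */
int unused_helper(int x)
{
    return x * 31 + 7;
}

/* Intended to be called from main(): stays in the binary when it is
 * referenced. */
int used_helper(int x)
{
    return x + 1;
}
```

Linking these two functions into a program whose main() calls only used_helper, and then comparing the output of size or nm with and without the flags, should show whether unused_helper was dropped.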

Yucca answered 22/8, 2009 at 14:10 Comment(1)
Unsurprising, in this case. There's not much code that GCC is actually touching here.Yucca
