Small Haskell program compiled with GHC into huge binary

Even trivially small Haskell programs turn into gigantic executables.

I've written a small program that GHC compiled to a binary of more than 7 MB!

What can cause even a small Haskell program to be compiled to such a huge binary?

What, if anything, can I do to reduce this?

Weirick answered 24/5, 2011 at 19:0 Comment(5)
Have you tried just stripping it? – Lamkin
Run the program strip on the binary to remove the symbol table. – Lamkin
@tm1rbt: Run strip test. This command removes some debug information from the program and makes it smaller. – Flashover
As an aside, your data types in the 3D math library should be stricter for performance reasons: data M3 = M3 !V3 !V3 !V3 and data V3 = V3 !Float !Float !Float. Compile with ghc -O2 -funbox-strict-fields. – Linger
This post is discussed on meta. – Espresso
L
227

Let's see what's going on. Try:

  $ du -hs A
  13M   A

  $ file A
  A: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), 
     dynamically linked (uses shared libs), for GNU/Linux 2.6.27, not stripped

  $ ldd A
    linux-vdso.so.1 =>  (0x00007fff1b9ff000)
    libXrandr.so.2 => /usr/lib/libXrandr.so.2 (0x00007fb21f418000)
    libX11.so.6 => /usr/lib/libX11.so.6 (0x00007fb21f0d9000)
    libGLU.so.1 => /usr/lib/libGLU.so.1 (0x00007fb21ee6d000)
    libGL.so.1 => /usr/lib/libGL.so.1 (0x00007fb21ebf4000)
    libgmp.so.10 => /usr/lib/libgmp.so.10 (0x00007fb21e988000)
    libm.so.6 => /lib/libm.so.6 (0x00007fb21e706000)
    ...      

You see from the ldd output that GHC has produced a dynamically linked executable, but only the C libraries are dynamically linked! All the Haskell libraries are copied in verbatim.

Aside: since this is a graphics-intensive app, I'd definitely compile with ghc -O2

There are two things you can do.

Stripping symbols

An easy solution: strip the binary:

$ strip A
$ du -hs A
5.8M    A

Strip discards symbols from the object file. They are generally only needed for debugging.
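
To see how much stripping would save without touching the original file, a small shell sketch along these lines can work on a temporary copy. The name strip_report is made up for illustration, and the sketch assumes that strip (from binutils) and mktemp are installed:

```shell
# strip_report is a hypothetical helper, not part of GHC or binutils.
# It strips a temporary copy of the given binary and prints both sizes in bytes.
strip_report() {
    bin="$1"
    tmp="$(mktemp)" || return 1
    cp "$bin" "$tmp" || { rm -f "$tmp"; return 1; }
    # The arithmetic expansion drops any padding that wc may print.
    before=$(( $(wc -c < "$tmp") ))
    # If strip is missing or rejects the file, the copy is left unchanged.
    strip "$tmp" 2>/dev/null || true
    after=$(( $(wc -c < "$tmp") ))
    rm -f "$tmp"
    echo "before=$before after=$after"
}
```

Calling it as strip_report ./A prints one line of the form before=N after=M, so the two byte counts can be compared before running strip on the real binary.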

Dynamically linked Haskell libraries

More recently, GHC has gained support for dynamic linking of both C and Haskell libraries. Most distros now distribute a version of GHC built to support dynamic linking of Haskell libraries. Shared Haskell libraries may be shared amongst many Haskell programs, without copying them into the executable each time.

At the time of writing, Linux and Windows are supported.

To have the Haskell libraries linked dynamically, you need to compile your program with -dynamic, like so:

 $ ghc -O2 --make -dynamic A.hs

Also, any libraries you want to be shared should be built with --enable-shared:

 $ cabal install opengl --enable-shared --reinstall     
 $ cabal install glfw   --enable-shared --reinstall

And you'll end up with a much smaller executable that has both its C and Haskell dependencies dynamically resolved.

$ ghc -O2 -dynamic A.hs                         
[1 of 4] Compiling S3DM.V3          ( S3DM/V3.hs, S3DM/V3.o )
[2 of 4] Compiling S3DM.M3          ( S3DM/M3.hs, S3DM/M3.o )
[3 of 4] Compiling S3DM.X4          ( S3DM/X4.hs, S3DM/X4.o )
[4 of 4] Compiling Main             ( A.hs, A.o )
Linking A...

And, voilà!

$ du -hs A
124K    A

which you can strip to make even smaller:

$ strip A
$ du -hs A
84K A

An eensy weensy executable, built up from many dynamically linked C and Haskell pieces:

$ ldd A
    libHSOpenGL-2.4.0.1-ghc7.0.3.so => ...
    libHSTensor-1.0.0.1-ghc7.0.3.so => ...
    libHSStateVar-1.0.0.0-ghc7.0.3.so =>...
    libHSObjectName-1.0.0.0-ghc7.0.3.so => ...
    libHSGLURaw-1.1.0.0-ghc7.0.3.so => ...
    libHSOpenGLRaw-1.1.0.1-ghc7.0.3.so => ...
    libHSbase-4.3.1.0-ghc7.0.3.so => ...
    libHSinteger-gmp-0.2.0.3-ghc7.0.3.so => ...
    libHSghc-prim-0.2.0.0-ghc7.0.3.so => ...
    libHSrts-ghc7.0.3.so => ...
    libm.so.6 => /lib/libm.so.6 (0x00007ffa4ffd6000)
    librt.so.1 => /lib/librt.so.1 (0x00007ffa4fdce000)
    libdl.so.2 => /lib/libdl.so.2 (0x00007ffa4fbca000)
    libHSffi-ghc7.0.3.so => ...
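
To check quickly whether a given binary picked up its Haskell libraries dynamically, the ldd output can be filtered for the libHS prefix seen in the listing above. This is only a sketch: hs_shared_libs is a hypothetical helper name, and it assumes the shared libraries follow that libHS naming.

```shell
# hs_shared_libs is a hypothetical helper: it reads ldd-style output on stdin
# and prints the first field of every line naming a library prefixed libHS.
hs_shared_libs() {
    awk '$1 ~ /^libHS/ { print $1 }'
}
```

Used as ldd A | hs_shared_libs, it lists only the Haskell shared objects. An empty result suggests the Haskell libraries were linked in statically.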

One final point: even on systems with static linking only, you can use -split-objs to get one .o file per top-level function, which can further reduce the size of statically linked libraries. This requires GHC itself to have been built with -split-objs on, which some systems forget to do.

Linger answered 24/5, 2011 at 19:20 Comment(10)
When is dynamic linking due to arrive for ghc on the mac? – Kulun
...doesn't cabal install strip the installed binary by default? – East
Doing so on Windows seems to make the resulting file un-runnable; it complains about a missing libHSrts-ghc7.0.3.dll. – Articulator
Apparently the DLLs for the shared libraries are kept within the original package folders of these libraries. Is there any way to force ghc to store these libraries in a certain location, such as "C:\Windows\System32" for example? – Articulator
@East Good question! And how can we do the same with cabal, without ghc (for example, Snap projects are built with cabal install)? – Sibyls
I keep getting the Could not find module `Prelude' Perhaps you haven't installed the "dyn" libraries for package `base'? error message... – Sibyls
@Articulator To get this to work on Windows, the DLL files must be in your path, so either you can copy all the DLLs to an existing PATH location, or set PATH to point to all the locations (there are a slew of DLLs in a lot of folders). I think it's a lot easier on Unix as the shared object files are installed into standard locations. – Waterproof
Will this binary still work on other Linux machines after these procedures? – Unabridged
Hi OP from 2011! I'm from the future and can tell you that the pandoc executable on Ubuntu 16.04 is a fat 50MB, and that's not going to change, based on packages.ubuntu.com/zesty/pandoc. Message to near-future self and others: contact the package maintainer and ask if enable-shared was considered. launchpad.net/ubuntu/+source/pandoc/+bugs – Cherellecheremis
For those on macs, the equivalent of ldd appears to be otool -L – Denbighshire
F
13

Haskell uses static linking by default. That is, the entire OpenGL bindings are copied into your program. As they are quite big, your program gets unnecessarily inflated. You can work around this by using dynamic linking, although it isn't enabled by default.

Flashover answered 24/5, 2011 at 19:9 Comment(2)
You can dynamically link libraries to work around this. Not sure why it matters what the default is; the flag is simple enough. – Cinchonism
The problem is that "any libraries you want to be shared should be built with --enable-shared", so if your Haskell Platform comes with libraries built without --enable-shared you have to recompile the base libraries, which can be quite painful. – Ontina
