How to reduce compile time for large C++ library of individual .cpp files?

4

We're developing a C++ library with currently over 500 individual .cpp files. Each is compiled and archived into a static library. Even with a parallel build, this takes several minutes. I'd like to reduce this compilation time.

Each file is on average about 110 lines long and contains a function or two. However, each .cpp file has a corresponding .h header, and these headers are often included by many of the .cpp files. For example, A.h might be included by A.cpp, B.cpp, C.cpp, and so on.

I'd first like to profile the compilation process. Is there a way to find out how much time is spent doing what? I'm worried that a lot of time is wasted opening header files only to check the include guards and ignore the file.

If that sort of thing is the culprit, what are best practices for reducing compilation time?

I'm willing to add new grouping headers, but probably not willing to change this many-file layout since this allows our library to also function as an as-needed header-only library.

Forklift asked 30/7, 2015 at 20:36 Comment(5)
See this thread: #13560318 – Precarious
I'm not sure that opening files is actually what takes the time. Usually, compilation times can be reduced by including fewer heavy headers (ones containing many inline functions or template metaprogramming constructions) and by reducing coupling between headers. But if your headers only contain forward declarations, maybe that's just the 500x100x110 lines of code (according to your numbers). C++ compiles slowly, after all; just make sure each recompilation doesn't recompile something that doesn't depend on updated files. Well, I guess you'll see when you profile it. – Balcke
You may find the tup build system interesting; it's pretty fast and avoids redundancies. Some tests: gittup.org/tup/make_vs_tup.html – Harrier
Are you doing a full rebuild every time, or only rebuilding the files that need to be rebuilt? Can you reorganize the code in such a way that fewer files need to be rebuilt in response to most code changes? – Lyingin
@JeremyFriesner, I'm using cmake when developing, so I only rebuild what must be rebuilt. It's the fresh builds that annoy me, for example the nightly compilation checks. – Forklift
4

It's REALLY hard to say.

I worked on improving the compile time of our project at work and found that ONE file took 15 minutes (when compiling with -O2, but about 15 seconds with -O0) and was compiled twice, so out of a total compile time of about 60-70 minutes, that one file accounted for roughly half. Turning off ONE optimisation feature brought it down from 15 minutes to about 20 seconds... The file contained a single machine-generated function that was a few tens of thousands of lines long, which caused the compiler to spend an age on it (presumably in some O(N^2) algorithm).

This can also happen if you have a small function that calls lots of small functions in turn, which eventually, through layers of inlining, expands into a very large amount of code.

At other times, I've found that reducing the number of files and putting more code in one file works better.

In general, my experience (both with my own compiler project and with other people's/companies' compilers) is that it's NOT the parsing and reading of files that takes the time, but the various optimisation and code-generation passes. You can try that out by compiling all files with -fsyntax-only, or whatever it is called for your compiler; that will JUST read the source and check that it's syntactically correct. Also try compiling with -O0 if you aren't already. Often a specific optimisation pass is the problem, and some passes are worse than others, so it's useful to check which individual optimisation passes a particular -O option enables - in gcc they can be listed with -Q -O2 --help=optimizers [in this case for -O2].
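For example, assuming GCC or a reasonably recent Clang (flag availability varies between compilers and versions; somefile.cpp is just a placeholder), commands along these lines give a first rough breakdown of where the time goes:

g++ -fsyntax-only somefile.cpp              # front end only - no optimisation or code generation
g++ -O2 -ftime-report -c somefile.cpp       # per-pass timing summary
clang++ -O2 -ftime-trace -c somefile.cpp    # Clang 9+: emits a JSON trace viewable in chrome://tracing
g++ -Q -O2 --help=optimizers                # list which optimisation options -O2 enables

Comparing the -fsyntax-only time with the full -O2 time for the same file tells you quickly whether the front end or the optimiser dominates.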

You really need to figure out what the compiler is spending its time on. There's no point in restructuring the code if the problem is that you are spending most of the time optimising it, and there's no point in cutting down on optimisation if the time is spent in parsing and optimisation adds no extra time. Without actually building YOUR project, it's very hard to say for sure.

Another tip is to check top to see whether your compile processes are each using 100% CPU - if not, you probably don't have enough memory in your build machine. I have a build option for my work project that "kills" my desktop machine by running it so far out of memory that the whole system grinds to a halt - even switching from one tab to another in the web browser takes 15-30 seconds. The only solution is to run with a lower -j [but of course, I usually forget, and at that point, if I don't want to interrupt it, I go for lunch, a long coffee break or some such until it finishes, because the machine is just unusable]. This is for debug builds only, because putting together the debug info for the large codebase takes up a lot of memory [apparently!].

Rolanderolando answered 30/7, 2015 at 21:10 Comment(0)
2

If that sort of thing is the culprit, what are best practices for reducing compilation time?

If your pre-processor supports the #pragma once directive, use it. That will make sure that a .h file is not read more than once.
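For example (a minimal sketch), A.h would then begin with:

A.h:

#pragma once

// declarations for A...

The classic #ifndef/#define guard can be kept alongside the pragma for portability to preprocessors that don't support it.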

If not, you can use redundant (external) #include guards around the #include directives in the .cpp files, so that an already-included header is skipped without even being opened again.

Say you have

A.h:

#ifndef A_H
#define A_H

...

#endif

You can use the following method in A.cpp:

#ifndef A_H
#include "A.h"
#endif

You would need to repeat that pattern for every .h file. E.g.

#ifndef B_H
#include "B.h"
#endif

#ifndef C_H
#include "C.h"
#endif

You can read more about the use of #include guards in .cpp files at What is the function of include guard in .cpp (not in .h)?

Scopoline answered 30/7, 2015 at 20:45 Comment(3)
Thanks for the link for the reasoning behind the guards in the .cpp files. It does look redundant at first. – Tipperary
@Alain, it is very redundant. You go through these pains only when you are sure that they are worth the price. – Scopoline
Modern compilers already optimize #include so that they don't even open the same header more than once: gcc.gnu.org/onlinedocs/cppinternals/Guard-Macros.html These tricks just make the codebase harder to maintain. – Harrier
2

I don't know if you do that already, but using forward declarations instead of includes in headers should increase compilation speed. See this question for more info:

Should one use forward declarations instead of includes wherever possible?
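As a rough sketch of what this looks like (Widget and Logger are made-up names), a header that only refers to a type by pointer or reference can forward declare it instead of including its header:

// Widget.h
#pragma once

class Logger;               // forward declaration - no #include "Logger.h" here

class Widget {
public:
    explicit Widget(Logger& log);
    void draw() const;
private:
    Logger* log_;           // pointers and references only need the declaration
};

// Widget.cpp
#include "Widget.h"
#include "Logger.h"         // the full definition is only needed in the .cpp

That way, editing Logger.h forces a rebuild of Widget.cpp, but not of every file that merely includes Widget.h.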

Another way to decrease compilation time is using ccache. It caches results of previous compilations.

https://ccache.samba.org
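Since the question mentions cmake, one way to hook it up (assuming ccache is installed; CMAKE_CXX_COMPILER_LAUNCHER needs CMake 3.4 or later) is:

cmake -DCMAKE_CXX_COMPILER_LAUNCHER=ccache <source-dir>

After the first full build, clean rebuilds of unchanged files are then served from the cache.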

Onstad answered 30/7, 2015 at 20:59 Comment(0)
0

Structure your code to use the PIMPL idiom. The two primary benefits are:

  • You can hide all implementation details (member variables etc.) from the user.
  • If you change the implementation files, then generally only those files need recompiling, rather than a full rebuild.

For a good overview see here
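A minimal sketch of the idiom (class and member names are made up for illustration; std::make_unique needs C++14):

// Foo.h - stays small and stable, so files that include it rarely need recompiling
#pragma once
#include <memory>

class Foo {
public:
    Foo();
    ~Foo();                 // must be defined in the .cpp, where Impl is complete
    void doWork();
private:
    struct Impl;            // only declared here
    std::unique_ptr<Impl> impl_;
};

// Foo.cpp - the heavy includes and member data live here
#include "Foo.h"

struct Foo::Impl {
    int counter = 0;        // real members (and their headers) go here
};

Foo::Foo() : impl_(std::make_unique<Impl>()) {}
Foo::~Foo() = default;

void Foo::doWork() { ++impl_->counter; }

Changing Foo::Impl or Foo.cpp then only recompiles Foo.cpp; anything that merely includes Foo.h is untouched.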

Enjambment answered 29/8, 2019 at 1:8 Comment(0)
