Tools to find included headers which are unused? [closed]
D

9

77

I know PC-Lint can tell you about headers which are included but not used. Are there any other tools that can do this, preferably on Linux?

We have a large codebase that over the last 15 years has seen plenty of functionality move around, but the leftover #include directives rarely get removed when functionality moves from one implementation file to another, leaving us with a pretty good mess by this point. I can obviously do the painstaking thing of removing all the #include directives and letting the compiler tell me which ones to re-include, but I'd prefer to solve the problem in reverse: find the unused ones rather than rebuild the list of used ones.

Dethrone answered 19/8, 2009 at 18:38 Comment(7)
It is notoriously hard to find something that isn't there.Whity
This is a problem I've hit before, and not yet found a 100% reliable automated solution - I'm interested to see what answers we get.Engeddi
@Neil: In general that's true, but in this specific case it's not that hard (in the abstract). You "merely" identify all the symbols in the file, match them against the headers that satisfy them, and then prune out the headers that weren't used in that process. Of course, in reality it's complicated because you need a C/C++ parser and the definition of "required" is looser than you would want to make this process "easy".Dethrone
@Nick and then you have headers which are used only on a platform or when compiling in some configuration, you have headers which provide all their symbols by including private headers which client code shouldn't include directly, you have headers which include another to be self-sufficient but you don't use the interface for which that other include is needed, ...Chamorro
@AProgrammer: Only being used on one platform is relatively easy to resolve - an analysis tool is going to preprocess them right out anyhow (which should also happen in your "some configuration" case). I'm not looking for headers that are listed in the file but properly preprocessed out - I'm looking for headers which include completely unnecessary source in the finished object code. Also, as for private headers, that's fine - they'll still be "used" in most cases (or they were unnecessary - a useful thing to know).Dethrone
possible duplicate of How should I detect unnecessary #include files in a large C++ project?Prickett
possible duplicate of C/C++: Detecting superfluous #includes?Frechette
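The matching-and-pruning idea from the comments above can be sketched in a few lines. This is only a toy model: it assumes some parser has already produced, for one source file, the set of names the file uses and the names each included header declares. `HeaderSymbols` and `unused_headers` are hypothetical names, and plain name matching ignores overload sets, macros, and template specializations, which is exactly where real tools have to work harder.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical input: header name -> names that header declares.
using HeaderSymbols = std::map<std::string, std::set<std::string>>;

// Returns the headers that declare none of the names the file uses.
// Assumption: matching by plain name is good enough for this sketch.
inline std::vector<std::string> unused_headers(const HeaderSymbols& provided,
                                               const std::set<std::string>& used) {
    std::vector<std::string> unused;
    for (const auto& entry : provided) {
        bool needed = false;
        for (const auto& symbol : entry.second) {
            if (used.count(symbol) != 0) {
                needed = true;  // at least one declared name is used
                break;
            }
        }
        if (!needed) {
            unused.push_back(entry.first);
        }
    }
    return unused;
}
```

With `a.h` declaring `foo`, `b.h` declaring `bar`, and a file that uses only `foo`, the function reports `b.h` as the only candidate for removal.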
M
31

DISCLAIMER: My day job is working for a company that develops static analysis tools.

I would be surprised if most (if not all) static analysis tools did not have some form of header usage check. You could use this Wikipedia page to get a list of available tools and then email the companies to ask them.

Some points you might consider when you're evaluating a tool:

For function overloads, you want all headers containing overloads to be visible, not just the header that contains the function that was selected by overload resolution:

// f1.h
void foo (char);

// f2.h
void foo (int);


// bar.cc
#include "f1.h"
#include "f2.h"

int main ()
{
  foo (0);  // Calls 'foo(int)' but all functions were in overload set
}

If you take the brute-force approach (remove all headers, then re-add them one at a time until the code compiles) and 'f1.h' happens to be added first, the code compiles, but the semantics of the program have changed: 'foo (0)' now calls 'foo(char)'.
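The effect can be checked at compile time with a single-file sketch. The two headers are inlined here, and each `foo` returns a tag instead of `void` so the selected overload is observable; `Picked`, `call_site`, and the `DROP_F2` macro are names made up for this illustration.

```cpp
// Compile with -DDROP_F2 to simulate deleting #include "f2.h".

enum class Picked { CharOverload, IntOverload };

// --- stand-in for f1.h ---
constexpr Picked foo(char) { return Picked::CharOverload; }

// --- stand-in for f2.h ---
#ifndef DROP_F2
constexpr Picked foo(int) { return Picked::IntOverload; }
#endif

// --- stand-in for bar.cc ---
constexpr Picked call_site() { return foo(0); }

#ifndef DROP_F2
static_assert(call_site() == Picked::IntOverload,
              "with f2.h visible, foo(int) is an exact match for 0");
#else
static_assert(call_site() == Picked::CharOverload,
              "without f2.h, 0 is converted to char and foo(char) is called");
#endif
```

Both configurations compile cleanly, which is the point: the compiler gives no hint that the call now resolves to a different function.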

A similar rule applies when you have partial and explicit specializations. It doesn't matter whether a specialization is selected or not; you need to make sure that all specializations are visible:

// f1.h
template <typename T>
void foo (T);

// f2.h
template <>
void foo (int);

// bar.cc
#include "f1.h"
#include "f2.h"


int main ()
{
  foo (0);  // Calls specialization 'foo<int>(int)'
}

As with the overload example, the brute-force approach may result in a program that still compiles but behaves differently.
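The same silent change can happen with a partial specialization of a class template. The sketch below is a hypothetical single-file stand-in (the `Storage` template, `kind_for_int_pointer`, and the `DROP_SPECIALIZATION` macro are invented for illustration): if the header carrying the specialization is dropped, the primary template is used instead and nothing fails to compile.

```cpp
// Compile with -DDROP_SPECIALIZATION to simulate removing the second header.

// --- stand-in for the header with the primary template ---
template <typename T>
struct Storage {
    static constexpr int kind = 0;  // generic case
};

// --- stand-in for the header with the partial specialization ---
#ifndef DROP_SPECIALIZATION
template <typename T>
struct Storage<T*> {
    static constexpr int kind = 1;  // any pointer type
};
#endif

// --- stand-in for the client code ---
constexpr int kind_for_int_pointer() { return Storage<int*>::kind; }

#ifndef DROP_SPECIALIZATION
static_assert(kind_for_int_pointer() == 1, "the partial specialization is selected");
#else
static_assert(kind_for_int_pointer() == 0, "only the primary template is visible");
#endif
```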

Another related type of analysis that you can look out for is checking if types can be forward declared. Consider the following:

// A.h
class A { };

// foo.h
#include "A.h"
void foo (A const &);

// bar.cc
#include "foo.h"

void bar (A const & a)
{
  foo (a);
}

In the above example, the definition of 'A' is not required, and so the header file 'foo.h' can be changed so that it has a forward declaration only for 'A':

// foo.h
class A;
void foo (A const &);

This kind of check also reduces header dependencies.
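As a rough guide to when a forward declaration is enough, the sketch below lists a few uses that compile while 'A' is still an incomplete type, and a few that would not. `Handle` and the commented-out `Owner`, `baz`, and `copy` are hypothetical names added for illustration.

```cpp
class A;  // forward declaration only; A is an incomplete type in this file

// Uses that compile with an incomplete type:
void foo(A const&) {}             // reference parameter (unnamed, unused here)
void bar(A const& a) { foo(a); }  // passing the reference along
struct Handle { A* target; };     // pointer member

// Uses that would require #include "A.h" (commented out on purpose):
//   struct Owner { A value; };       // by-value member needs the size of A
//   void baz(A const& a) { a.f(); }  // member access needs the definition
//   A copy(A const& a) { return a; } // defining a function that returns A by value

static_assert(sizeof(Handle) == sizeof(A*),
              "a pointer member does not need the definition of A");
```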

Majestic answered 20/8, 2009 at 9:23 Comment(3)
Most that I have looked at do not have a header usage check of this nature. You make a very good point about overloads and specializations, but thankfully our conventions are such that these would basically never be in different headers.Dethrone
Also, I've been down the road with that wikipedia page. The C/C++ section is very weak...I suppose I should go down the list of commercial providers and see which ones support C++. Also, I'm perfectly fine with people suggesting their own product - it's more than I had to go on before, and your advice in general is very informative.Dethrone
"For function overloads, you want all headers containing overloads to be visible, not just the header that contains the function that was selected by overload resolution: ..." +1, that's potential nightmare to debug and a big reason I'm afraid of doing this myselfWylma
T
23

Here's a script that does it:

#!/bin/bash
# prune include files one at a time, recompile, and put them back if it doesn't compile
# arguments are list of files to check
removeinclude() {
    file=$1
    header=$2
    # comment out the matching #include; \Q...\E keeps '.', '+' etc. in the header name literal
    HEADER="$header" perl -i -p -e 's|^([ \t]*#include[ \t]+["<]\Q$ENV{HEADER}\E[">])|//REMOVEINCLUDE $1|' "$file"
}
replaceinclude() {
    file=$1
    # restore the #include that was commented out
    perl -i -p -e 's|//REMOVEINCLUDE ||' "$file"
}

for file in "$@"
do
    includes=$(grep '^[[:space:]]*#include' "$file" | awk '{print $2;}' | sed 's/["<>]//g')
    echo $includes
    for i in $includes
    do
        touch "$file" # just to be sure it recompiles
        removeinclude "$file" "$i"
        if make -j10 >/dev/null 2>&1
        then
            # build still succeeds: drop the commented-out line for good
            grep -v REMOVEINCLUDE "$file" > "$file.tmp" && mv "$file.tmp" "$file"
            echo "removed $i from $file"
        else
            replaceinclude "$file"
            echo "$i was needed in $file"
        fi
    done
done
Trinidad answered 21/8, 2011 at 0:13 Comment(4)
I've used the same method myself. If you're using GCC, be sure to compile with -Werror=missing-prototypes; otherwise you can remove the headers that declare functions defined in the source file, which can cause problems later (you won't notice if the header gets out of sync).Beacham
Nice! It probably will not scale well on a big project, but it is exactly what I needed for a small project. Thanks! However, I think it should first be run on all the .h files and then on all the .cpp files...Necrotomy
This is great. I wonder if this brute force solution could be extended to handle the cases in the comments of https://mcmap.net/q/93678/-tools-to-find-included-headers-which-are-unused-closed above.Feinstein
That does not work in every case. Consider a header that consists of two #includes: if your code only needs one of them, you could replace "funcs.h" with "func1.h" and drop the dependency on "func2.h", but this script will not find that.Facetiae
C
5

Have a look at Dehydra.

From the website:

Dehydra is a lightweight, scriptable, general purpose static analysis tool capable of application-specific analyses of C++ code. In the simplest sense, Dehydra can be thought of as a semantic grep tool.

It should be possible to come up with a script that checks for unused #include files.

Cuba answered 23/2, 2010 at 20:52 Comment(2)
Development was abandoned in 2010.Rameau
Development changed to DXR.Mccurry
V
5

Google's cppclean seems to do a decent job of finding unused header files. I just started using it. It produces a few false positives. It will often find unnecessary includes in header files, but what it will not tell you is that you need a forward declaration of the associated class, and the include needs to be moved to the associated source file.

Vinculum answered 18/8, 2011 at 17:38 Comment(8)
cppclean cleans too much. If I have a header file foo.h that explicitly uses functionality/types defined in bar.h and in baz.h I would expect to see foo.h have a #include "bar.h" and a #include "baz.h". Suppose that bar.h also happens to #include "baz.h". This does not mean that I can get rid of the #include "baz.h" in foo.h. Most unneeded header file checks will say that I should get rid of it. This is a false positive, which are almost as bad as false negatives. (And maybe worse; too many false positives and I'll stop using the tool. Lint is a good example.)Minetta
@David, I agree that it produces many false positives, but I feel that it is better than manually examining each file, and false positives are quickly spotted and remedied. Do you have something you use that works pretty well?Vinculum
Today my file may be able to get by without #include "baz.h". Tomorrow, maybe not. Suppose bar.h is your responsibility and you too are in the process of removing unneeded headers. Your bar.h doesn't need baz.h, so you delete the extraneous #include "baz.h" from bar.h. You have just broken any code that piggybacked on that extraneous #include. The solution is not to rely on such piggybacks. If your file uses some functionality defined elsewhere, #include the file that defines that functionality. Don't let some other header do that #include for you.Minetta
@DavidHammen, if I don't want to rely on include piggybacks, then isn't a tool like cppclean the way to do it? Yes, if I remove some header that's unnecessary, it'll break a lot of stuff, but really, those files shouldn't have included the piggyback in the first place. So, I go and fix those files that relied on that include. If I truly want better dependency management, that's what I should be doing anyway. I'm confused, because you say piggybacks are bad, but then you say not to use a tool that forces me to eliminate them.Vinculum
I don't think you understand the problem. Automated tools sometimes miss the obvious, sometimes go too far. They give false positives and false negatives. You can use such tools, but always take them with a grain of salt.Minetta
This project seems to have disappeared (no source here: code.google.com/p/cppclean/source/checkout) and that seems to be a known issue: code.google.com/p/cppclean/issues/detail?id=3#makechangesAlexandrite
As David said, it's a known issue but still available via svn: svn checkout http://cppclean.googlecode.com/svn/trunk/ cppclean-read-onlyRameau
As of today, the SVN does not respond anymore either.Herbartian
C
3

If you are using Eclipse CDT you can try Includator which is free for beta testers (at the time of this writing) and automatically removes superfluous #includes or adds missing ones.

Disclaimer: I work for the company that develops Includator and have been using it for the past few months. It works quite well for me, so give it a try :-)

Cline answered 1/6, 2011 at 13:36 Comment(0)
B
1

As far as I know, there isn't one (that isn't PC-Lint), which is a shame, and surprising. I've seen the suggestion to do this bit of pseudocode (which basically automates your "painstaking process"):

for every cpp file
    for every header include
        comment out the include
        compile the cpp file
        if( compile_errors )
            un-comment out the header
        else
            remove header include from cpp

Put that in a nightly cron job and it should do the job, keeping the project in question free of unused headers (you can always run it manually, obviously, but it'll take a long time to execute). The only problem is the case where removing a header doesn't generate an error but still changes the code that is produced.

Bonnell answered 19/8, 2009 at 18:56 Comment(3)
That still unfortunately doesn't clean up headers that include other headers that aren't required (and worse, may cause some "programming by coincidence" in other implementation files that get the headers they need through other headers that actually don't need them). It at least reduces the number of spurious includes in cpp files, but I would like to eliminate them in other headers as well.Dethrone
It's also inadvisable to remove every header this way. Consider #include <vector> and #include <algorithm>. In some implementations <vector> includes <algorithm>, but that isn't guaranteed. Robust code should include both (if they're both used). Your described method could remove #include <algorithm>, depending on the implementation of <vector>.Tray
This is true. Nick, are you more concerned with local header files (or do you at least have a lot of them)? If so, you could modify the above algorithm to not mess with library headers, and tune those manually. It's a pain, but it would cut the work down, at least.Bonnell
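To make the `<vector>` / `<algorithm>` point from the comments concrete, here is a minimal sketch of code that names both headers because it uses both. `sorted_copy` is a hypothetical helper; on some standard library implementations this would still compile with `<algorithm>` removed, which is why a "does it still build?" test can wrongly report that include as unused.

```cpp
#include <algorithm>  // std::sort is used directly, so include it directly
#include <vector>     // std::vector is used directly as well

// Returns a sorted copy of the input; relies on both headers above.
inline std::vector<int> sorted_copy(std::vector<int> values) {
    std::sort(values.begin(), values.end());
    return values;
}
```

Keeping a list of standard library headers that the pruning script must never touch is one way to avoid this, along the lines suggested in the last comment.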
C
1

I've done this manually and it's worth it in the short term (or is it the long term? It takes a long time) due to reduced compile time:

  1. Fewer headers to parse for each cpp file.
  2. Fewer dependencies - the whole world doesn't need re-compiling after a change to one header.

It's also a recursive process - each header file that stays in needs examining to see if any header files it includes can be removed. Plus, sometimes you can substitute forward declarations for header includes.

Then the whole process needs repeating every few months/year to keep on top of leftover headers.

Actually, I'm a bit annoyed with C++ compilers, they should be able to tell you what's not needed - the Microsoft compiler can tell you when a change to a header file can be safely ignored during compilation.

Charmine answered 19/8, 2009 at 19:43 Comment(0)
S
0

If anyone is interested, I just put a small Java command-line tool for doing exactly that on SourceForge. As it is written in Java, it obviously runs on Linux.

The link for the project is https://sourceforge.net/projects/chksem/files/chksem-1.0/

Scabious answered 16/8, 2013 at 7:41 Comment(0)
O
-1

Most approaches for removing unused includes work better if you first make sure that each of your header files compiles on its own. I did this relatively quickly as follows (apologies for typos -- I am typing this at home):

find . -name '*.h' -exec makeIncluder.sh {} \;

where makeIncluder.sh contains:

#!/bin/sh
echo "#include \"$1\"" > "$1.cpp"

For each file ./subdir/classname.h, this approach creates a file called ./subdir/classname.h.cpp containing the line

#include "./subdir/classname.h"

If your makefile in the . directory compiles all cpp files and contains -I. , then just recompiling will test that every include file can compile on its own. Compile in your favorite IDE with goto-error, and fix the errors.

When you're done, find . -name '*.h.cpp' -exec rm {} \;

Ortega answered 19/9, 2013 at 2:17 Comment(1)
I fail to see how this helps. Even if some of my headers didn't compile on their own (which isn't the case anyhow), this wouldn't provide any additional insight into the unused ones.Dethrone
