Tool to parse C++ source and move in-header inline methods to the .cpp source file? [closed]

4

13

The source code of our application is hundreds of thousands of lines across thousands of files, and in places very old - the app was first written in 1995 or 1996. Over the past few years my team has greatly improved the quality of the source, but one issue remains that particularly bugs me: a lot of classes have a lot of methods fully defined in their header file.

I have no problem with methods declared inline in a header in some cases - a struct's constructor, a simple method where inlining measurably makes it faster (we have some math functions like this), etc. But the liberal use of inlined methods for no apparent reason is:

  • Messy
  • Makes it hard to find the implementation of a method (especially searching through a tree of classes for a virtual function, only to find one class had its version declared in the header...)
  • Probably increases the compiled code size
  • Probably causes issues for our linker, which is notoriously flaky for large codebases. To be fair, it has got much better in the past few years, but it's not perfect.

That last reason may now be causing problems for us and it's a good reason to go through the codebase and move most definitions to the source file.

Our codebase is huge. Is there an automated tool that can do (most of) this for us?

Notes:

  • We use Embarcadero RAD Studio 2010. In other words, the dialect of C++ includes VCL and other extensions, etc.
  • A few headers are standalone, but most are paired with a corresponding .cpp file, as you normally would. Apart from the extension the filename is the same, i.e., if there are methods defined in X.h, they can be moved to X.cpp. This also means the tool doesn't have to handle parsing the whole project - it could probably just parse individual pairs of .cpp/.h files, ignore the includes, etc, so long as it could reliably recognise a method with a body defined in a class declaration and move it.
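To make the requested transformation concrete, here is a minimal before/after sketch. The class and file names (`CounterBefore`, `CounterAfter`, `Counter.h`, `Counter.cpp`) are hypothetical and not taken from the codebase described above; both versions are shown in one listing only so the example compiles on its own.

```cpp
// ---- Before: Counter.h, the method body lives inside the class ----
class CounterBefore {
public:
    CounterBefore() : count_(0) {}
    // Defined inside the class body, so it is implicitly inline.
    int Increment() { return ++count_; }
private:
    int count_;
};

// ---- After: Counter.h keeps only the declaration ----
class CounterAfter {
public:
    CounterAfter() : count_(0) {}   // trivial constructor left in the header
    int Increment();                // declaration only
private:
    int count_;
};

// ---- After: Counter.cpp receives the body, qualified with the class name ----
int CounterAfter::Increment() { return ++count_; }
```

The behaviour is unchanged; a tool would have to add the `CounterAfter::` qualifier and strip anything that is only valid inside the class (for example `virtual`, `static`, or default arguments) when it moves the body.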
Beaner answered 13/1, 2012 at 15:0 Comment(12)
#6363495 duplicate?Mccrory
Get an intern to do it.Paramatta
@Zuljin: hmm, possibly... But there are no applicable answers! (VS addins, with no indication of bulk changes on their websites, and the highest-voted thing out of all comments and answers is something saying 'Oh, you can easily create a script...' I don't think 'easy' and 'parse C++' normally go together!)Beaner
@David: This part could actually be easy, since you don't need to parse complex C++ structures, but only need to find class/struct definitions, enter them, and check the functions (word name(*){ should be a good start for a regex). This should be considerably easier than full C++ parsing, just a little automaton.Her
You think code from 1995 is old. That's cute :-)Bernat
This actually sounds like a feasible side project to implement directly. You could write the program to go through and move those functions (provided your files all follow the naming conventions). It sounds like a fun little program.Kept
@TheBuzzSaw: It will not be fun to make that idea work without very strong C++ manipulation technology. It will be hard to do even with good technology.Dionedionis
@IraBaxter Nah, you're blowing it out of proportion. If all you want is to move inline functions from a header to a cpp file, you just need a few string manipulation basics. You don't need to lug in the entire LLVM suite.Kept
@TheBuzzSaw: I've built tools that really transform big C++ systems reliably. They're hard. If you think you can do this reliably with just string hacking (e.g., a Perl script), be my guest. If you don't care if the answer is right and you are willing to debug the errors by hand, you can likely make something limp. Given the thousands of files (compilation units?) that OP has, I don't envy the amount of time you are going to spend fixing compilation errors.Dionedionis
@DavidM: Three years have elapsed since you asked. Out of curiosity, what did you actually end up doing?Dionedionis
@IraBaxter Sadly, nothing. No suggestions here were workable, and my then boss was not interested in pursuing anything unless it was easy and quick to implement (i.e., download a tool, run it, done.) I've left the company and don't know the current state of the codebase, but I hope they've done something - manual refactoring if nothing else. In light of the lack of tools shown here, it's what I would have chosen had I had the go-ahead to do anything: to choose the top five classes that had the worst of what was described above and manually fix them. Drudge work but worth it.Beaner
@DavidM: I'm not surprised. I keep collecting war stories about awful messes that people cannot clean up by hand. I love guys like TheBuzzSaw: "you just need a few string manipulation basics". You had the right question: are there automated tools that can do this. Answer is yes, but they're presently not easy to use. Anyway, I see you found your way out of the mess :-}Dionedionis
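For readers weighing the two positions in the comments above, here is a rough sketch of the "little automaton" idea: track brace depth and treat any block that opens at class scope directly after a `)` as a method body. The function name `FindInClassBodies` is hypothetical, and this is an illustration of the string-level approach, not a workable tool. It deliberately ignores comments, string literals, preprocessor macros, templates, nested classes and constructor initializer lists, which is where the difficulty described in the comments comes from.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: returns the text of every brace-delimited block that
// opens at brace depth 1 (inside one class or struct) directly after a ')'.
std::vector<std::string> FindInClassBodies(const std::string& src) {
    std::vector<std::string> bodies;
    int depth = 0;
    std::size_t bodyStart = std::string::npos;
    for (std::size_t i = 0; i < src.size(); ++i) {
        const char c = src[i];
        if (c == '{') {
            if (depth == 1 && bodyStart == std::string::npos) {
                // Look back past whitespace for the ')' that ends a parameter list.
                std::size_t j = i;
                while (j > 0 && std::isspace(static_cast<unsigned char>(src[j - 1]))) {
                    --j;
                }
                if (j > 0 && src[j - 1] == ')') {
                    bodyStart = i;
                }
            }
            ++depth;
        } else if (c == '}') {
            --depth;
            if (depth == 1 && bodyStart != std::string::npos) {
                // Back at class scope: the candidate body is complete.
                bodies.push_back(src.substr(bodyStart, i - bodyStart + 1));
                bodyStart = std::string::npos;
            }
        }
    }
    return bodies;
}
```

On `class A { int f() { return 0; } int g(); };` this finds the body of `f`. It already misses something as ordinary as `int h() const { return 2; }`, because the token before the brace is `const` rather than `)`, and every such special case has to be added by hand.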
7

You might try Lazy C++. I have not used it, but I believe it is a command line tool to do just what you want.

Maladjusted answered 13/1, 2012 at 16:4 Comment(3)
+1: do not know about this tool :)Outsmart
+1, never heard of that before. A quick browse makes it look like a very useful tool for this sort of thing!Beaner
I contacted the author after reading the FAQ. The tool requires writing code in a .lzz file, which Lazy C++ parses and splits to CPP and H files. I think combining our code into single files only to have a tool split them again might be just as much work... :) Thanks for the suggestion though, it's an interesting tool.Beaner
2

If the code is working then I would vote against any major automated rewrite.
Lots of work could be involved fixing it up.

Small iterative improvements over time are a better technique, as you will be able to test each change in isolation (and add unit tests). Anyway, your major complaint about not being able to find the code is not a real problem and is already solved. There are already tools that will index your code base so your editor will jump to the correct function definition without you having to search for it. Take a look at ctags or the equivalent for your editor.

  • Messy

    Subjective

  • Makes it hard to find the implementation of a method (especially searching through a tree of classes for a virtual function, only to find one class had its version declared in the header...)

    There are already tools available for finding the function. ctags will make a file that allows you to jump directly to the function from any decent editor (vim/emacs). I am sure your editor, if not one of these, has an equivalent tool.

  • Probably increases the compiled code size

    Unlikely. The compiler will choose to inline or not based on internal metrics, not whether it is marked inline in the source.

  • Probably causes issues for our linker, which is notoriously flaky for large codebases. To be fair, it has got much better in the past few years, but it's not perfect.

    Unlikely. If your linker is flaky then it is flaky; it is not going to make much difference where the functions are defined, as this has no bearing on whether they are inlined anyway.

Bernat answered 13/1, 2012 at 17:2 Comment(2)
+1 for "Lots of work could be involved fixing it up." I would even say "Lots of work will be involved fixing it up."Indite
Thanks Loki. I don't want to do a 'rewrite' - I want to keep the logical code identical, simply move its location. I know it's not quite as simple as that, but... :) I anticipate perhaps running a tool and manually reviewing its changes, and I know the codebase very well in spite of its size, so I feel pretty confident about that side of it. Re some of the other points, this SO question may interest you.Beaner
1

You have a number of problems to solve:

  • How to regroup the source and header files ideally
  • How to automate the code modifications to carry this out

In both cases, you need a robust C++ parser with full name resolution to determine the dependencies accurately.

Then you need machinery that can reliably modify the C++ source code.

Our DMS Software Reengineering Toolkit with its C++ Front End could be used for this. DMS has been used for large-scale C++ code restructuring; see http://www.semdesigns.com/Company/Publications/ and track down the first paper "Case Study: Re-engineering C++ Component Models Via Automatic Program Transformation". (There's an older version of this paper you can download from there, but the published one is better.) AFAIK, DMS is the only tool to have ever been applied to transforming C++ on a large scale.

This SO discussion on reorganizing code addresses the problem of grouping directly.

Dionedionis answered 13/1, 2012 at 20:43 Comment(0)
1

XE2 includes a new static analyzer. It might be worthwhile to give the trial of the new version of C++Builder a spin.

Ivanaivanah answered 14/1, 2012 at 16:56 Comment(1)
Thanks David! I actually have a copy of C++Builder XE2 Pro, so I'll try it with that. (Even though work has stuck with 2010 - we're waiting for 64-bit to upgrade again - I try to stay up to date, thus my own copy.) I've only tried the refactorings available in 2010 before, and they weren't very reliable.Beaner