"variable tracking" is eating my compile time!

Asked 2/6, 2010 at 1:35 Answered 7/4, 2023 at 17:54

I have an auto-generated file which looks something like this...

static void do_SomeFunc1(void* parameter)
{
    // Do stuff.
}

// Continues on for another 4000 functions...

void dispatch(int id, void* parameter)
{
    switch(id)
    {
        case ::SomeClass1::id: return do_SomeFunc1(parameter);
        case ::SomeClass2::id: return do_SomeFunc2(parameter);
        // This continues for the next 4000 cases...
    }
}

When I build it like this, the build time is enormous. If I use my script to inline every function into its respective case, the build time is cut in half. With -ftime-report, GCC 4.5.0 says ~50% of the build time is spent in "variable tracking". What does this mean, and how can I speed up compilation while keeping the superior cache locality of pulling the functions out of the switch?

EDIT: Interestingly enough, the build time has exploded only on debug builds, as per the following profiling information of the whole project (which isn't just the file in question, but still a good metric; the file in question takes the most time to build):

Debug: 8 minutes 50 seconds
Release: 4 minutes, 25 seconds

If you're curious, here are a few sample do_ functions with the context removed. As you can see, I simplified the problem definition a bit to show only the relevant parts. In case you're wondering, all the self->func calls are calls to boost::signals.

static void do_Match_Login(Registry* self, const uint8_t* parameters, uint16_t length)
{
    const uint8_t* paramPtr = parameters;

    std::string p0 = extract_string(parameters, &paramPtr, length);
    std::string p1 = extract_string(parameters, &paramPtr, length);
    int32_t p2 = extract_int32(parameters, &paramPtr, length);
    uint32_t p3 = extract_uint32(parameters, &paramPtr, length);
    tuple<Buffer, size_t, size_t> p4 = extract_blob(parameters, &paramPtr, length);

    return self->Match_Login(p0, p1, p2, p3, p4);
}

static void do_Match_ResponseLogin(Registry* self, const uint8_t* parameters, uint16_t length)
{
    const uint8_t* paramPtr = parameters;

    int32_t p0 = extract_int32(parameters, &paramPtr, length);
    std::string p1 = extract_string(parameters, &paramPtr, length);
    array<uint16_t, 3> p2 = extract_vector(parameters, &paramPtr, length);
    std::string p3 = extract_string(parameters, &paramPtr, length);
    uint8_t p4 = extract_uint8(parameters, &paramPtr, length);
    uint8_t p5 = extract_uint8(parameters, &paramPtr, length);
    uint64_t p6 = extract_MUID(parameters, &paramPtr, length);
    bool p7 = extract_bool(parameters, &paramPtr, length);
    tuple<Buffer, size_t, size_t> p8 = extract_blob(parameters, &paramPtr, length);

    return self->Match_ResponseLogin(p0, p1, p2, p3, p4, p5, p6, p7, p8);
}

Meson answered 2/6, 2010 at 1:35 Comment(20)

Looks like a case for Replace Conditional with Polymorphism... – Hartshorn 2/6, 2010 at 1:40

Each do_SomeFuncN is literally 1 < x < 6 lines of code. It's not worth it, especially when this file is auto-generated. – Meson 2/6, 2010 at 1:41

Since when is a short function an excuse for a difficult to maintain design? It might not have to be auto-generated if you didn't need such an insane switch. – Hartshorn 2/6, 2010 at 1:42

It's not difficult to maintain. The problem isn't as simple as I posed it here. The script parses a networking protocol definition, and the dispatch function pipes the extracted parameters into the correct boost::signal. Of course, there are multiple parameter types that need to be handled and verified, so it has to do checking there and make sure it doesn't segfault. – Meson 2/6, 2010 at 1:43

@wowus: Then move it into its own file so that it doesn't have to be compiled all the time. – Hartshorn 2/6, 2010 at 1:44

Dunno if it'll help, but consider an id->func lookup table. – Tahiti 2/6, 2010 at 1:44

@ONeal: I did. That file has a 10 minute build time right now. – Meson 2/6, 2010 at 1:45

@Stephen: Won't that have a big-ish impact on stack usage? – Meson 2/6, 2010 at 1:45

I don't see how. Don't create the table on the stack, define it once (static or heap) and use id to lookup and get func then invoke it. Same number of stack frames as now, much less switch branching. – Tahiti 2/6, 2010 at 1:49

Did I mention IDs aren't necessarily starting at zero and are sparse? No? Damn. Well, yeah. It might as well be random numbers. Good idea though, if I ever have a chance to make a new protocol/modify this existing one, I'll do that for sure. – Meson 2/6, 2010 at 1:52

@wowus can you show us a couple of do_someFunc() s? – Rhodolite 2/6, 2010 at 2:11

I don't see what id being non-consecutive has to do with not being able to put it into a lookup table. That's effectively what the compiler will be doing anyway: generating a jump table from the list of switch cases. – Figure 2/6, 2010 at 2:15

The array will have too many holes to make it memory-conscious. According to my calculations, it will take up about 3MB of space - which would overflow the Windows stack. – Meson 2/6, 2010 at 2:17

@wowus: It's not an array, it'll be a lookup table: hash_map<> or something. As I said, that's basically what the compiler will turn your enormous switch into anyway. – Figure 2/6, 2010 at 2:21

@codeka: It won't - it turns it into a BST according to my disassembly. – Meson 2/6, 2010 at 2:22

@wowus: then std::map<> is the equivalent. Anyway, I'm not even sure manually doing it yourself over letting the compiler do it would even help :) – Figure 2/6, 2010 at 2:25

@codeka: It's definitely not just like a std::map. Just looking at it in IDA makes you understand the huge difference. For example, it's perfectly balanced at compile-time, immutable, and consists of only jumps. Actually, it's really damn cool. I suggest you check it out some time! – Meson 2/6, 2010 at 2:28

Why are all your void functions returning values? – Pacificas 2/6, 2010 at 6:59

If you submitted this to gcc.gnu.org/bugzilla, there is a good chance it will become faster in the next GCC version. – Fail 2/6, 2010 at 8:13

Also see GCC/Make Build Time Optimizations. – Wildermuth 28/5, 2017 at 21:1
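The lookup-table idea raised in the comments above can be sketched for sparse IDs as a sorted table in static storage that is searched with a binary search. This is only an illustration under stated assumptions: the IDs, the handler names, DispatchEntry, kDispatchTable, and dispatch_by_table are all made up here, and the real handlers would be the generated do_SomeFuncN functions. It needs C++11 or later.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <iterator>

// Hypothetical handlers standing in for the generated do_SomeFuncN functions.
// Each one just records which handler ran by writing to the int behind 'parameter'.
static void record(void* parameter, int which)
{
    assert(parameter != nullptr);
    *static_cast<int*>(parameter) = which;
}
static void do_SomeFunc1(void* parameter) { record(parameter, 1); }
static void do_SomeFunc2(void* parameter) { record(parameter, 2); }
static void do_SomeFunc3(void* parameter) { record(parameter, 3); }

struct DispatchEntry {
    std::uint32_t id;        // sparse protocol ID (the values below are invented)
    void (*handler)(void*);  // function to call for that ID
};

// Lives in static storage, not on the stack. It must stay sorted by id;
// a generator script could emit the entries in that order.
static const DispatchEntry kDispatchTable[] = {
    {17u, do_SomeFunc1},
    {4096u, do_SomeFunc2},
    {900001u, do_SomeFunc3},
};

// Returns true if a handler for 'id' was found and invoked.
bool dispatch_by_table(std::uint32_t id, void* parameter)
{
    const DispatchEntry* first = std::begin(kDispatchTable);
    const DispatchEntry* last = std::end(kDispatchTable);
    // Binary search for the first entry whose id is not less than the wanted id.
    const DispatchEntry* it = std::lower_bound(
        first, last, id,
        [](const DispatchEntry& entry, std::uint32_t wanted) { return entry.id < wanted; });
    if (it == last || it->id != id)
        return false;  // unknown ID
    it->handler(parameter);
    return true;
}
```

The table holds one entry per message, so gaps in the ID space cost no memory. Whether this compiles or runs faster than the compiler's own switch lowering is not something this sketch demonstrates; it would have to be measured.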

You can turn off variable tracking. It is the GCC pass that works out where each variable is stored (which register or stack slot) at each point in the code, so that the debug information is more useful. If this code is auto-generated and you're not going to be debugging it much, that extra information isn't worth the compile time, and you can turn the pass off for that file only.

gcc -fno-var-tracking ...

That should do the trick. As I said, I think you can apply it to just that one file.

Figure answered 2/6, 2010 at 2:19 Comment(4)

Bleh, going to have to dive into CMake docs to figure out how - but that's exactly what I'm looking for; thanks! – Meson 2/6, 2010 at 2:21

SET_SOURCE_FILE_PROPERTIES(fileName.cpp COMPILE_FLAGS -fno-var-tracking) – Tubate 2/6, 2010 at 5:41

I think it should be SET_SOURCE_FILES_PROPERTIES. At least the extra "S" was needed for me to get CMake to understand. – Lightless 18/1, 2018 at 11:57

You also need to add PROPERTIES, to get something like this: SET_SOURCE_FILES_PROPERTIES(fileName.cpp PROPERTIES COMPILE_FLAGS -fno-var-tracking) – Lightless 18/1, 2018 at 12:20

In GNU Make, you can turn off variable tracking for a single target, provided your compile command passes a flags variable such as CXXFLAGS in its arguments, like this:

fileName.o: CXXFLAGS += -fno-var-tracking

Epicureanism answered 5/11, 2014 at 16:55 Comment(2)

Does this work stand alone (as shown above), or do you need the full recipe, too (i.e., $CXX $CXXFLAGS ... -c $<)? – Wildermuth 28/5, 2017 at 20:57

See Target-specific Variable Values in the GNU Make Manual. gnu.org/software/make/manual/make.html#Target_002dspecific – Epicureanism 30/5, 2017 at 16:12

Besides the answers telling how to turn off -fvar-tracking at the CMake level and at the g++-command-line level, you can also turn it off per file, by placing this line at the top of the source file:

#pragma GCC optimize("no-var-tracking")

Then, to suppress the bogus warning that Clang gives on that line, you might want to wrap it in a #pragma GCC diagnostic push/pop pair that ignores -Wunknown-pragmas, like this:

#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Wunknown-pragmas"
#pragma GCC optimize("no-var-tracking") // to speed up compilation
#pragma GCC diagnostic pop
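A different way to keep Clang quiet, shown here only as a sketch, is to hide the pragma from every compiler except GCC with a preprocessor guard. Clang also defines __GNUC__, so it has to be excluded explicitly. This assumes, as stated above, that your GCC version accepts no-var-tracking in this pragma. The two handlers and the case values are made up for illustration, and the pragma is placed after the includes (my choice, not something from the answer) so that it only applies to functions defined in this file.

```cpp
#include <cassert>

// Only GCC proper sees the pragma. Clang defines __GNUC__ too, hence the second check.
#if defined(__GNUC__) && !defined(__clang__)
#pragma GCC optimize("no-var-tracking") // to speed up compilation
#endif

// Hypothetical stand-ins for the auto-generated handlers.
// Each one adds a distinct amount to the int behind 'parameter'.
static void do_SomeFunc1(void* parameter)
{
    assert(parameter != nullptr);
    *static_cast<int*>(parameter) += 1;
}

static void do_SomeFunc2(void* parameter)
{
    assert(parameter != nullptr);
    *static_cast<int*>(parameter) += 10;
}

// Returns true if 'id' matched a case and its handler was invoked.
bool dispatch(int id, void* parameter)
{
    switch (id)
    {
        case 1: do_SomeFunc1(parameter); return true;
        case 2: do_SomeFunc2(parameter); return true;
        default: return false;
    }
}
```

With the guard in place, the diagnostic push/pop lines are not needed for Clang, because Clang never sees the pragma at all.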

Loganloganberry answered 7/4, 2023 at 17:54 Comment(0)
