How can I get the compiler to warn me about extremely long symbol names?

The following code compiles fine and generates a small compile-time array, even though make_sequence produces a large sequence. The resulting object file is also reasonably sized, and no symbol is emitted for consume_sequence1 because it is consteval.

Note that because sizeof...(I) is used as a template argument in the body, consume_sequence1 cannot take the sequence as a function parameter. Instead, it must take it as a non-type template parameter (NTTP). Also, the production and consumption steps need to be in separate functions because the sequence generated by make_sequence can be used in several (consteval) consumption functions.

#include <array>
#include <utility>
#include <algorithm>

template<auto I>
consteval auto make_sequence(){
    // Complex computation 1
    return std::make_index_sequence<I>{};
}

template<auto seq>
consteval auto consume_sequence1(){
    return []<auto... I>(std::index_sequence<I...>){
        std::array src{I...};
        std::array<typename decltype(src)::value_type, sizeof...(I) / 1000> subseq;
        std::ranges::copy_if(src, subseq.begin(), [](auto v){
            //Complex computation 2
            return v%1000 == 0;
        });
        return subseq;
    }(seq);
}

template<auto I> 
auto calculation1(){
    return consume_sequence1<make_sequence<I>()>();
}

int main(){
    calculation1<10000>();
}

Now let's assume that one (likely accidentally) forgets to add consteval before the consume function, resulting in the following situation.

#include <array>
#include <utility>
#include <algorithm>

template<auto I>
consteval auto make_sequence(){
    // Complex computation 1
    return std::make_index_sequence<I>{};
}

template<auto seq>
auto consume_sequence2(){
    return []<auto... I>(std::index_sequence<I...>){
        std::array src{I...};
        std::array<typename decltype(src)::value_type, sizeof...(I) / 1000> subseq;
        std::ranges::copy_if(src, subseq.begin(), [](auto v){
            //Complex computation 2
            return v%1000 == 0;
        });
        return subseq;
    }(seq);
}

template<auto I> 
auto calculation2(){
    return consume_sequence2<make_sequence<I>()>();
}

int main(){
    calculation2<10000>();
}

The code has the same runtime behavior as before. This time, however, since consume_sequence2 is not marked consteval, the compiler may emit an unusually long symbol to the object file. For example, GCC 14.1 on x86_64 emits a huge symbol starting with _ZZZ17consume_sequence2ITnDaXtlSt16integer_sequenceImJLm0ELm1ELm2ELm3 (demo), along with a few others related to the lambda. This not only makes the object file unnecessarily large but also increases compile time significantly. If you change 10000 to an even greater number, compilation appears to hang, not because of complex compile-time computation but because of the disk I/O spent writing the symbol name.

In my test on Linux, which itself imposes no limit on the length of a single symbol, GCC happily writes out a several-gigabyte object file once I increase 10000 to a large enough number. That file does nothing useful; its size comes solely from the long symbol name, which should have been eliminated.

I would rather avoid all of this from the start. Of course you can say "just be careful and don't forget consteval", but that is not really an answer. So I am looking for a GCC compiler option (I'm interested in Clang and MSVC as well, but I'm afraid that would make the question too broad) that gives a warning or even an error when an unusually long symbol name is about to be emitted to the binary, preferably with a length limit that the user can set. I believe similar situations are not uncommon in metaprogramming.

Manducate answered 3/6 at 5:20 Comment(6)
You are asking "Can I get the compiler". Instead, why not write your own tool to do what you want? A simple nm -D ./prog | awk '{if (length($0) > 200) print "WARN"}'? As for "GCC seems to happily write out a several gigabyte object file": or check the object size, find -name *.o -size +1G -exec echo WARN. - Unapt
That would be after the compilation has been done, which is already too late. In my test case, that would be after the several-gigabyte file has been written, which is not very useful. Or do you mean running the awk as a background process while the object file is being written? As I said, I'd like to prevent it from the beginning. I am aware that there are multiple ways to detect it after it has already occurred. - Manducate
Looks like Clang won't compile this at all, not even with consteval: godbolt.org/z/T9Kaqrsr8. It reports fatal error: instantiating fold expression with 9999 arguments exceeded expression nesting limit of 256. I guess you have to increase -fbracket-depth and then it compiles both versions. - Waal
Good to know. For -fbracket-depth, one may check #73380500. - Manducate
I'm very curious why this question has a close-vote for "opinion-based" :) I can't for the life of me figure it out (for any revision of this question). Perhaps it was a misclick? - Arriviste
@Arriviste Now there are two. I genuinely cannot figure out why and really hope someone can enlighten me. - Manducate
