What are some examples of non-determinism in the C++ compiler?

I'm looking for examples of code that triggers non-determinism in GCC or Clang's compilation process.

One prominent example is the usage of the __DATE__ macro.

GCC and Clang have a plethora of compiler flags to control the outcome of non-deterministic actions within the compiler, e.g. -frandom-seed and -fno-guess-branch-probability.

Are there any small examples that are affected by these flags?

To be more precise:

$ c++ main.cpp -o main && shasum main
aabbccddee

$ c++ main.cpp -o main && shasum main
eeddccbbaa

I'm looking for macro-free code examples where multiple runs of the compiler lead to different outputs, but where the difference can be fixed by e.g. -frandom-seed.

EDIT:

related: from the gcc docs:

-fno-guess-branch-probability:
Sometimes gcc will opt to use a randomized model to guess branch probabilities, 
when none are available from either profiling feedback (-fprofile-arcs) 
or __builtin_expect. 
This means that different runs of the compiler on the same program
may produce different object code.
The default is -fguess-branch-probability at levels -O, -O2, -O3, -Os. 
Sidell answered 24/10, 2018 at 16:50 Comment(12)
As this isn't relating to code that isn't working and needs to be fixed, it's off-topic here on Stack Overflow. You may want to address this to a forum about compilers (e.g. clang or GCC) to get insight from people who actually build and maintain compilers, and where you can have extended discussion on the subject. – Rootlet
@Rootlet StackOverflow is also about tools used by developers, so this really does qualify. Please reread stackoverflow.com/help/on-topic. – Weismann
@Weismann This is not something you can answer succinctly. It's off-topic and has three votes already. I'm not alone here. I'm not saying it's a bad question, it's just in the wrong place. – Rootlet
@Rootlet I honestly couldn't care less how many close votes this got. This is a very answerable question (what code/options trigger this behaviour) about tools commonly used by developers (compilers). I can't answer it. It seems you can't, but that doesn't make this an off-topic question. – Weismann
@Rootlet The question asks for some small code examples. That would be a succinct answer in my view. – Carlinecarling
@Carlinecarling That's only scratching the surface. To answer this question properly would take substantial effort and dialog. It's a good question, but it's too broad to answer here. Forums allow a lot more back-and-forth and clarification. Stack Overflow isn't supposed to work that way. – Rootlet
AFAICT this question is on-topic per the rules described at @rubenvb's link above. In particular, it's "about a software tool commonly used by programmers", and is "a practical, answerable problem that is unique to software development". – Otis
I believe that your question is monstrously broad. I recommend that you have a look here: gcc.gnu.org/onlinedocs/gcc/… there are some code examples there. Maybe after you look there and to the associated links from there you can come back and ask a far more limited question. – Heard
@Drt None of the paragraphs in your link are remotely related to my question. Could you elaborate why you think my question is broad? - maybe I can clarify my question – Sidell
@Drt questioner is asking for "some examples", not "every possible instance". Surely it's not beyond reason to ask for a few examples of a phenomenon? – Otis
@MarcGlisse added a snippet from the gcc docs – Sidell
@Sidell ah, ok, thanks. – Polysyllabic

While old, this question is interesting for reproducible builds.

As you've stated, there are multiple sources of non-determinism when compiling C/C++ source.

Non-determinism in preprocessor

The preprocessor defines a number of built-in macros whose values can change between builds. There are the obvious __DATE__ and __TIME__, but also the less obvious __cplusplus, __STDC_VERSION__ or __GNUC_PATCHLEVEL__, which can change when an OS update brings in a new compiler version.

There's also __FILE__, which embeds the source path as seen in the build environment, so it differs from machine to machine.

Note that for __DATE__ and __TIME__, GCC honors the SOURCE_DATE_EPOCH environment variable, which overrides the date and time these macros expand to. Other compilers might behave differently.
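
As a rough illustration of the macros mentioned above, the following sketch wraps each one in a small function. The function names are made up for this example. Compiling the same file at a different time, from a different directory, or with a different compiler version or -std setting changes what these functions return, and therefore the bytes that end up in the binary.

```cpp
#include <string>

// Each macro below is expanded by the preprocessor at build time, so the
// returned value is baked into the object file.

// Build date in the form "Mmm dd yyyy".
std::string build_date() { return __DATE__; }

// Build time in the form "hh:mm:ss".
std::string build_time() { return __TIME__; }

// Path of this source file as it was passed to the compiler.
std::string source_path() { return __FILE__; }

// Language standard selected for this translation unit, e.g. by -std=...
long language_version() { return __cplusplus; }
```

Printing these values from main and rebuilding a minute later should be enough to get a different shasum. With GCC, exporting SOURCE_DATE_EPOCH before compiling pins the first two.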

Non-determinism in the compiler

The compiler might use optimization strategies that rely on a non-deterministic approach. You've cited one in GCC, but others might exist. For MSVC, you might be interested in the /Brepro compiler flag.

You'll have to RTFM for your compiler to know more.

Non-determinism in the linker

On some platforms, the linked object and/or library will contain a timestamp. macOS is one of them. So for the same set of .o files, you'll get a different resulting executable each time you link.
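
To see where two links of the same .o files diverge, a byte-level comparison tells you more than a hash does. Here is a minimal sketch; read_bytes and first_difference are hypothetical helper names, not part of any toolchain.

```cpp
#include <cstddef>
#include <fstream>
#include <iterator>
#include <string>

// Read a whole file into memory as raw bytes.
// Returns an empty string if the file cannot be opened.
std::string read_bytes(const std::string& path) {
    std::ifstream in(path, std::ios::binary);
    return std::string(std::istreambuf_iterator<char>(in),
                       std::istreambuf_iterator<char>());
}

// Return the offset of the first differing byte, or -1 if the two buffers
// are identical. A length mismatch counts as a difference at the end of
// the shorter buffer.
long long first_difference(const std::string& a, const std::string& b) {
    const std::size_t common = a.size() < b.size() ? a.size() : b.size();
    for (std::size_t i = 0; i < common; ++i) {
        if (a[i] != b[i]) return static_cast<long long>(i);
    }
    return a.size() == b.size() ? -1 : static_cast<long long>(common);
}
```

Calling first_difference(read_bytes("main1"), read_bytes("main2")) on two outputs of the same link step (the file names are placeholders) gives an offset you can inspect in a hex dump. An embedded timestamp typically shows up as a handful of differing bytes near that offset.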

Also, if you use Link Time Optimization, many compilers will create intermediate versions of the .o files with randomly generated names. Again for GCC, you can use -frandom-seed=31415 to "fix" this randomness, but YMMV.

Non-determinism in the build-process

Sometimes repositories contain additional operations that are performed outside of the compilation stage, like generating header files based on some configuration flags (or other steps). In that case, these project-specific operations might not be deterministic either.

For a good overview of deterministic builds, please refer to this post
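
As a sketch of what such a step can look like, here is a hypothetical header generator (generate_config_header is an invented name). If it iterated the std::unordered_map directly, the #define lines would come out in an unspecified order that can differ between standard library implementations. Sorting the entries first makes the generated text stable.

```cpp
#include <algorithm>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical generator step: turn configuration flags into the text of a
// generated header. The entries are copied into a vector and sorted by name
// so the output does not depend on the hash table's iteration order.
std::string generate_config_header(
    const std::unordered_map<std::string, int>& flags) {
    std::vector<std::pair<std::string, int>> sorted(flags.begin(), flags.end());
    std::sort(sorted.begin(), sorted.end());

    std::string out = "#pragma once\n";
    for (const auto& entry : sorted) {
        out += "#define " + entry.first + " " +
               std::to_string(entry.second) + "\n";
    }
    return out;
}
```

The same idea applies to anything a build script enumerates, such as directory listings or environment variables: impose an explicit order before writing the result to disk.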

Resurge answered 15/3, 2022 at 15:13 Comment(0)
