What does clang++'s option -fno-strict-enums do?

Two months ago, I reported, as a clang++ bug, that the C++ program below sets z to 4294967295 when compiled with clang++ -O2 -fno-strict-enums.

enum e { e1, e2 } e;

long long x, y, z;

char *p;

void f(void) {
    e = (enum e) 4294967295;
    x = (long long) e;
    y = e > e1;
    z = &p[e] - p;
}

My bug report was closed as invalid because the program is undefined. My feeling was that using the option -fno-strict-enums made it defined.

As far as I know, Clang does not have documentation worthy of the name, because it aims at being compatible with GCC with respect to the options it accepts and their meaning. I read GCC's documentation of the option -fno-strict-enums as saying that the program should set the value of z to -1:

-fstrict-enums

Allow the compiler to optimize using the assumption that a value of enumerated type can only be one of the values of the enumeration (as defined in the C++ standard; basically, a value that can be represented in the minimum number of bits needed to represent all the enumerators). This assumption may not be valid if the program uses a cast to convert an arbitrary integer value to the enumerated type.

Note that only the option -fstrict-enums is documented, but it seems clear enough that -fno-strict-enums disables the compiler behavior that -fstrict-enums enables. I cannot file a bug against GCC's documentation, because g++ -O2 -fno-strict-enums does exactly what I understand -fno-strict-enums to mandate: it generates a binary that sets z to -1.

Could anyone tell me what -fno-strict-enums does in Clang (and in GCC if I have misunderstood what it does in GCC), and whether the value of the option has any effect at all anywhere in Clang?

For reference, my bug report is here and the Compiler Explorer link showing what I mean is here. The versions used as reference are Clang 10.0.1 and GCC 10.2 targeting an I32LP64 architecture.
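As an aside on the two results being compared: 4294967295 and -1 are numerically the zero-extension and the sign-extension of the same 32-bit pattern to 64 bits. The sketch below only shows that arithmetic, assuming a 64-bit long long as on the I32LP64 target mentioned above. It is not a claim about what either compiler does internally, and the function names are made up for illustration.

```cpp
#include <cstdint>

// Hypothetical helper: widen a 32-bit pattern to 64 bits, filling with zeros.
constexpr long long zero_extend32(std::uint32_t bits) {
    return static_cast<long long>(bits);
}

// Hypothetical helper: widen a 32-bit pattern to 64 bits, copying the top bit.
// Written with plain arithmetic so that it does not depend on how the
// implementation converts out-of-range values to a signed type.
constexpr long long sign_extend32(std::uint32_t bits) {
    return (bits & 0x80000000u)
        ? static_cast<long long>(bits) - 4294967296LL   // subtract 2^32
        : static_cast<long long>(bits);
}

// The all-ones 32-bit pattern gives the two values discussed in the question.
static_assert(zero_extend32(4294967295u) == 4294967295LL, "zero-extension");
static_assert(sign_extend32(4294967295u) == -1, "sign-extension");
```

The checks run at compile time, so compiling this as a translation unit is enough to confirm them.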

Woodwind answered 7/11, 2020 at 14:40 Comment(12)
@JaMiT I'm sorry, which uninitialized variable? As for the second potential UB, overflow in conversions is implementation-defined, so in (enum e) 4294967295 I expect the implementation-defined behavior to be applied when converting 4294967295 to the underlying type of enum e. The implementation-defined behavior is wrap-around.Woodwind
@JaMiT Variables in namespace scope (like e.g. global variables) without an explicit initializer will be "zero" initialized.Insnare
@JaMiT Well the compiler has to generate code for f that behaves correctly for all initial values of p that make the source code of f defined. If you insist on seeing a calling context for f that does not make the question meaningless, it can be p = malloc(5000000000U); if (!p) abort(); f();Woodwind
@JaMiT If you have an argument explaining that the function f is undefined for all possible calling contexts, that would also answer my question. But I don't think that “p would have to point to an array of 4294967294 elements” is that argument, because p can do just that and the code generated by the compiler has to behave correctly when p does that.Woodwind
@Eljay My question is “what does the clang++ option -fno-strict-enums do? Does it do anything?”. If you are convinced of this, perhaps you know the answer to that question?Woodwind
-fno-strict-enums disables optimizations based on the strict definition of an enum’s value range. But violating ISO 14882 on the valid range of the enum is still undefined behavior.Becalm
I've realized that my earlier comments were just echoes of the noise in the question, so I've retracted (deleted) them. I now see that the question is "what does -fno-strict-enums do in Clang?" and that the rant about the bug report is just background noise that should be ignored.Recency
Your program is undefined because sizeof(e) is not required to be big enough to hold the value you're storing into it. That's why your bug was closed.Myo
@NicolBolas You think that [conv.integral] does not apply to a conversion to an enum type?Woodwind
@PascalCuoq: Enumerations are not integer types. They can be promoted to integer types, but by themselves, they aren't integer types. And yes, the conversion is undefined.Myo
@NicolBolas Thank you for this reference, I am no good at navigating the C++ standard.Woodwind
I erred when I provided an example of context in which the function f would be defined. I expect e to evaluate to -1 in the expression &p[e], and for this reason an example of valid context in which to call f is char c; p = &c + 1; f();.Woodwind

The effect of -fno-strict-enums is to cancel -fstrict-enums. That is, the compiler is not allowed to optimize using the assumption that a value of enumerated type can only be one of the values of the enumeration. I would like to emphasize that the word choice is "allowed", not "required". It can be difficult to see the impact of no longer allowing something that was not done in the first place. Still, I think I've found an example where this can be seen.

First, I would like to clarify "the values of the enumeration" in the context of the question. The enumeration e has two enumerators, with the values 0 and 1. The smallest number of bits required to represent these values is 1. Thus, the values of the enumeration are all values that can be represented by 1 bit. This happens to coincide with the values of the enumerators in this case, but is not guaranteed in other examples.
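The bit-counting rule above can be sketched in code. This is a minimal illustration, assuming an enumeration with no fixed underlying type and only non-negative enumerators; the helper names bits_needed and max_enum_value are made up here and are not part of any compiler or library. It needs C++14 or later because of the loop inside a constexpr function.

```cpp
#include <cstdint>

// Hypothetical helper: smallest number of bits that can represent the
// largest enumerator value (non-negative enumerators only).
constexpr unsigned bits_needed(std::uint64_t max_enumerator) {
    unsigned bits = 0;
    // Stop at 64 so that the shift count always stays below the type width.
    while (bits < 64 && (max_enumerator >> bits) != 0) {
        ++bits;
    }
    return bits;
}

// Hypothetical helper: largest value of the enumeration, i.e. 2^bits - 1.
constexpr std::uint64_t max_enum_value(std::uint64_t max_enumerator) {
    return bits_needed(max_enumerator) == 64
        ? ~std::uint64_t{0}
        : (std::uint64_t{1} << bits_needed(max_enumerator)) - 1;
}

// enum e { e1, e2 }: largest enumerator is 1, so 1 bit and values 0..1.
static_assert(bits_needed(1) == 1, "one bit for enumerators 0 and 1");
static_assert(max_enum_value(1) == 1, "values 0..1");

// Largest enumerator 2: the values of the enumeration are 0..3, which
// includes 3 even though no enumerator has that value.
static_assert(max_enum_value(2) == 3, "values 0..3");
```

The last check is a case where the values of the enumeration and the values of the enumerators do not coincide.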

Next, let's remove one line from the question's code.

enum e { e1, e2 } e;

long long x, y, z;

char *p;

void f(void) {
    //e = (enum e) 4294967295;
    x = (long long) e;
    y = e > e1;
    z = &p[e] - p;
}

The line I removed interferes with the strict-enum flag. That flag allows the compiler to make an assumption that is not necessary when the compiler knows exactly what the value of e is. The compiler can reasonably choose to not assume that e can hold only 0 or 1 when quite clearly it was just given a different value. (This interference is not dependent upon 4294967295 being too large for a 32-bit signed integer, but merely on 4294967295 being a compile-time value. As another example, assigning (enum e) 2 to e would also cause this interference.)

Focus on the assignment y = e > e1. If -fno-strict-enums is in effect, the only optimization available is to replace e1 with 0. However, if we can assume that e can be only 0 or 1 (the values of the enumeration, which happen to also be the values of the enumerators), another optimization becomes available.

If e is 0, the following have the same value:

  • (long long) (e > e1)
  • (long long) (0 > 0)
  • (long long) false
  • (long long) e

If e is 1, the following have the same value:

  • (long long) (e > e1)
  • (long long) (1 > 0)
  • (long long) true
  • (long long) e

In either case, we can skip the comparison and simply cast e to a long long. This is reflected in the assembly generated by clang 10 for the line y = e > e1.

With -fstrict-enums

movq    %rax, y(%rip)

With -fno-strict-enums

xorl    %ecx, %ecx
testl   %eax, %eax
setg    %cl
movq    %rcx, y(%rip)

An optimization has been made with -fstrict-enums that was not allowed with -fno-strict-enums.
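The equivalence behind this optimization can be checked at compile time. This is a minimal sketch that uses plain int stand-ins for the value held by e, an assumption made here so that out-of-range inputs can be examined without forming an invalid enum value. The function names are made up for illustration.

```cpp
// Stand-in for the right-hand side of `y = e > e1`, where e1 is 0.
constexpr long long with_comparison(int e_value) {
    return static_cast<long long>(e_value > 0);
}

// Stand-in for the shortcut: store the value of e directly.
constexpr long long without_comparison(int e_value) {
    return static_cast<long long>(e_value);
}

// For the values of the enumeration (0 and 1) both forms agree, so the
// comparison can be dropped when the compiler may assume that range.
static_assert(with_comparison(0) == without_comparison(0), "e == 0");
static_assert(with_comparison(1) == without_comparison(1), "e == 1");

// Outside that range the two forms differ, so the shortcut is only valid
// under the assumption.
static_assert(with_comparison(2) != without_comparison(2), "1 versus 2");
static_assert(with_comparison(-1) != without_comparison(-1), "0 versus -1");
```

The last two checks show why the shortcut cannot be applied when the compiler is not permitted to assume that e is 0 or 1.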

Recency answered 7/11, 2020 at 16:8 Comment(7)
I understand your answer to be “-fno-strict-enums does not allow programs that do (enum e) 4294967295, it only disables [one] optimization that relies on the program not doing this. The program is still invalid if it does this”, and this may be an accurate description of what is happening in Clang, but this is not how GCC's documentation works for other options such as -fno-strict-aliasing. For that option, again, only -fstrict-aliasing is documented, and again the documentation starts “Allow the compiler to assume the strictest aliasing rules applicable…”, but what this means is…Woodwind
…that, at least in the case of GCC, programs that violate strict aliasing but no other rules should be translated according to the intentions of the programmer. That option would be completely unusable if it just disabled one optimization based on strict aliasing but not others.Woodwind
@PascalCuoq No, that does not look like my answer. I wrote nothing about what programs are allowed. One reason for that is that neither -fno-strict-enums nor -fstrict-enums (nor -fno-strict-aliasing nor any other flag controlling which optimizations are allowed) will cause an invalid program to become valid. An optimization might cause an invalid program to behave as intended, but that is as much a matter of luck as when undefined behavior behaves as intended.Recency
@PascalCuoq You earlier insisted that your question is about what -fno-strict-enums does. Not why your example is invalid, but what -fno-strict-enums does (and I complied by dropping discussion of why your code is or is not invalid). Please try to approach answers with an open mind, dropping your preconceived idea that somehow it relates to the validity of a program.Recency
I can assure you that the secondary explanations of -fno-strict-aliasing (for that option there are plenty, unlike -fno-strict-enums) describe it as an option that changes the dialect accepted by the compiler, making things that are ordinarily UB defined (which a compiler is allowed to do). The developers of GCC have certainly seen these secondary sources and would have had ample time to correct the misunderstanding if it wasn't how one is supposed to understand and use -fno-strict-aliasing. Apart from this, I am approaching answers with an open mind and I am keen to understand.Woodwind
What if I declare enum e { e1 = 100, e2 = 200 };? Will this take 1 bit or 8 bits? Can I safely use -fstrict-enums with this kind of enum?Crispas
@SouravKannanthaB Apply your case to what I wrote: The enumeration e has two enumerators, with the values 100 and 200. The smallest number of bits required to represent these values is 8. Thus, the values of the enumeration are all values that can be represented by 8 bits. (Note: "to represent" the values, not "to count" the values.) As for safety, that depends on the rest of your program. The values of your enumeration are 0 to 255; whether or not your program respects this is not something that can be known from just the enumeration's definition.Recency
