Why does this C-style cast not consider static_cast followed by const_cast?

Consider:

float const& f = 5.9e-44f;
int const i = (int&) f;

Per expr.cast/4 this should be considered as, in order:

  • a const_cast,
  • a static_cast,
  • a static_cast followed by a const_cast,
  • a reinterpret_cast, or
  • a reinterpret_cast followed by a const_cast,

Clearly a static_cast<int const&> followed by a const_cast<int&> is viable and will result in an int with value 0. But all compilers instead initialize i to 42, indicating that they took the last option of reinterpret_cast<int const&> followed by const_cast<int&>. Why?
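To make the two candidate results concrete, here is a minimal sketch of both readings of the float. It assumes an IEEE-754 binary32 float (checked by the static_assert); the helper names converted_value, object_bits and check_question_values are made up for illustration, and memcpy is used for the bit view only so that the sketch itself stays clear of aliasing problems.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <limits>

static_assert(std::numeric_limits<float>::is_iec559 &&
              sizeof(float) == sizeof(std::uint32_t),
              "this sketch assumes IEEE-754 binary32 float");

// Value conversion, as the static_cast + const_cast reading would perform it.
inline int converted_value(float const& f) {
    return static_cast<int>(f);  // truncates toward zero
}

// Object representation, read with memcpy to stay clear of aliasing problems.
inline std::uint32_t object_bits(float const& f) {
    std::uint32_t bits;
    std::memcpy(&bits, &f, sizeof bits);
    return bits;
}

inline void check_question_values() {
    float const& f = 5.9e-44f;  // a subnormal: 42 times the smallest positive float
    assert(converted_value(f) == 0);
    assert(object_bits(f) == 42u);
}
```

So 0 is what a value conversion produces, while 42 is simply the bit pattern of the float read as an integer.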

Related: In C++, can a C-style cast invoke a conversion function and then cast away constness?, Why is (int&)0 ill-formed?, Does the C++ specification say how types are chosen in the static_cast/const_cast chain to be used in a C-style cast?, Type punning with (float&)int works, (float const&)int converts like (float)int instead?

Milzie answered 26/3, 2021 at 12:13 Comment(11)
Converting a float to an int is not the same thing as converting a reference to a float into a reference to an int. Whether the reference is const-qualified is immaterial. - Ortensia
const is a red herring and distracts from the real problem. Consider this simplified complete example : godbolt.org/z/oaxz31j99 - Pentothal
I think it has something to do with lvalue reference to non-const not being allowed to be bound to prvalues. - Xylograph
@FrançoisAndrieux const is necessary to allow a static_cast chain to work; you need to construct a temporary. - Milzie
@Milzie I think I understand. You would expect (int&) to reinterpret_cast for a float f; but expect it to static_cast + const_cast for a const float & f because in the second case f is a reference type? - Pentothal
@FrançoisAndrieux actually I'd expect it to static_cast + const_cast in both cases, since there is a viable path (via int const& binding to a temporary int). But in the latter case it should be blatantly obvious to the compiler that the path exists, since it's the exact same sequence of types with the reinterpret_cast. - Milzie
@Xylograph it is allowed, though; see godbolt.org/z/xbfPfMbj3 - Milzie
@Milzie No prvalue is bound to a reference to non-const in your example with two casts. - Xylograph
@Xylograph oh, my bad. So if the static_cast followed by const_cast is valid, why aren't the compilers selecting it? - Milzie
@user17732522 you linked to this question - did you mean to link to a different one? - Milzie
Oops, I meant that this is related, although the answer here is better and probably applies as well: https://mcmap.net/q/512198/-why-is-int-amp-0-ill-formed - Esperanto

tl;dr:

  • const_cast<int&>(static_cast<int const&>(f)) is valid C++
  • (int&)f should have the same result
  • but it doesn't, due to an ancient compiler bug that never got fixed

Long Explanation

1. why const_cast<int&>(static_cast<int const&>(f)) works

1.1 the static_cast

Let's start with the static_cast<int const&>(f):

  • Let's check what the result of that cast would be:
    7.6.1.9 Static cast (emphasis mine)

    (1) The result of the expression static_cast<T>(v) is the result of converting the expression v to type T. If T is an lvalue reference type or an rvalue reference to function type, the result is an lvalue; if T is an rvalue reference to object type, the result is an xvalue; otherwise, the result is a prvalue. The static_cast operator shall not cast away constness (expr.const.cast).

    int const& is an lvalue reference type, so the result of the static_cast<>() must be some sort of lvalue.

  • Then let's find out what conversion actually happens:
    7.6.1.9 Static cast

    (4) An expression E can be explicitly converted to a type T if there is an implicit conversion sequence (over.best.ics) from E to T, [...].
    If T is a reference type, the effect is the same as performing the declaration and initialization
    T t(E);
    for some invented temporary variable t ([dcl.init]) and then using the temporary variable as the result of the conversion.

    • In our case the declaration would look like this:
      const int& t(f);
    • To keep it short I'm not going to walk through the entire conversion process here; you can read the exact details in 12.2.4.2 Implicit conversion sequences.
    • In our case the conversion sequence would consist of 2 steps:
      • convert the glvalue float to a prvalue (this also allows us to get rid of const)
        7.3.2 Lvalue-to-rvalue conversion (emphasis mine)

        (1) A glvalue of a non-function, non-array type T can be converted to a prvalue. If T is an incomplete type, a program that necessitates this conversion is ill-formed. If T is a non-class type, the type of the prvalue is the cv-unqualified version of T. Otherwise, the type of the prvalue is T.

        Given that float is a non-class type, this converts f from an lvalue of type float const to a prvalue of type float (a prvalue, not a float&&).

      • convert from float to int
        7.3.11 Floating-integral conversions

        (1) A prvalue of a floating-point type can be converted to a prvalue of an integer type. The conversion truncates; that is, the fractional part is discarded. The behavior is undefined if the truncated value cannot be represented in the destination type.

        So we end up with a nicely converted int value from f.

  • So the final result of the static_cast<> part is an lvalue int const&.
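The static_cast step can be checked in isolation with a small sketch. The function names converted_copy and check_static_cast_step are made up for illustration; the result is read inside the same full-expression as the cast, so the sketch does not depend on how long the temporary lives afterwards. It assumes C++17 for std::is_same_v.

```cpp
#include <cassert>
#include <type_traits>

inline int converted_copy(float const& f) {
    // static_cast<int const&>(f) behaves like `int const& t(f);`:
    // f is converted to a temporary int and the result is an lvalue naming it.
    static_assert(std::is_same_v<decltype(static_cast<int const&>(f)), int const&>,
                  "the cast expression is an lvalue of type int const");
    return static_cast<int const&>(f);  // read the temporary right away
}

inline void check_static_cast_step() {
    assert(converted_copy(1.2f) == 1);    // fractional part discarded
    assert(converted_copy(-3.9f) == -3);  // truncation is toward zero
}
```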

1.2 the const_cast

Now that we know what the static_cast<> part returns, we can focus on the const_cast<int&>():

  • The result type needs to be:
    7.6.1.11 Const cast (emphasis mine)

    (1) The result of the expression const_cast<T>(v) is of type T. If T is an lvalue reference to object type, the result is an lvalue; if T is an rvalue reference to object type, the result is an xvalue; otherwise, the result is a prvalue and the lvalue-to-rvalue, array-to-pointer, and function-to-pointer standard conversions are performed on the expression v. Conversions that can be performed explicitly using const_cast are listed below. No other conversion shall be performed explicitly using const_cast.

    The target type int& is an lvalue reference to object type, so the result of the const_cast<> is an lvalue. (Its operand, the result of the static_cast<>, is an lvalue as well.)

  • What conversion does the const_cast<> do?
    7.6.1.11 Const cast (emphasis mine)

    (4) For two object types T1 and T2, if a pointer to T1 can be explicitly converted to the type “pointer to T2” using a const_cast, then the following conversions can also be made:
    (4.1) an lvalue of type T1 can be explicitly converted to an lvalue of type T2 using the cast const_cast<T2&>;
    (4.2) a glvalue of type T1 can be explicitly converted to an xvalue of type T2 using the cast const_cast<T2&&>; and
    (4.3) if T1 is a class type, a prvalue of type T1 can be explicitly converted to an xvalue of type T2 using the cast const_cast<T2&&>.

    The result of a reference const_cast refers to the original object if the operand is a glvalue and to the result of applying the temporary materialization conversion otherwise.

    So the const_cast<> will convert the lvalue const int& to an int& lvalue, which will refer to the same object.
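That a reference const_cast names the same object can be seen in a short sketch. The function name const_cast_keeps_identity is made up for illustration, and the underlying object is deliberately a non-const int so that writing through the result is well-defined.

```cpp
#include <cassert>

inline bool const_cast_keeps_identity() {
    int x = 1;                      // the object itself is not const
    int const& cr = x;              // const only on the access path
    int& r = const_cast<int&>(cr);  // same object, const removed
    r = 5;                          // fine: x was never declared const
    assert(&r == &x);
    return x == 5;
}
```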

1.3 conclusion

const_cast<int&>(static_cast<int const&>(f)) is well-formed and will result in an lvalue of type int.

You can even extend the lifetime of the reference as per 6.7.7 Temporary objects

(6) The temporary object to which the reference is bound or the temporary object that is the complete object of a subobject to which the reference is bound persists for the lifetime of the reference if the glvalue to which the reference is bound was obtained through one of the following:
[...]
- (6.6) a
- (6.6.1) const_cast (expr.const.cast),
[...]
converting, without a user-defined conversion, a glvalue operand that is one of these expressions to a glvalue that refers to the object designated by the operand, or to its complete object or a subobject thereof,
[...]

So this would also be legal:

float const& f = 1.2f; 
int& i = const_cast<int&>(static_cast<int const&>(f));

i++; // legal
return i; // legal, result: 2

1.4 notes

  • It is irrelevant in this case that the operand of static_cast<> is a const float reference, since the lvalue-to-rvalue conversion that static_cast is allowed to perform can strip away const.
    So those would also be legal:
    int& i = const_cast<int&>(static_cast<int const&>(1.0f));
    // when converting to rvalue you don't even need a const_cast:
    // (due to 7.6.1.9 (4), because int&& t(1.0f); is well-formed)
    // the result of the static_cast would be an xvalue in this case. 
    int&& ii = static_cast<int&&>(1.0f);
    
  • Because of that the following c-style casts are also well-formed:
    float f = 1.2f;
    int const& i = (int const&)f; // legal, will use static_cast
    int&& ii = (int&&)f; // legal, will use static_cast
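A minimal sketch of what these two C-style casts yield. The function name check_reference_casts_convert is made up for illustration, and each result is copied into a plain int within the same full-expression, so the sketch does not rely on the lifetime of the temporary.

```cpp
#include <cassert>

inline void check_reference_casts_convert() {
    float f = 1.2f;
    // Both casts take the static_cast path: f is converted to a temporary int.
    int a = (int const&)f;  // static_cast<int const&>(f)
    int b = (int&&)f;       // static_cast<int&&>(f)
    assert(a == 1);
    assert(b == 1);
}
```

In both cases the value is converted, not reinterpreted, which is exactly what (int&)f fails to do.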
    

2. why (int&)f doesn't work

You're technically correct in that it should work, because a c-style cast is allowed to perform this conversion sequence:

7.6.3 Explicit type conversion (cast notation)

(4) The conversions performed by
(4.1) a const_cast (expr.const.cast),
(4.2) a static_cast (expr.static.cast),
(4.3) a static_cast followed by a const_cast,
(4.4) a reinterpret_cast (expr.reinterpret.cast), or
(4.5) a reinterpret_cast followed by a const_cast,
can be performed using the cast notation of explicit type conversion. The same semantic restrictions and behaviors apply, [...].

So const_cast<int&>(static_cast<int const&>(f)) should definitely be a valid conversion sequence.

The reason why this doesn't work is actually a very, very old compiler bug.

2.1 It's even a CWG issue on open-std.org (#909):

According to 7.6.3 [expr.cast] paragraph 4, one possible interpretation of an old-style cast is as a static_cast followed by a const_cast. One would therefore expect that the expressions marked #1 and #2 in the following example would have the same validity and meaning:

struct S {
  operator const int* ();
};

void f(S& s)  {
  const_cast<int*>(static_cast<const int*>(s));  // #1
  (int*) s;  // #2
}

However, a number of implementations issue an error on #2.

Is the intent that (T*)x should be interpreted as something like const_cast<T*>(static_cast<const volatile T*>(x))?

The resolution was:

Rationale (July, 2009): According to the straightforward interpretation of the wording, the example should work. This appears to be just a compiler bug.

So the standard agrees with your conclusion; it's just that no compiler actually implements that interpretation.
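Until compilers follow that interpretation, spelling out the named casts (#1 in the issue) is the portable form. The sketch below is an assumption-laden variant of the issue's example: the member value, the const qualifier on the conversion function, and the names named_cast_chain and check_named_cast_chain are additions for illustration, not part of the issue text.

```cpp
#include <cassert>

struct S {
    int value = 7;
    // Declared const here so the sketch also works on const objects;
    // the issue's example declares it without const.
    operator const int*() const { return &value; }
};

inline int* named_cast_chain(S& s) {
    // #1 from the issue: conversion function first, then drop const.
    return const_cast<int*>(static_cast<const int*>(s));
    // #2, `(int*) s`, is the spelling that some implementations reject.
}

inline void check_named_cast_chain() {
    S s;
    int* p = named_cast_chain(s);
    assert(p == &s.value);
    *p = 8;  // fine: s.value is not a const object
    assert(s.value == 8);
}
```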

2.2 Compiler Bug Tickets

There are already open bugs for gcc & clang regarding this issue:

2.3 why isn't this fixed yet after all those years?

I don't know, but given that compiler vendors now have to implement a new standard roughly every 3 years, each with tons of changes to the language, it seems reasonable to deprioritize issues that most programmers will probably never encounter.

Note that this is only a problem for non-class (primitive) types. My guess is that the bug arises because, for those types, the cv-qualifiers can be dropped by a static_cast / reinterpret_cast due to the lvalue-to-rvalue conversion rule:

If T is a non-class type, the type of the prvalue is the cv-unqualified version of T. Otherwise, the type of the prvalue is T.

For class types it works perfectly:

struct B { int i; };
struct D : B {};

D d;
d.i = 12;
B const& ref = d;

// works
D& k = (D&)ref;
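
The same class-type case as a self-contained check. The function name check_class_downcast is made up for illustration, and d is deliberately a non-const object so that writing through k is well-defined.

```cpp
#include <cassert>

struct B { int i; };
struct D : B {};

inline void check_class_downcast() {
    D d;
    d.i = 12;
    B const& ref = d;
    D& k = (D&)ref;    // static_cast<D const&>(ref), then const_cast<D&>
    assert(&k == &d);  // no temporary, no reinterpretation: same object
    assert(k.i == 12);
    k.i = 13;          // fine: d is not const
    assert(d.i == 13);
}
```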

There will always be a few edge cases that are not properly implemented in every compiler. If it bothers you, you can provide a fix, and maybe they'll merge it into the next version (at least for clang & gcc).

2.4 gcc code analysis

In the case of gcc a c-style cast currently gets resolved by cp_build_c_cast:

tree cp_build_c_cast(location_t loc, tree type, tree expr, tsubst_flags_t complain) {
  tree value = expr;
  tree result;
  bool valid_p;
  // [...]
  /* A C-style cast can be a const_cast.  */
  result = build_const_cast_1 (loc, type, value, complain & tf_warning,
                   &valid_p);
  if (valid_p)
    {
      if (result != error_mark_node)
    {
      maybe_warn_about_useless_cast (loc, type, value, complain);
      maybe_warn_about_cast_ignoring_quals (loc, type, complain);
    }
      return result;
    }

  /* Or a static cast.  */
  result = build_static_cast_1 (loc, type, value, /*c_cast_p=*/true,
                &valid_p, complain);
  /* Or a reinterpret_cast.  */
  if (!valid_p)
    result = build_reinterpret_cast_1 (loc, type, value, /*c_cast_p=*/true,
                       &valid_p, complain);
  /* The static_cast or reinterpret_cast may be followed by a
     const_cast.  */
  if (valid_p
      /* A valid cast may result in errors if, for example, a
     conversion to an ambiguous base class is required.  */
      && !error_operand_p (result))
  {
    tree result_type;

    maybe_warn_about_useless_cast (loc, type, value, complain);
    maybe_warn_about_cast_ignoring_quals (loc, type, complain);

    /* Non-class rvalues always have cv-unqualified type.  */
    if (!CLASS_TYPE_P (type))
      type = TYPE_MAIN_VARIANT (type);
    result_type = TREE_TYPE (result);

    if (!CLASS_TYPE_P (result_type) && !TYPE_REF_P (type))
      result_type = TYPE_MAIN_VARIANT (result_type);

    /* If the type of RESULT does not match TYPE, perform a
      const_cast to make it match.  If the static_cast or
      reinterpret_cast succeeded, we will differ by at most
      cv-qualification, so the follow-on const_cast is guaranteed
      to succeed.  */
    if (!same_type_p (non_reference (type), non_reference (result_type)))
    {
      result = build_const_cast_1 (loc, type, result, false, &valid_p);
      gcc_assert (valid_p);
    }

    return result;
  }

  return error_mark_node;
}

The implementation is basically:

  • try a const_cast
  • try a static_cast (while temporarily ignoring potential const mismatches)
  • try a reinterpret_cast (while temporarily ignoring potential const mismatches)
  • if there was a const mismatch in the static_cast or reinterpret_cast variant, slap a const_cast in front of it.

For some reason build_static_cast_1 doesn't succeed in this case, so build_reinterpret_cast_1 gets to do its thing (which results in undefined behaviour due to the strict aliasing rule).

Crowe answered 18/12, 2021 at 0:50 Comment(3)
Amazing. Thank you. - Bonner
Great analysis, thanks! Looking at the code you indicate, I think passing through / acting on c_cast_p should fix my issue and the related CWG 909? Something like: github.com/gcc-mirror/gcc/compare/master...ecatmur:so-66816741 - Milzie
@Milzie You made a fix for it! That's awesome :D I'm unfortunately not very familiar with the gcc code base yet. I compiled your fix & ran the tests, they worked except the constexpr-union.C one - line 16 (reinterpret_cast<> is not allowed in constexpr contexts). But apart from that it looks good :) - Crowe

This might be undefined behavior. But to try and answer the question, as far as I know:

You cast const away, then reinterpret_cast it as an int&. (**)
It's not a static_cast?
It's already a reference to a lvalue that isn't pointer-interconvertible to int&. (*)

The result of that reinterpret_cast(?) would be undefined behavior; it'd violate the strict aliasing rule.

You can check that before trying by using std::is_pointer_interconvertible_base_of_v<>. See: cppreference.com

If we ignore const it still doesn't make sense.
The more I keep reading about this, the less certain I become of anything. This is why we tell you not to use C-style casts.

Notes (*): That's wrong, or is it? More than one way to skin this cast…
(**): It's not that… I don't know what I'm saying there…

Bonner answered 13/12, 2021 at 21:52 Comment(6)
"You cast const away, then reinterpret_cast it as a int&." But according to the C++ Standard, a C-style cast performs a reinterpret_cast followed by a const_cast, not the other way round. And that only if a static_cast followed by a const_cast is not viable; but it is viable in this case, as demonstrated. - Milzie
You can implicitly add const. Removing it must be explicit. [expr.static.cast] - Bonner
Actually, just read the whole chapter [expr.cast] (as I did, 5 times, yesterday). I'm way too tired to read this small font. Of note is "If a conversion can be interpreted in more than one way as a static_cast followed by a const_cast, the conversion is ill-formed." - Bonner
OK, so what's that alternate conversion path? Also, if it were ill-formed (note, not ill-formed NDR) then shouldn't this be rejected? - Milzie
@ecatmur: "But according to the C++ Standard, a C-style cast performs a reinterpret_cast followed by a const_cast, not the other way round.". You just confuse me, does this mean: reinterpret_cast<new-type>(const_cast<new-type>(expression)) or the other way round? - Leeland
@cpper no, the other way round - the inner cast occurs before (is followed by) the outer cast. - Milzie
