Why does Visual Studio compile this function correctly without optimisation, but incorrectly with optimisation?
Asked Answered
B

2

5

I'm doing some experimenting with y-combinator-like lambda wrapping (although they're not actually strictly-speaking y-combinators, I know), and I've encountered a very odd problem. My code operates exactly as I'd anticipate in a Debug configuration (with Optimizations turned off), but skips large (and important!) bits in Release (with it set to Optimizations (Favor Speed) (/Ox)).

Please note, the insides of the lambda functions are basically irrelevant, they're just to be sure that it can recursion correctly etc.

// main.cpp
#include <iostream>
#include <string>
#define uint unsigned int

// Defines a y-combinator-style thing to do recursive things. Includes a system where the lambda can declare itself to be obsolete.
// Yes, it's hacky and ugly. Don't worry about it, this is all just testing functionality.
template <class F>
class YCombinator {
public:
    F m_f; // the lambda will be stored here
    bool m_selfDestructing = false; //!< Whether the combinator will self-destruct should its lambda mark itself as no longer useful.
    bool m_selfDestructTrigger = false; //!< Whether the combinator's lambda has marked itself as no longer useful.

    // a forwarding operator:
    template <class... Args>
    decltype(auto) evaluate(Args&&... args) {
        // Avoid storing return if we can, 
        if (!m_selfDestructing) {
            // Pass itself to m_f, then the arguments.
            return m_f(*this, std::forward<Args>(args)...);
        }
        else {
            // Pass itself to m_f, then the arguments.
            auto r = m_f(*this, std::forward<Args>(args)...);
            // self-destruct if necessary, allowing lamdas to delete themselves if they know they're no longer useful.
            if (m_selfDestructTrigger) {
                delete this;
            }
            return r;
        }
    }
};
template <class F> YCombinator(F, bool sd)->YCombinator<F>;

// Tests some instances.
int main() {
    // Most basic test
    auto a = YCombinator{
        [](auto & self, uint in)->uint{
            uint out = in;
            for (uint i = 1u; i < in; ++i) {
                out += self.evaluate(i);
            }
            return out;
        },
        false
    };

    // Same as a, but checks it works as a pointer.
    auto b = new YCombinator{
        [](auto & self, uint in)->uint {
            uint out = in;
            for (uint i = 0u; i < in; ++i) {
                out += self.evaluate(i);
            }

            return out;
        },
        false
    };

    // c elided for simplicity

    // Checks the self-deletion mechanism
    auto d = new YCombinator{
        [&a, b](auto & self, uint in)->uint {
            std::cout << "Running d(" << in << ") [SD-" << self.m_selfDestructing << "]..." << std::endl;

            uint outA = a.evaluate(in);
            uint outB = b->evaluate(in);

            if (outA == outB)
                std::cout << "d(" << in << ") [SD-" << self.m_selfDestructing << "] confirmed both a and b produced the same output of " << outA << "." << std::endl;

            self.m_selfDestructTrigger = true;

            return outA;
        },
        true
    };

    uint resultA = a.evaluate(4u);
    std::cout << "Final result: a(4) = " << resultA << "." << std::endl << std::endl;

    uint resultB = (*b).evaluate(5u);
    std::cout << "Final result: b(5) = " << resultB << "." << std::endl << std::endl;

    uint resultD = d->evaluate(2u);
    std::cout << "Final result: d(2) = " << resultD << "." << std::endl << std::endl;

    resultD = d->evaluate(2u);
    std::cout << "Final result: d(2) = " << resultD << "." << std::endl << std::endl;
}

What should happen is that the first evaluation of d works fine, sets d.m_selfDestructTrigger, and causes itself to be deleted. And then the second evaluation of d should crash, because d no longer really exists. Which is exactly what happens in the Debug configuration. (Note: As @largest_prime_is_463035818 points out below, it shouldn't crash so much as encounter undefined behaviour.)

But in the Release configuration, as far as I can tell, all of the code in evaluate gets skipped entirely, and the execution jumps straight to the lambda. Obviously, break-points in optimised code are a little suspect, but that appears to be what's happening. I've tried rebuilding the project, but no dice; VS seems pretty adamant about it.

Am I crazy? Is there something I've missed? Or is this an actual bug in VS (or even the compiler)? Any assistance in determining if this is a code issue or a tool issue would be greatly appreciated.

Note: I'm on VS2019 16.8.3, using the /std:c++ latest featureset.

Brasilein answered 4/2, 2021 at 17:54 Comment(7)
Have you tried it in Godbolt?Expunction
"should crash" ? are you trying to get defined behavior from undefined behavior?Schrader
@largest_prime_is_463035818 Point taken, but, as it happens, that's not the issue. The issue is that the code to delete itself before that point is never called.Brasilein
Expanding on what largest prime said, it seems that this question boils down to "why is the compiler omitting code that cannot possibly ever be run, assuming that the program is defined?"Salzman
the thing is UB can do timetravel. If your code is A;B; and B is undefined, then compiler can do what they like with A tooSchrader
tough, I don't understand your code. My comments were only based on your "should crash, because d no longer really exists" which is a false premise, but I am not sure if that really explains your observations. Afaik nothing in the C++ standard guarantees that a program crashes under any circumstancesSchrader
You may find this unrelated question helpful for understanding how the behavior of seemingly well-defined parts of a program can be affected by undefined behavior later on. Bottom line: UB anywhere is UB everywhere.Salzman
H
2

Problem

The delete this in your code is called in both cases regardless of the optimization options.

        if (m_selfDestructTrigger) {
            delete this;
        }

In your code, the "b" object gets deleted, but then you "evaluate()" it, and it causes the Access Violation, because you are using heap that was already freed.

It gives the Access Violation error in both Release and Debug configurations in my case, but in your case the Access Violation may not happen with optimizations for the following reason.

There may be cases such as in your one, when using freed heap does not cause the error and you have the impression that the program works fine (as with optimizations or in the Release configuration), because the freed heap is not cleared and keeps the old object.

It is not the compiler error but the way how you delete objects.

It is generally a bad style for the objects to delete themselves because you may then reference a deleted object, as was in your case. There is a discussion on whether objects should delete themselves.

If you comment the "delete" line, you code will run without Access Violation. You can use a simpler method to debug an application if you still suspect it might have been a compiler error and "the execution jumps straight to the lambda". This simpler method is to avoid the "delete" and output some text instead from the blocks of code that you suspect are skipped by the compiler.

Solution

You can use another compiler, particularly, clang with sanitizers to make sure that it is not a error of the Microsoft Visual Studio compiler that you are using.

For example, use:

clang++.exe -std=c++20 -fsanitize=address calc.cpp 

and run the resulting executable file.

In this example, your code is compiled with "Address Sanitizer", which is a memory error detector supported by this compiler. Using various sanitizers may help you in debugging your C/C++ programs in future.

You will get an error like this that shows that you are using heap after freeing it:

=================================================================
==48820==ERROR: AddressSanitizer: heap-use-after-free on address 0x119409fa0380 at pc 0x7ff799c91d6c bp 0x004251cff720 sp 0x004251cff768
READ of size 1 at 0x119409fa0380 thread T0
    #0 0x7ff799c91d6b in main+0xd6b (c:\calc\clang\calc.exe+0x140001d6b)
    #1 0x7ff799c917de in main+0x7de (c:\calc\clang\calc.exe+0x1400017de)
    #2 0x7ff799cf799f in __scrt_common_main_seh d:\agent\_work\63\s\src\vctools\crt\vcstartup\src\startup\exe_common.inl:288
    #3 0x7ffe3cff53fd in BaseThreadInitThunk+0x1d (C:\WINDOWS\System32\KERNEL32.DLL+0x1800153fd)
    #4 0x7ffe3ddc590a in RtlUserThreadStart+0x2a (C:\WINDOWS\SYSTEM32\ntdll.dll+0x18006590a)

0x119409fa0380 is located 16 bytes inside of 24-byte region [0x119409fa0370,0x119409fa0388)
freed by thread T0 here:
    #0 0x7ff799cf6684 in operator delete C:\src\llvm_package_6923b0a7\llvm-project\compiler-rt\lib\asan\asan_new_delete.cpp:160
    #1 0x7ff799c91ede in main+0xede (c:\calc\clang\calc.exe+0x140001ede)
    #2 0x7ff799c916e4 in main+0x6e4 (c:\calc\clang\calc.exe+0x1400016e4)
    #3 0x7ff799cf799f in __scrt_common_main_seh d:\agent\_work\63\s\src\vctools\crt\vcstartup\src\startup\exe_common.inl:288
    #4 0x7ffe3cff53fd in BaseThreadInitThunk+0x1d (C:\WINDOWS\System32\KERNEL32.DLL+0x1800153fd)
    #5 0x7ffe3ddc590a in RtlUserThreadStart+0x2a (C:\WINDOWS\SYSTEM32\ntdll.dll+0x18006590a)

Proof

You can also use the following batch file to compare the output of the two version compiled with and without optimization by clang to find out that they produce the same results:

clang++ -std=c++20 -O3 -o calc-O3.exe calc.cpp
clang++ -std=c++20 -O0 -o calc-O0.exe calc.cpp
calc-O3.exe > calc-O3.txt
calc-O0.exe > calc-O0.txt
fc calc-O3.txt calc-O0.txt

It will give the following:

Comparing files calc-O3.txt and calc-O0.txt
FC: no differences encountered

For the Microsoft Visual Studio compiler, use the following batch file:

cl.exe /std:c++latest /O2 /Fe:calc-O3.exe calc.cpp
cl.exe /std:c++latest /Od /Fe:calc-O0.exe calc.cpp
calc-O3.exe > calc-O3.txt
calc-O0.exe > calc-O0.txt
fc calc-O3.txt calc-O0.txt

It also produce identical results, so the code runs the same way regardless of optimization (rather than "all of the code in evaluate gets skipped entirely" as you wrote) - you have probably debugged it incorrectly due to optimizations.

Holdall answered 4/2, 2021 at 18:28 Comment(0)
S
5

Undefined behavior is a non-localized phenomenon. If your program encounters UB, that means the program's behavior in its entirety is undefined, not merely that little part of it that did the bad thing.

As such, it is possible for UB to "time travel", affecting code that theoretically should have executed correctly before doing UB. That is, there is no "correctly" in a program that exhibits UB; either the program is correct, or it is incorrect.

How far that goes depends on the implementation, but as far as the standard is concerned, VS's behavior is consistent with the standard.

Sibel answered 4/2, 2021 at 18:14 Comment(0)
H
2

Problem

The delete this in your code is called in both cases regardless of the optimization options.

        if (m_selfDestructTrigger) {
            delete this;
        }

In your code, the "b" object gets deleted, but then you "evaluate()" it, and it causes the Access Violation, because you are using heap that was already freed.

It gives the Access Violation error in both Release and Debug configurations in my case, but in your case the Access Violation may not happen with optimizations for the following reason.

There may be cases such as in your one, when using freed heap does not cause the error and you have the impression that the program works fine (as with optimizations or in the Release configuration), because the freed heap is not cleared and keeps the old object.

It is not the compiler error but the way how you delete objects.

It is generally a bad style for the objects to delete themselves because you may then reference a deleted object, as was in your case. There is a discussion on whether objects should delete themselves.

If you comment the "delete" line, you code will run without Access Violation. You can use a simpler method to debug an application if you still suspect it might have been a compiler error and "the execution jumps straight to the lambda". This simpler method is to avoid the "delete" and output some text instead from the blocks of code that you suspect are skipped by the compiler.

Solution

You can use another compiler, particularly, clang with sanitizers to make sure that it is not a error of the Microsoft Visual Studio compiler that you are using.

For example, use:

clang++.exe -std=c++20 -fsanitize=address calc.cpp 

and run the resulting executable file.

In this example, your code is compiled with "Address Sanitizer", which is a memory error detector supported by this compiler. Using various sanitizers may help you in debugging your C/C++ programs in future.

You will get an error like this that shows that you are using heap after freeing it:

=================================================================
==48820==ERROR: AddressSanitizer: heap-use-after-free on address 0x119409fa0380 at pc 0x7ff799c91d6c bp 0x004251cff720 sp 0x004251cff768
READ of size 1 at 0x119409fa0380 thread T0
    #0 0x7ff799c91d6b in main+0xd6b (c:\calc\clang\calc.exe+0x140001d6b)
    #1 0x7ff799c917de in main+0x7de (c:\calc\clang\calc.exe+0x1400017de)
    #2 0x7ff799cf799f in __scrt_common_main_seh d:\agent\_work\63\s\src\vctools\crt\vcstartup\src\startup\exe_common.inl:288
    #3 0x7ffe3cff53fd in BaseThreadInitThunk+0x1d (C:\WINDOWS\System32\KERNEL32.DLL+0x1800153fd)
    #4 0x7ffe3ddc590a in RtlUserThreadStart+0x2a (C:\WINDOWS\SYSTEM32\ntdll.dll+0x18006590a)

0x119409fa0380 is located 16 bytes inside of 24-byte region [0x119409fa0370,0x119409fa0388)
freed by thread T0 here:
    #0 0x7ff799cf6684 in operator delete C:\src\llvm_package_6923b0a7\llvm-project\compiler-rt\lib\asan\asan_new_delete.cpp:160
    #1 0x7ff799c91ede in main+0xede (c:\calc\clang\calc.exe+0x140001ede)
    #2 0x7ff799c916e4 in main+0x6e4 (c:\calc\clang\calc.exe+0x1400016e4)
    #3 0x7ff799cf799f in __scrt_common_main_seh d:\agent\_work\63\s\src\vctools\crt\vcstartup\src\startup\exe_common.inl:288
    #4 0x7ffe3cff53fd in BaseThreadInitThunk+0x1d (C:\WINDOWS\System32\KERNEL32.DLL+0x1800153fd)
    #5 0x7ffe3ddc590a in RtlUserThreadStart+0x2a (C:\WINDOWS\SYSTEM32\ntdll.dll+0x18006590a)

Proof

You can also use the following batch file to compare the output of the two version compiled with and without optimization by clang to find out that they produce the same results:

clang++ -std=c++20 -O3 -o calc-O3.exe calc.cpp
clang++ -std=c++20 -O0 -o calc-O0.exe calc.cpp
calc-O3.exe > calc-O3.txt
calc-O0.exe > calc-O0.txt
fc calc-O3.txt calc-O0.txt

It will give the following:

Comparing files calc-O3.txt and calc-O0.txt
FC: no differences encountered

For the Microsoft Visual Studio compiler, use the following batch file:

cl.exe /std:c++latest /O2 /Fe:calc-O3.exe calc.cpp
cl.exe /std:c++latest /Od /Fe:calc-O0.exe calc.cpp
calc-O3.exe > calc-O3.txt
calc-O0.exe > calc-O0.txt
fc calc-O3.txt calc-O0.txt

It also produce identical results, so the code runs the same way regardless of optimization (rather than "all of the code in evaluate gets skipped entirely" as you wrote) - you have probably debugged it incorrectly due to optimizations.

Holdall answered 4/2, 2021 at 18:28 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.