Using volatile to prevent compiler optimization in benchmarking code?

Asked 25/5, 2011 at 19:55 Answered 25/5, 2011 at 20:29

I am creating a little program to measure the performance difference between containers of types boost::shared_ptr and boost::intrusive_ptr. In order to prevent the compiler from optimizing away the copy, I declare the variable as volatile. The loop looks like this:

// TestCopy measures the time required to create n copies of the given container.
// Returns time in milliseconds.
template<class Container>
time_t TestCopy(const Container & inContainer, std::size_t n) {
    Poco::Stopwatch stopwatch;
    stopwatch.start();
    for (std::size_t idx = 0; idx < n; ++idx)
    {
        volatile Container copy = inContainer; // Volatile!
    }

    // convert microseconds to milliseconds
    return static_cast<time_t>(0.5 + (double(stopwatch.elapsed()) / 1000.0));
}

The rest of the code can be found here: main.cpp.

Will using volatile here prevent the compiler from optimizing away the copy?
Are there any pitfalls that may invalidate the results?

Update

In response to @Neil Butterworth. Even when using the copy it still seems to me that the compiler could easily avoid the copy:

for (std::size_t idx = 0; idx < n; ++idx)
{
    // gcc won't remove this copy?
    Container copy = inContainer;
    gNumCopies += copy.size();        
}

Showman answered 25/5, 2011 at 19:55 Comment(9)

Does using -O0 and the -g flags in the compiler not work? I don't think using volatile is the right approach here. – Webbed 25/5, 2011 at 19:56

@RC Instead of profiling with -O0 you can just guess the performance impact instead. The result similarly has no bearing on a realistic scenario. – Osugi 25/5, 2011 at 19:58

@RC I'm not sure. I know that optimizations like RVO will kick in even when using -O0 (see #4768120 ). – Showman 25/5, 2011 at 19:59

If the copy constructor has side-effects, is the compiler even allowed to optimize away the copy? – Malcom 25/5, 2011 at 20:0

@Neil, can you elaborate? Specifically, I don't think either condition listed in 12.8/15 applies to this case. There is no return statement, and there is no temporary object. Further, consider ideone.com/aJBr1 . Even with g++ -O4, the copy constructor is called once per loop. (Aside: Yes, if the Container in question has a copy-constructor with no side-effects it can be optimized away, per 1.9/1 and 1.9/6. But, I'm asking about a copy constructor with side effects.) – Malcom 25/5, 2011 at 20:22

@Rob, the compiler can optimize away anything that it can prove won't have an effect on the result of the program. So if you create and then immediately destroy an object, then in principle, the compiler is absolutely allowed to optimize away the whole thing, provided it can prove there are no lasting side effects. – Trawl 25/5, 2011 at 20:40

@Trawl RVO is even allowed when the copy constructor has major side effects (see #4768120 ) – Showman 25/5, 2011 at 20:42

@Trawl - right, and @Showman - right. But I was specifically asking about the case where Container::Container() has side effects. (And RVO doesn't apply here.) – Malcom 25/5, 2011 at 21:19

@Rob, if it has visible side effects, the compiler needs to retain those effects - but not one iota more. Heck, it could (in principle) precompute everything the copy constructor might do that would have lasting effects, then just put a bit of code to reproduce those effects and nothing else. – Trawl 25/5, 2011 at 21:28

The C++03 standard says that reads and writes to volatile data are observable behavior (C++ 2003, 1.9 [intro.execution] / 6). I believe this guarantees that assignment to volatile data cannot be optimized away. Another kind of observable behavior is calls to I/O functions. The C++11 standard is even clearer in this regard: in 1.9/8 it explicitly says that

The least requirements on a conforming implementation are:
— Access to volatile objects are evaluated strictly according to the rules of the abstract machine.

If a compiler can prove that a piece of code produces no observable behavior, then it can optimize that code away. In your update (where volatile is not used), the copy constructor and the other function calls and overloaded operators might not make any I/O calls or access volatile data, and the compiler might well be able to see that. However, if gNumCopies is a global variable that is later used in an expression with observable behavior (e.g. printed), then this code will not be removed.
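As a rough sketch of that idea, the loop can write something derived from each copy into a volatile object, so that every iteration has an observable effect. The names gSink and CopyAndObserve are made up for illustration and are not from the question's code. Note the limit raised in the comments: this only obliges the compiler to perform the volatile writes, and in principle it could still produce the right values without materializing a full copy.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical sink (not part of the original code). Each write to a
// volatile object is observable behavior, so the writes cannot be dropped.
volatile std::size_t gSink = 0;

// Makes n copies of the container and publishes each copy's size through
// the volatile sink. Returns the sum of the sizes of all copies.
template <class Container>
std::size_t CopyAndObserve(const Container& inContainer, std::size_t n) {
    std::size_t total = 0;
    for (std::size_t idx = 0; idx < n; ++idx) {
        Container copy = inContainer;
        gSink = copy.size();   // volatile write: must happen once per iteration
        total += copy.size();
    }
    return total;
}
```

Calling CopyAndObserve(container, n) inside the timed region and printing the returned total afterwards gives the result a second observable use.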

Kittiekittiwake answered 25/5, 2011 at 20:29 Comment(0)

Why should it? The best solution is to use the container in some way, like by adding its size to a global variable.

Hagen answered 25/5, 2011 at 19:59 Comment(2)

It still seems likely that the compiler could optimize away the copy here. See my edit for a code sample. – Showman 25/5, 2011 at 20:9

OH, OK - then call a function on the copy and the original. – Hagen 25/5, 2011 at 20:28

Volatile is unlikely to do what you expect for a non-POD type. I would recommend passing a char * or void * aliasing the container to an empty function in a different translation unit. Since the compiler is unable to analyze the usage of the pointer, this will act as a compiler memory barrier, forcing the object out to the processor cache at least, and preventing most dead-value-elimination optimizations.
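A single-file approximation of this, for illustration only: instead of a function in a separate translation unit, the sketch below calls a no-op through a volatile function pointer, so the compiler has to reload the pointer on every call and cannot assume which function it reaches. This is a substitute technique, not the exact setup described above, and the names NoOp, gEscape and CopyAndEscape are hypothetical. A real separate translation unit (without link-time optimization) is the more robust choice.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the "empty function in a different
// translation unit".
void NoOp(const void*) {}

// Volatile function pointer: it must be read before each call, and its
// value may not be assumed to still be &NoOp, so the call stays opaque.
void (*volatile gEscape)(const void*) = &NoOp;

// Makes n copies of the container and lets the address of each copy
// escape through the opaque call. Returns the number of copies made.
template <class Container>
std::size_t CopyAndEscape(const Container& inContainer, std::size_t n) {
    std::size_t copies = 0;
    for (std::size_t idx = 0; idx < n; ++idx) {
        Container copy = inContainer;
        gEscape(&copy);   // the callee could inspect the copy, so it must exist
        ++copies;
    }
    return copies;
}
```

Because the callee might read the object through the pointer, the copy has to be in a valid, fully constructed state at the point of the call.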

Trawl answered 25/5, 2011 at 20:0 Comment(0)
