How to find C++ spurious copy operations?

Asked 2/1, 2020 at 8:38 Answered 14/1 at 5:33

c++code-analysis static-code-analysis cppcheck

Recently, I had the following

struct data {
  std::vector<int> V;
};

data get_vector(int n)
{
  std::vector<int> V(n,0);
  return {V};
}

The problem with this code is that when the struct is created a copy occurs and the solution is instead to write return {std::move(V)}

Are there linter or code analyzer that would detect such spurious copy operations? Neither cppcheck, cpplint, nor clang-tidy can do it.

EDIT: Several points to make my question clearer:

I know that a copy operation occurred because I used compiler explorer and it shows a call to memcpy.
I could identify that a copy operations occurred by looking at the standard yes. But my initial wrong idea was that the compiler would optimize away this copy. I was wrong.
It is (likely) not a compiler problem since both clang and gcc produce code that produce a memcpy.
The memcpy may be cheap, but I cannot imagine circumstances where copying memory and deleting the original is cheaper than passing a pointer by a std::move.
The adding of the std::move is an elementary operation. I would imagine that a code analyzer would be able to suggest this correction.

Mandamandaean answered 2/1, 2020 at 8:38 Comment(7)

I can't answer whether or not there exists any method/tool for detecting "spurious" copy operations, however, in my honest opinion, I disagree that the copying of the std::vector by any means is not being what it purports to be. Your example shows an explicit copy, and it is only natural, and the correct approach, (again imho) to apply the std::move function as you suggest yourself if a copy is not what you want. Note that some compilers may omit the copying if optimizations flags are turned on, and the vector is unchanged. – Dihybrid 2/1, 2020 at 8:59

I fear there are too much unnecessary copies (which might not be impacting) to make this linter rule usable :-/ (rust uses move by default so requires explicit copy :) ) – Healthful 2/1, 2020 at 9:9

My suggestions for optimize code is basically to disassemble the function you want to optimize and you will discover the extra copy operations – Dime 2/1, 2020 at 9:10

If I understand your problem correctly, you want to detect cases where a copy operation (constructor or assignment operator) is invoked on an object following by its destruction. For custom classes, I can imagine adding some debug flag set when a copy is performed, reset in all other operations, and check in destructor. However, don't know how to do the same for non-custom classes unless you are able to modify their source code. – Azotize 2/1, 2020 at 11:4

The technique I use to find spurious copies is to temporarily make the copy constructor private, and then examine where the compiler balks because of access restrictions. (Same objective can be achieved by tagging the copy constructor as deprecated, for compilers that support such tagging.) – Reade 2/1, 2020 at 14:19

I do not disagree with what the compiler produced (although it was surprising to me). But the addition of the std::move was a simple operation that made the code better and I would have thought that there should be code analyzer for detecting this. – Mandamandaean 2/1, 2020 at 17:29

Not aware of another analyzer tool that already catches this specific case, I might look into writing or extending a clang-tidy "fixit" module to do it. I understand this provides a somewhat clear API to the syntactic and semantic elements, and is intended as a somewhat easy extension point, though it does involve compiling your own clang. – Colorfast 3/1, 2020 at 19:11

I believe you have the correct observation but the wrong interpretation!

The copy will not occur by returning the value, because every normal clever compiler will use (N)RVO in this case. From C++17 this is mandatory, so you can't see any copy by returning a local generated vector from the function.

OK, lets play a bit with std::vector and what will happen during the construction or by filling it step by step.

First of all, lets generate a data type which makes every copy or move visible like this one:

template <typename DATA >
struct VisibleCopy
{
    private:
        DATA data;

    public:
        VisibleCopy( const DATA& data_ ): data{ data_ }
        {
            std::cout << "Construct " << data << std::endl;
        }

        VisibleCopy( const VisibleCopy& other ): data{ other.data }
        {
            std::cout << "Copy " << data << std::endl;
        }

        VisibleCopy( VisibleCopy&& other ) noexcept : data{ std::move(other.data) }
        {
            std::cout << "Move " << data << std::endl;
        }

        VisibleCopy& operator=( const VisibleCopy& other )
        {
            data = other.data;
            std::cout << "copy assign " << data << std::endl;
        }

        VisibleCopy& operator=( VisibleCopy&& other ) noexcept
        {
            data = std::move( other.data );
            std::cout << "move assign " << data << std::endl;
        }

        DATA Get() const { return data; }

};

And now lets start some experiments:

using T = std::vector< VisibleCopy<int> >;

T Get1() 
{   
    std::cout << "Start init" << std::endl;
    std::vector< VisibleCopy<int> > vec{ 1,2,3,4 };
    std::cout << "End init" << std::endl;
    return vec;
}   

T Get2()
{   
    std::cout << "Start init" << std::endl;
    std::vector< VisibleCopy<int> > vec(4,0);
    std::cout << "End init" << std::endl;
    return vec;
}

T Get3()
{
    std::cout << "Start init" << std::endl;
    std::vector< VisibleCopy<int> > vec;
    vec.emplace_back(1);
    vec.emplace_back(2);
    vec.emplace_back(3);
    vec.emplace_back(4);
    std::cout << "End init" << std::endl;

    return vec;
}

T Get4()
{
    std::cout << "Start init" << std::endl;
    std::vector< VisibleCopy<int> > vec;
    vec.reserve(4);
    vec.emplace_back(1);
    vec.emplace_back(2);
    vec.emplace_back(3);
    vec.emplace_back(4);
    std::cout << "End init" << std::endl;

    return vec;
}

int main()
{
    auto vec1 = Get1();
    auto vec2 = Get2();
    auto vec3 = Get3();
    auto vec4 = Get4();

    // All data as expected? Lets check:
    for ( auto& el: vec1 ) { std::cout << el.Get() << std::endl; }
    for ( auto& el: vec2 ) { std::cout << el.Get() << std::endl; }
    for ( auto& el: vec3 ) { std::cout << el.Get() << std::endl; }
    for ( auto& el: vec4 ) { std::cout << el.Get() << std::endl; }
}

What can we observe:

Example 1) We create a vector from a initializer list and maybe we expect that we will see 4 times construct and 4 moves. But we get 4 copies! That sounds a bit mysterious, but the reason is the implementation of initializer list! Simply it is not allowed to move from the list as the iterator from the list is a const T* which makes it impossible to move elements from it. A detailed answer on this topic can be found here: initializer_list and move semantics

Example 2) In this case, we get a initial construction and 4 copies of the value. That is nothing special and is what we can expect.

Example 3) Also here, we the the construction and some moves as expected. With my stl implementation the vector grows by factor 2 every time. So we see a first construct, another one and because the vector resizes from 1 to 2, we see the move of the first element. While adding the 3 one, we see a resize from 2 to 4 which needs a move of the first two elements. All as expected!

Example 4) Now we reserve space and fill later. Now we have no copy and no move anymore!

In all cases, we do not see any move nor copy by returning the vector back to the caller at all! (N)RVO is taking place and no further action is required in this step!

Back to your question:

"How to find C++ spurious copy operations"

As seen above, you may introduce a proxy class in between for debugging purpose.

Making the copy-ctor private may not work in many cases, as you may have some wanted copies and some hidden ones. As above, only the code for example 4 will work with a private copy-ctor! And I can not answer the question, if the example 4 is the fastest one, as we fill peace by peace.

Sorry that I can not offer a general solution for finding "unwanted" copies here. Even if you dig your code for calls of memcpy, you will not find all as also memcpy will be optimized away and you see directly some assembler instructions doing the job without a call to your library memcpy function.

My hint is not to focus on such a minor problem. If you have real performance issues, take a profiler and measure. There are so many potential performance killers, that investing much time on spurious memcpy usage seems not such a worthwhile idea.

Faires answered 4/1, 2020 at 22:52 Comment(2)

My question is kind of academical. Yes, there are plenty of ways to have slow code and this is not an immediate problem for me. However, we can find the memcpy operations by using the compiler explorer. So, there is definitely a way. But it is feasible only for small programs. My point is that there is interest of code that would find suggestions on how to improve code. There are code analyzers that find bugs and memory leaks, why not such problems? – Mandamandaean 5/1, 2020 at 12:43

"code that would find suggestions on how to improve code." That is already done and implemented in the compilers itself.(N)RVO optimization is only a single example and working perfect as shown above. Catching memcpy did not help as you are searching for "unwanted memcpy". "There are code analyzers that find bugs and memory leaks, why not such problems?" Maybe it is not a (common) problem. And much more general tool to find "speed" problems is also already present: profiler! My personal feeling is, that your are searching for a academic thing which is not a problem in real software today. – Faires 5/1, 2020 at 15:17

I know that a copy operation occurred because I used compiler explorer and it shows a call to memcpy.

Did you put your complete application into the compiler explorer, and did you enable optimizations? If not, then what you saw in the compiler explorer might or might not be what is happening with your application.

One issue with the code you posted is that you first create a std::vector, and then copy it into an instance of data. It would be better to initialize data with the vector:

data get_vector(int n)
{
  return {std::vector<int> V(n,0)};
}

Also, if you just give the compiler explorer the definition of data and get_vector(), and nothing else, it has to expect the worse. If you actually give it some source code that uses get_vector(), then look at what assembly is generated for that source code. See this example for what the above modification plus actual usage plus compiler optimizations can cause the compiler to produce.

Tshombe answered 4/1, 2020 at 20:28 Comment(5)

I only put in computer explorer the above code (that has the memcpy) otherwise the question would not make sense. That being said your answer is excellent in showing different ways to produce better code. You provide two ways: Use of static and putting the constructor directly in the output. So, those ways could be suggested by a code analyzer. – Mandamandaean 5/1, 2020 at 12:48

@MathieuDutourSikiric That get_vector() is static does not change the code generated for main which does not call get_vector() at all! The static only makes the assembler output shorter because no code at all is produced for get_vector() (since it cannot be used from other translation units). – Boloney 14/1 at 7:14

Question: My assembler reading is a bit shaky, but I seem to see that the return value for main is simply zero, that is, the compiler is smart enough to see that outcome and simply optimize away all vector access. I am a bit surprised that it still goes through the motion of allocating and de-allocating the vector... – Boloney 14/1 at 7:17

@Peter-ReinstateMonica This is because memory allocation is not deterministic (it might fail), has side effects, and you can replace the default memory allocator with something else. So the compiler might not be able to tell it can elide the memory allocation. I think only Clang does that in some cases. – Tshombe 14/1 at 10:45

@G.Sliepen Well, in the general case that is surely correct; but because the compiler sees the entire program in this particular case, it can tell there is no allocator, no side effects etc. Admittedly it cannot know whether we have more than 11 bytes of memory, so maybe that is the reason (I'm not joking). – Boloney 14/1 at 10:55

Are there linter or code analyzer that would detect such spurious copy operations? Neither cppcheck, cpplint, nor clang-tidy can do it.

Not sure whether it will detect a problem with the code in your post, but if you enable the IntelliSense code linter for C++ in Visual Studio, then the lnt-accidental-copy check can warn you about similar problems.

Hipparch answered 14/1 at 5:33 Comment(0)

Hot tags

Godot Unity Godot Help Programming Godot 4.X GUI GDScript 3D 2D Physics CSharp Godot 3.X VR XR Projects C++

Recommended topics

Hot tags