Consider the toy example code below. It has a too-large-to-copy data structure MyDataStructure
and a little iterator class MyDataIterator
that the caller can instantiate to iterate over the data in the structure.
In the toy version of the code, the data structure contains just 10 integers (1000-1009) and the WORKING CODE
section in main()
iterates over them correctly, as expected.
However, the second (BROKEN CODE
) section in main()
doesn't iterate over them correctly; it invokes undefined behavior because it calls GetDataStructureByValue()
instead of GetDataStructureByReference()
, which means that the MyDataStructure
reference that iter
stores in its _ds
private member-variable becomes a dangling reference before the first loop-iteration.
This is a bit of an insidious problem, because in non-toy code it's not always easy to tell/remember from the name of a function whether it is returning by-value or by-reference, but the former case results in a non-obvious runtime bug, while the latter case works fine.
My question is, is there any reasonable way to get a modern C++ compiler to error-out or at least warn about the problematic case? I suspect there isn't, but I'm asking anyway because I'd feel more secure if instances of this problem could be automatically flagged for me somehow.
The source code of the toy program is below, followed by an example of its output on my Mac running XCode:
#include <stdio.h>
class MyDataStructure
{
public:
MyDataStructure()
{
printf("MyDataStructure CTOR %p\n", this);
for (int i=0; i<10; i++) _data[i] = i+1000;
}
~MyDataStructure()
{
printf("MyDataStructure DTOR %p\n", this);
for (int i=0; i<10; i++) _data[i] = -1; // just to make the symptoms more obvious
}
bool IsPositionValid(int pos) const {return ((pos >= 0)&&(pos < 10));}
int GetValueAt(int pos) const {return _data[pos];}
private:
int _data[10];
};
const MyDataStructure & GetDataStructureByReference()
{
static MyDataStructure _ds;
return _ds;
}
MyDataStructure GetDataStructureByValue()
{
MyDataStructure _ds;
return _ds;
}
class MyDataIterator
{
public:
MyDataIterator(const MyDataStructure & ds) : _ds(ds), _idx(0) {/* empty */}
bool HasValue() const {return _ds.IsPositionValid(_idx);}
int GetValue() const {return _ds.GetValueAt(_idx);}
void operator++(int) {_idx++;}
const MyDataStructure & GetDataStructure() const {return _ds;}
private:
const MyDataStructure & _ds;
int _idx;
};
int main(int, char **)
{
{
// The following line of code is okay; it counts from 1000 to 10009, as expected
printf("WORKING CODE:\n");
for (MyDataIterator iter(GetDataStructureByReference()); iter.HasValue(); iter++) printf("iter returned %i from %p\n", iter.GetValue(), &iter.GetDataStructure());
}
{
// The following line of code is buggy because, the MyDataStructure object that (iter) keeps a reference to
// gets destroyed before the first iteration of the loop.
printf("\nBROKEN CODE:\n");
for (MyDataIterator iter(GetDataStructureByValue()); iter.HasValue(); iter++) printf("iter returned %i from %p\n", iter.GetValue(), &iter.GetDataStructure());
}
printf("\nEND OF PROGRAM\n");
return 0;
}
... example output:
$ ./a.out
WORKING CODE:
MyDataStructure CTOR 0x1015c4000
iter returned 1000 from 0x1015c4000
iter returned 1001 from 0x1015c4000
iter returned 1002 from 0x1015c4000
iter returned 1003 from 0x1015c4000
iter returned 1004 from 0x1015c4000
iter returned 1005 from 0x1015c4000
iter returned 1006 from 0x1015c4000
iter returned 1007 from 0x1015c4000
iter returned 1008 from 0x1015c4000
iter returned 1009 from 0x1015c4000
BROKEN CODE:
MyDataStructure CTOR 0x7ff7be93d198
MyDataStructure DTOR 0x7ff7be93d198
iter returned -1 from 0x7ff7be93d198
iter returned -1 from 0x7ff7be93d198
iter returned -1 from 0x7ff7be93d198
iter returned -1 from 0x7ff7be93d198
iter returned -1 from 0x7ff7be93d198
iter returned -1 from 0x7ff7be93d198
iter returned -1 from 0x7ff7be93d198
iter returned -1 from 0x7ff7be93d198
iter returned -1 from 0x7ff7be93d198
iter returned -1 from 0x7ff7be93d198
END OF PROGRAM
MyDataStructure DTOR 0x1015c4000
-Weffc++
what you want? – BalburMyDataStructure
, the obvious approach is to preventMyDataStructure
from being copyable. Before C++11, declare its copy constructor asprivate
and don't define it. After C++11, delete (viaMyDataStructure(const MyDataStructure &) = delete
). Either way, the effect will be a diagnostic when definingGetDataStructureByValue()
from compiling (if it is not afriend
or a member ofMyDataStructure
). – SpinthariscopeMyDataStructure
entirely though -- there are times when it's reasonable to do, my phrasing in the question notwithstanding -- I only want to avoid the overhead of doing so when it isn't necessary (e.g. when iterating over one) – Birdsong