I have some C++ code (written by someone else) which appears to be calling the wrong function. Here's the situation:
UTF8InputStreamFromBuffer* cstream = foo();
wstring fn = L"foo";
DocumentReader* reader;
if (a_condition_true_for_some_files_false_for_others) {
reader = (DocumentReader*) _new GoodDocumentReader();
} else {
reader = (DocumentReader*) _new BadDocumentReader();
}
// the crash happens inside the following call
// when a BadDocumentReader is used
doc = reader->readDocument(*cstream, fn);
The files for which the condition is true are processed fine; the ones for which it is false crash. The class hierarchy for DocumentReader looks like this:
class GenericDocumentReader {
virtual Document* readDocument(InputStream &strm, const wchar_t * filename) = 0;
}
class DocumentReader : public GenericDocumentReader {
virtual Document* readDocument(InputStream &strm, const wchar_t * filename) {
// some stuff
}
};
class GoodDocumentReader : public DocumentReader {
Document* readDocument(InputStream & strm, const wchar_t * filename);
}
class BadDocumentReader : public DocumentReader {
virtual Document* readDocument(InputStream &stream, const wchar_t * filename);
virtual Document* readDocument(const LocatedString *source, const wchar_t * filename);
virtual Document* readDocument(const LocatedString *source, const wchar_t * filename, Symbol inputType);
}
The following are also relevant:
class UTF8InputStreamFromBuffer : public wistringstream {
// foo
};
typedef std::basic_istream<wchar_t> InputStream;
Running in a Visual C++ debugger, it shows that the readDocument call on a BadDocumentReader is calling not
readDocument(InputStream&, const wchar_t*)
but rather
readDocument(const LocatedString* source, const wchar_t *, Symbol)
This is confirmed by sticking cout statements in all the readDocuments. After the call the source argument is of course full of garbage, which shortly causes a crash. LocatedString does have a one-argument implicit constructor from InputStream, but checking with a cout shows it's not getting called. Any idea what could explain this?
Edit: other possibly relevant details: The DocumentReader classes are in a different library than the calling code. I also have done a complete rebuild of all the code and the problem remained.
Edit 2: I'm using Visual C++ 2008.
Edit 3: I tried making a "minimally compilable example" with the same behavior, but was unable to replicate the problem.
Edit 4:
At Billy ONeal's suggestion, I tried changing the order of the readDocument methods in the BadDocumentReader header. Sure enough, when I change the order, it changes which of the functions gets called. This seems to me to confirm my suspicion there's something weird going on with indexing into the vtable, but I'm not sure what's causing it.
Edit 5: Here's the disassembly for the few lines before the function call:
00559728 mov edx,dword ptr [reader]
0055972E mov eax,dword ptr [edx]
00559730 mov ecx,dword ptr [reader]
00559736 mov edx,dword ptr [eax]
00559738 call edx
I don't know much assembly, but it looks to me like it's dereferencing the reader variable pointer. The first thing stored in this part of memory should be the pointer to the vtable, so it dereferences that into eax. Then it places the first thing in the vtable in edx and calls it. Recompiling with different orders of the methods doesn't seem to change this. It always wants to call the first thing in the vtable. (I could have completely misunderstood this, having no knowledge of assembly at all...)
Thanks for your help.
Edit 6: I found the problem, and I apologize for wasting everyone's time. The problem was that GoodDocumentReader was supposed to be declared as a subclass of DocumentReader, but in fact was not. The C-style casts were suppressing the compiler error (should have listened to you, @sellibitze, If you'd like to submit your comment as an answer, I'll mark it as correct). The tricky thing is that the code had been working for several months by pure accident until a revision when someone added two more virtual functions to GoodDocumentReader so it was no longer calling the right function by luck.
DefaultDocumentReader
? – Handmadewistringstream
. If you want UTF-8, you should do that with a codecvt facet; that is, after all, what codecvt facets are designed for. boost has one, and I believe MSVC++ comes with one. – Gannreinterpret_cast
and you will get undefined behavior --that might look like calling the wrong function... – Gunzburg