C++ calling completely wrong (virtual) method of an object
Asked Answered
D

4

16

I have some C++ code (written by someone else) which appears to be calling the wrong function. Here's the situation:

UTF8InputStreamFromBuffer* cstream = foo();
wstring fn = L"foo";
DocumentReader* reader;

if (a_condition_true_for_some_files_false_for_others) {
    reader = (DocumentReader*) _new GoodDocumentReader();
} else {
    reader = (DocumentReader*) _new BadDocumentReader();
}

// the crash happens inside the following call
// when a BadDocumentReader is used
doc = reader->readDocument(*cstream, fn);

The files for which the condition is true are processed fine; the ones for which it is false crash. The class hierarchy for DocumentReader looks like this:

class GenericDocumentReader {
    virtual Document* readDocument(InputStream &strm, const wchar_t * filename) = 0;
}

class DocumentReader : public GenericDocumentReader {
    virtual Document* readDocument(InputStream &strm, const wchar_t * filename) {
        // some stuff
    }
};

class GoodDocumentReader : public DocumentReader {
    Document* readDocument(InputStream & strm, const wchar_t * filename);
}

class BadDocumentReader : public DocumentReader {
    virtual Document* readDocument(InputStream &stream, const wchar_t * filename);
    virtual Document* readDocument(const LocatedString *source, const wchar_t * filename);
    virtual Document* readDocument(const LocatedString *source, const wchar_t * filename, Symbol inputType);
}

The following are also relevant:

class UTF8InputStreamFromBuffer : public wistringstream {
    // foo
};
typedef std::basic_istream<wchar_t> InputStream;

Running in a Visual C++ debugger, it shows that the readDocument call on a BadDocumentReader is calling not

readDocument(InputStream&, const wchar_t*)

but rather

readDocument(const LocatedString* source, const wchar_t *, Symbol)

This is confirmed by sticking cout statements in all the readDocuments. After the call the source argument is of course full of garbage, which shortly causes a crash. LocatedString does have a one-argument implicit constructor from InputStream, but checking with a cout shows it's not getting called. Any idea what could explain this?

Edit: other possibly relevant details: The DocumentReader classes are in a different library than the calling code. I also have done a complete rebuild of all the code and the problem remained.

Edit 2: I'm using Visual C++ 2008.

Edit 3: I tried making a "minimally compilable example" with the same behavior, but was unable to replicate the problem.

Edit 4:

At Billy ONeal's suggestion, I tried changing the order of the readDocument methods in the BadDocumentReader header. Sure enough, when I change the order, it changes which of the functions gets called. This seems to me to confirm my suspicion there's something weird going on with indexing into the vtable, but I'm not sure what's causing it.

Edit 5: Here's the disassembly for the few lines before the function call:

00559728  mov         edx,dword ptr [reader] 
0055972E  mov         eax,dword ptr [edx] 
00559730  mov         ecx,dword ptr [reader] 
00559736  mov         edx,dword ptr [eax] 
00559738  call        edx  

I don't know much assembly, but it looks to me like it's dereferencing the reader variable pointer. The first thing stored in this part of memory should be the pointer to the vtable, so it dereferences that into eax. Then it places the first thing in the vtable in edx and calls it. Recompiling with different orders of the methods doesn't seem to change this. It always wants to call the first thing in the vtable. (I could have completely misunderstood this, having no knowledge of assembly at all...)

Thanks for your help.

Edit 6: I found the problem, and I apologize for wasting everyone's time. The problem was that GoodDocumentReader was supposed to be declared as a subclass of DocumentReader, but in fact was not. The C-style casts were suppressing the compiler error (should have listened to you, @sellibitze, If you'd like to submit your comment as an answer, I'll mark it as correct). The tricky thing is that the code had been working for several months by pure accident until a revision when someone added two more virtual functions to GoodDocumentReader so it was no longer calling the right function by luck.

Draught answered 25/1, 2011 at 21:20 Comment(21)
What is a DefaultDocumentReader?Handmade
Could you cut this down to a minimable compilable example that demonstrates the issue?Handmade
Oli: Sorry, that should be "BadDocumentReader". I've updated the post.Draught
Oli: I tried that making a small example with an exactly parallel set of classes, etc. and I couldn't get the crash to happen. I'll note that in the question.Draught
One should never inherit from wistringstream. If you want UTF-8, you should do that with a codecvt facet; that is, after all, what codecvt facets are designed for. boost has one, and I believe MSVC++ comes with one.Gann
You said that the DocumentReader stuff was in a separate library. Are you sure they're compiled with the same settings and such?Gann
Billy: Thanks, that's useful to know. Unfortunately, I this case I didn't write the library, so I'm stuck with it.Draught
@rmg: Sounds like it's time to warm up a club and go to work on the head of the library author :)Gann
Billy: As far as I can tell from looking at the Visual Studio property pages, yes.Draught
@rmg: If you change the order of the items in the class declaration and recompile both your library and the called library, does the function called change? Does the behavior change between debug and release modes?Gann
@billy: The behavior is the same in Release mode. I'll check on changing the order in the class declaration.Draught
Possibly the "_new" allocation thing is broken (since it's not standard new)??Pika
@itjax: the _new is just macro'd to regular new .Draught
@billy: Good idea about switching the order. I'm going to update the post in a moment with the results.Draught
Possibly due to mismatched definition in the two binaries (e.g. one is compiled as debug while the other as retail) and some virutal function in the base class is #ifdef'ed out.Elis
Have you tried commenting out or renaming the readDocument(const LocatedString *source, const wchar_t * filename) and readDocument(const LocatedString *source, const wchar_t * filename, Symbol inputType)? Its possible that these overloads are hiding the readDocument(InputStream &stream, const wchar_t * filename) function.Cryohydrate
Didn't anybody tell you that a C-style casts are evil? If it doesn't compile without it, there is something seriously wrong with the code. It looks like some Java developer tried to write Java in C++.Lathan
@sellibize: I agree C-style casts are evil. But I didn't write the code, and I don't think that's the problem.Draught
Also, just a small point, you don't want to cast your document reader objects to the base type, you need to do this: reader = new GoodDocumentReader()Cryohydrate
@rmg, as @Lathan says, casts are not a good idea. Whether it is an issue or not, I would remove the cast straight away. It should not be needed if the constructed object is of the appropriate pointer type, and as it is, it will perform reinterpret_cast and you will get undefined behavior --that might look like calling the wrong function...Gunzburg
Problem solved (see edit to post). Thanks to everyone who suggested removing the bad casts.Draught
M
3

I would try removing the C-cast first.

  • It is completely unnecessary, casts from a Derived to a Base is natural in the language
  • It may, actually, cause a bug (though it's not supposed to)

It looks like a compiler bug... it would certainly not be the first either in VS.

I unfortunately don't have VS 2008 at hand, in gcc the casts occur correctly:

struct Base1
{
  virtual void foo() {}
};

struct Base2
{
  virtual void bar() {}
};

struct Derived: Base1, Base2
{
};

int main(int argc, char* argv[])
{
  Derived d;
  Base1* b1 = (Base1*) &d;
  Base2* b2 = (Base2*) &d;

  std::cout << "Derived: " << &d << ", Base1: " << b1
                                 << ", Base2: " << b2 << "\n";

  return 0;
}


> Derived: 0x7ffff1377e00, Base1: 0x7ffff1377e00, Base2: 0x7ffff1377e08
Macrobiotics answered 26/1, 2011 at 7:37 Comment(0)
R
15

This is happening because different source files disagree on the vtable layout of the class. The code calling the function thinks that readDocument(InputStream &, const wchar_t *) is at a particular offset, whilst the actual vtable has it at a different offset.

This usually occurs when you change the vtable, such as by adding or removing a virtual method in that class or any of its parent classes, and then you recompile one source file but not another source file. Then, you get incompatible object files, and when you link them, things go boom.

To fix this, do a full clean and rebuild of all of your code: both the library code and your code that uses the library. If you don't have the source code to the library but you do have the header files for it with the class definitions, then that is not an option. In that case, you cannot modify the class definition -- you should revert it to how it was given to you and recompile all of your code.

Raven answered 25/1, 2011 at 23:17 Comment(4)
+1. The .h file defines the layout of the object, including the vtable; if you change it without recompiling the library, everything will be really messed up.Billings
In theory, this is a problem if the sources are compiled with either different compiler, different version or different settings. But in practice, the compilers/options have to be quite different in order for this issue to come about (not to mention that the build system is pretty bad if it does not detect that a recompilation is needed). I think it is a problem with the C-style cast that doesn't shift the pointer value correctly (shift it back from the derived class' vtable to the base class' vtable).Frenulum
@Mikael Persson: I've ran into very similar problems before where the class layout was changed but not every source file that used it was recompiled. The reason that happened was because the compiler flags being used were something like -I- -Isomedir, so when the dependencies were generated with -MM, the headers in somedir were not listed as dependencies.Raven
Came here through Google Search. The problem was as the OP describes, the cause was a Visual Studio "smart" partial rebuild after adding virtual functions. Cleaning the solution and rebuilding fixed it.Screwworm
M
3

I would try removing the C-cast first.

  • It is completely unnecessary, casts from a Derived to a Base is natural in the language
  • It may, actually, cause a bug (though it's not supposed to)

It looks like a compiler bug... it would certainly not be the first either in VS.

I unfortunately don't have VS 2008 at hand, in gcc the casts occur correctly:

struct Base1
{
  virtual void foo() {}
};

struct Base2
{
  virtual void bar() {}
};

struct Derived: Base1, Base2
{
};

int main(int argc, char* argv[])
{
  Derived d;
  Base1* b1 = (Base1*) &d;
  Base2* b2 = (Base2*) &d;

  std::cout << "Derived: " << &d << ", Base1: " << b1
                                 << ", Base2: " << b2 << "\n";

  return 0;
}


> Derived: 0x7ffff1377e00, Base1: 0x7ffff1377e00, Base2: 0x7ffff1377e08
Macrobiotics answered 26/1, 2011 at 7:37 Comment(0)
F
0

Based on the assembly, it seems pretty clear that the binding is dynamic and from the first entry of the vtable. The question is which virtual table!?! I would suggest you use a static_cast instead of a C-style cast (of course, @VJo: dynamic_cast is not needed in this case!). There is nothing in the standard that requires a pointer BadDocumentReader* ptr to have the same actual value (address) as its cast static_cast<DocumentReader*>(ptr). This would explain why it binds the call to the first entry of the vtable of BadDocumentReader and not to the vtable of its base class. And, btw, you shouldn't need a cast at all in this case.

One possibility that doesn't really agree with the asm, but still good to know. Because you create the BadDocumentReader in the same scope as you are calling the reader->readDocument, the compiler gets a little too clever and decides that it can resolve the call without having to look it up in the vtable dynamically. This is because it knows that the "real" type of the reader pointer is actually BadDocumentReader. So it bipasses the vtable and binds the call statically. At least, that is one possibility which I had happen to me in an almost identical situation. Based on the asm, however, I'm pretty sure the first possibility is the one occurring in your case.

Frenulum answered 26/1, 2011 at 1:45 Comment(1)
The C style cast should try static_cast first, so I don't think that's what's going on here.Gann
P
0

I had this issue and the problem for me was that I was storing it in a class member variable. When I changed it to a pointer and involved new/delete, it successfully registered the child class and its function.

Phenacaine answered 24/2, 2017 at 1:12 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.