Comparing polymorphic types in c++20
Asked Answered
H

4

17

I have code that is somewhere between c++17 and c++20. Specifically, we have c++20 enabled on GCC-9 and clang-9, where it is only partially implemented.

In code we have quite big hierarchy of polymorphic types like this:

struct Identifier {
    virtual bool operator==(const Identifier&other) const = 0;
};

struct UserIdentifier : public Identifier {
    int userId =0;
    bool operator==(const Identifier&other) const override {
        const UserIdentifier *otherUser = dynamic_cast<const UserIdentifier*>(&other);
        return otherUser && otherUser->userId == userId;
    }
};

struct MachineIdentifier : public Identifier {
    int machineId =0;
    bool operator==(const Identifier&other) const override {
        const MachineIdentifier *otherMachine = dynamic_cast<const MachineIdentifier*>(&other);
        return otherMachine && otherMachine->machineId == machineId;
    }
};

int main() {
    UserIdentifier user;
    MachineIdentifier machine;
    return user==machine? 1: 0;
}

https://godbolt.org/z/er4fsK

We are now migrating to GCC-10 and clang-10, but because of reasons we still need to work on versions 9 (well, at least clang-9 as this is what android NDK currently has).

The above code stops compiling because new rules about comparison operators are implemented. Reversible operator== causes ambiguities. I can't use a spaceship operator because it is not implemented in versions 9. But I omitted this from the example - I assume that whatever works with == will work with other operators.

So: What is the recommended approach to implementing comparison operators in c++20 with polymorphic types?

Heck answered 26/10, 2020 at 21:3 Comment(7)
Are you sure about all those dynamic_cast usages? It's kinda the complete opposite of polymorphism IMO. Plus, it yields an additional runtime impact. Surely there's a way to avoid RTTI here.Frisco
My concern with this approach is that a == b may not be equivalent to b == a. ExampleNa
I hear you, I don't like this design to much as well. @FrançoisAndrieux every type may compare true only with other of the same type. I think it should be equivalent as long as only leaves of hierarchy have concrete operator implementations and they are all behaving as in example.Heck
@Frisco I am hoping for answers with better design, not only immediate solution (as dfrib kindly given) :-)Heck
@Heck In that case adding final to leaf classes along with a comment to that efcect might help avoid mistakes.Na
Are these really polymorphic? Do you have vectors of Identifiers of which you don't know the concrete type? Or is Identifier just a repository for common code? This seems like an decent application for the curiously recurring template patternSpringlet
Do you have a situation with SuperMachineIdentifier inheriting from MachineIdentifier and adding a new "superId" field, and if so how does comparison work between an instance of each?Krum
P
19

As an intermediate solution you could re-factor your polymorphic equality operator== to a non-virtual operator== defined in the base class, which polymorphically dispatches to a non-operator virtual member function:

struct Identifier {    
    bool operator==(const Identifier& other) const {
        return isEqual(other);
    }
private:
    virtual bool isEqual(const Identifier& other) const = 0;
};

// Note: do not derive this class further (less dyncasts may logically fail).
struct UserIdentifier final : public Identifier {
    int userId = 0;
private:
    virtual bool isEqual(const Identifier& other) const override {
        const UserIdentifier *otherUser = dynamic_cast<const UserIdentifier*>(&other);
        return otherUser && otherUser->userId == userId;
    }
};

// Note: do not derive this class further (less dyncasts may logically fail).
struct MachineIdentifier final : public Identifier {
    int machineId = 0;
private:
    virtual bool isEqual(const Identifier& other) const override {
        const MachineIdentifier *otherMachine = dynamic_cast<const MachineIdentifier*>(&other);
        return otherMachine && otherMachine->machineId == machineId;
    }
};

There will now no longer be an ambiguity as dispatch on the isEqual virtual member function will always be done on the left hand side argument to operator==.

const bool result = (user == machine);  // user.isEqual(machine);
Paul answered 26/10, 2020 at 21:13 Comment(1)
Note that this method could be extended to use double dispatch, which could eliminate the need for dynamic_cast.Amidase
F
1

OK, I see it wasn't mentioned in the answer given by @dfrib, so I'll extend that answer to show it.

You could add an abstract (pure virtual) function in the Identifier structure, which returns its "identity".

Then, in each structure which extends the Identifier structure, you could call that function instead of dynamically casting the input object and check that its type is matching the this object.

Of course, you will have to ensure complete distinguish between the set of identities of each structure. In other words, any two sets of identities must not share any common values (i.e., the two sets must be disjoint).

This will allow you to completely get rid of RTTI, which is pretty much the complete opposite of polymorphism IMO, and also yields an additional runtime impact on top of that.

Here is the extension of that answer:

struct Identifier {    
    bool operator==(const Identifier& other) const {
        return getVal() == other.getVal();
    }
private:
    virtual int getVal() const = 0;
};

struct UserIdentifier : public Identifier {
private:
    int userId = 0;
    virtual int getVal() const override {
        return userId;
    }
};

struct MachineIdentifier : public Identifier {
private:
    int machineId = 100;
    virtual int getVal() const override {
        return machineId;
    }
};

If you want to support a structure with identifiers other some type other than int, then you can extend this solution to use templates.

Alternatively to enforcing a different set of identities for each structure, you could add a type field, and ensure that only this field is unique across different structures.

In essence, those types would be the equivalent of the dynamic_cast check, which compares between the pointer of the input object's V-table and the pointer of the input structure's V-table (hence my opinion of this approach being the complete opposite of polymorphism).

Here is the revised answer:

struct Identifier {    
    bool operator==(const Identifier& other) const {
        return getType() == other.getType() && getVal() == other.getVal();
    }
private:
    virtual int getType() const = 0;
    virtual int getVal() const = 0;
};

struct UserIdentifier : public Identifier {
private:
    int userId = 0;
    virtual int getType() const override {
        return 1;
    virtual int getVal() const override {
        return userId;
    }
};

struct MachineIdentifier : public Identifier {
private:
    int machineId = 0;
    virtual int getType() const override {
        return 2;
    virtual int getVal() const override {
        return machineId;
    }
};
Frisco answered 27/10, 2020 at 9:10 Comment(7)
This will come with the likely unintentional effect that a UserIdentifier will compare equal to a MachineIdentifier if the value of they userId and machineId matches.Paul
@dfrib: That's why I explicitly stated: "Of course, you will have to ensure complete distinguish between the set of identities of each structure. In other words, any two sets of identities must not share any common values (i.e., the two sets must be disjoint).".Frisco
In your example even the default identities are set to the same value, so there is no overlap mechanism in place (which may confuse readers). I think your answer covers a generally good alternative approach, but you might want to split up a the logic for id's and identifier tags, where the latter would uniquely relate to a kind of type, whether the former is just an id that may overlap between different kinds of ids.Paul
@dfrib: Ah yes, that's a copy paste mistake. BTW, you could also add an additional type field, and allow yourself to use the same identities in different structures, so long as the types are unique. In essense, those types would be the equivalent of the dynamic_cast check, which compares the pointer of the input object's V-table and the pointer of the input type's V-table.Frisco
I agree, I might update/expand the answer later to show an alternative to the dyncast using identifier-unique tags (currently in meetings, so for later).Paul
@dfrib: Note that this approach, just like the approach of dynamic_cast, is essentially a "violation" of polymorphism.Frisco
No, you can't "extend this solution to use templates", because virtual functions can't be templates. In order to get a type-specific equality test you have to ensure that both objects are the same type. That is in no way a "violation" of polymorphism.Saltworks
S
1

This does not look like a problem of polymorphism. Actually, I think that there is any polymorphism at all is a symptom of a data model error.

If you have values that identify machines, and values that identify users, and these identifiers are not interchangeable¹, they should not share a supertype. The property of "being an identifier" is a fact about how the type is used in the data model to identify values of another type. A MachineIdentifier is an identifier because it identifies a machine; a UserIdentifier is an identifier because it identifies a user. But an Identifier is in fact not an identifier, because it doesn't identify anything! It is a broken abstraction.

A more intuitive way to put this might be: the type is the only thing that makes an identifier meaningful. You cannot do anything with a bare Identifier, unless you first downcast it to MachineIdentifier or UserIdentifier. So having an Identifier class is most likely wrong, and comparing a MachineIdentifier to a UserIdentifier is a type error that should be detected by the compiler.

It seems to me the most likely reason Identifier exists is because someone realized that there was common code between MachineIdentifier and UserIdentifier, and leapt to the conclusion that the common behavior should be extracted to an Identifier base type, with the specific types inheriting from it. This is an understandable mistake for anyone who has learned in school that "inheritance enables code reuse" and has not yet realized that there are other kinds of code reuse.

What should they have written instead? How about a template? Template instantiations are not subtypes of the template or of each other. If you have types Machine and User that these identifiers represent, you might try writing a template Identifier struct and specializing it, instead of subclassing it:

template <typename T>
struct Identifier {};

template <>
struct Identifier<User> {
  int userId = 0;
  bool operator==(const Identifier<User> &other) const {
    return other.userId == userId;
  }
};

template <>
struct Identifier<Machine> {
  int machineId = 0;
  bool operator==(const Identifier<Machine> &other) const {
    return other.machineId == machineId;
  }
};

This probably makes the most sense when you can move all the data and behavior into the template and thus not need to specialize. Otherwise, this is not necessarily the best option because you cannot specify that Identifier instantiations must implement operator==. I think there may be a way to achieve that, or something similar, using C++20 concepts, but instead, let's combine templates with inheritance to get some advantages of both:

template <typename Id>
struct Identifier {
  virtual bool operator==(const Id &other) const = 0;
};

struct UserIdentifier : public Identifier<UserIdentifier> {
  int userId = 0;
  bool operator==(const UserIdentifier &other) const override {
    return other.userId == userId;
  }
};

struct MachineIdentifier : public Identifier<MachineIdentifier> {
  int machineId = 0;
  bool operator==(const MachineIdentifier &other) const override {
    return other.machineId == machineId;
  }
};

Now, comparing a MachineIdentifier to a UserIdentifier is a compile time error.

This technique is called the curiously recurring template pattern (also see ). It is somewhat baffling when you first come across it, but what it gives you is the ability to refer to the specific subclass type in the superclass (in this example, as Id). It might also be a good option for you because, compared to most other options, it requires relatively few changes to code that already correctly uses MachineIdentifier and UserIdentifier.


¹ If the identifiers are interchangeable, then most of this answer (and most of the other answers) probably does not apply. But if that is the case, it should also be possible to compare them without downcasting.

Springlet answered 27/10, 2020 at 23:23 Comment(1)
Yes, I agree that your diagnosis about code reuse vs polymorphism is probably spot on. Sadly hierarchy is to big to change right now. Regarding last part of your answer - this seems cool en.wikipedia.org/wiki/Barton%E2%80%93Nackman_trick.Heck
3
0

You don't have any polymorphism in your code. You can force a dynamic binding of the comparison operator function (polymorphism) by using either Identifier pointers or references.

For example, instead of

UserIdentifier user;
MachineIdentifier machine;
return user==machine? 1: 0;

With references you could do:

UserIdentifier user;
MachineIdentifier machine;
Identifier &iUser = user;

return iUser == machine ? 1: 0;

Conversely, you can explicitly call UserIdentifier's comparison operator:

return user.operator==(machine) ? 1: 0;
3d answered 26/10, 2020 at 21:12 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.