What is idiomatic modern C++ for algebraic data types?

Asked 29/3, 2016 at 11:49 Answered 29/3, 2016 at 12:25

c++ polymorphism unions variant algebraic-data-types

Suppose, for example, you want to implement a spreadsheet Cell in C++. A cell can be either a string, a number, or perhaps empty. Ignore other cases, like it being a formula.

In Haskell, you might do something like:

data Cell = CellStr String | CellDbl Double | None

What is considered the current "best practice" for doing it in C++? Use a union in a structure with a type indicator, or something else?
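For reference, the "union in a structure with a type indicator" option might look roughly like the sketch below. It is only an illustration: the names Kind, kind(), as_double() and as_string() are hypothetical, and it implements copying only (no move operations). Its main purpose is to show how much manual lifetime management a std::string member inside a union requires.

```cpp
#include <cassert>
#include <new>
#include <string>
#include <utility>

// Hypothetical hand-rolled tagged union: a type indicator plus an anonymous union.
class Cell {
public:
    enum class Kind { None, Str, Dbl };

    Cell() : kind_(Kind::None) {}
    explicit Cell(double d) : kind_(Kind::Dbl), dbl_(d) {}
    explicit Cell(std::string s) : kind_(Kind::Str) {
        // Placement-new starts the lifetime of the string inside the union.
        new (&str_) std::string(std::move(s));
    }
    Cell(const Cell& other) : kind_(Kind::None) { copy_from(other); }
    Cell& operator=(const Cell& other) {
        if (this != &other) {
            destroy();         // leaves *this in the None state
            copy_from(other);  // if this throws, *this stays None (basic guarantee)
        }
        return *this;
    }
    ~Cell() { destroy(); }

    Kind kind() const { return kind_; }
    double as_double() const { assert(kind_ == Kind::Dbl); return dbl_; }
    const std::string& as_string() const { assert(kind_ == Kind::Str); return str_; }

private:
    // Ends the lifetime of the active member, if it needs it, and resets the tag.
    void destroy() {
        if (kind_ == Kind::Str) str_.~basic_string();
        kind_ = Kind::None;
    }
    // Precondition: kind_ == Kind::None (no active non-trivial member).
    void copy_from(const Cell& other) {
        switch (other.kind_) {
            case Kind::Str: new (&str_) std::string(other.str_); break;
            case Kind::Dbl: dbl_ = other.dbl_; break;
            case Kind::None: break;
        }
        kind_ = other.kind_;
    }

    Kind kind_;
    union {
        std::string str_;
        double dbl_;
    };
};
```

Nothing here stops a caller from reading the wrong member except the run-time asserts, which is the main weakness of this approach compared with a variant type.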

Quick answered 29/3, 2016 at 11:49 Comment(9)

One possible option is boost::variant. – Estivation 29/3, 2016 at 11:55

Or implement a specific variant type yourself with a union. – Provision 29/3, 2016 at 12:4

@Estivation make that an answer and I'll +1 it. – Diakinesis 29/3, 2016 at 12:22

I would go with a sorted vector<pair<XYCoords, double>> doubles; and a sorted vector<pair<XYCoords, string>> strings;. For a given cell coordinate you lower_bound into the doubles, if you didn't find it you do the same for the strings, otherwise it is None. Drawing the screen should be very fast, you just iterate through the vectors. Calculations are a bit messy, because they depend on the type, but you can probably abstract that away. Effectively I just cheated and never combined different types into one. Anyway, the question is too broad and opinionated. – Vetavetch 29/3, 2016 at 12:26
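A rough sketch of the two-sorted-vectors idea from the comment above. The XYCoords alias (a pair of ints) and the names Sheet, find_in, CellKind and kind_at are assumptions made for illustration; the comment itself only names the two vectors and lower_bound.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <utility>
#include <vector>

using XYCoords = std::pair<int, int>;  // assumed coordinate type (row, column)

struct Sheet {
    std::vector<std::pair<XYCoords, double>> doubles;       // kept sorted by coordinate
    std::vector<std::pair<XYCoords, std::string>> strings;  // kept sorted by coordinate
};

// Binary-searches one of the sorted vectors.
// Returns a pointer to the stored value, or nullptr if no cell exists at `at`.
template <typename T>
const T* find_in(const std::vector<std::pair<XYCoords, T>>& cells, XYCoords at) {
    // Debug-only precondition check: the vector must be sorted by coordinate.
    assert(std::is_sorted(cells.begin(), cells.end(),
                          [](const std::pair<XYCoords, T>& a,
                             const std::pair<XYCoords, T>& b) { return a.first < b.first; }));
    auto it = std::lower_bound(
        cells.begin(), cells.end(), at,
        [](const std::pair<XYCoords, T>& cell, const XYCoords& key) { return cell.first < key; });
    return (it != cells.end() && it->first == at) ? &it->second : nullptr;
}

enum class CellKind { None, Dbl, Str };

// Looks in the doubles first, then the strings; anything else counts as None.
CellKind kind_at(const Sheet& sheet, XYCoords at) {
    if (find_in(sheet.doubles, at)) return CellKind::Dbl;
    if (find_in(sheet.strings, at)) return CellKind::Str;
    return CellKind::None;
}
```

The different types are never combined into one object here, so no variant is needed, at the cost of one lookup per value type.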

@MvG Unfortunately, the highlighting code for haskell is lang-hs instead of lang-haskell. Keep this in mind the next time you want to add highlighting of Haskell code. – Outermost 29/3, 2016 at 15:7

@Bakuriu: Thanks, and sorry for the mistake. I thought that the fact that it changed highlighting to something other than the C++ default was indication enough that I had it right, although it did look a bit strange. One more reason why having a UI for this would be a good thing… – Pinochle 29/3, 2016 at 15:14

Either tagged unions (possibly templated) or via a polymorphic base class: https://mcmap.net/q/500797/-how-to-check-data-type-in-c for sketches of both approaches. – Addie 29/3, 2016 at 22:57

I can roll out a manual variant for you if you want. Then you don't have to use variant – Fallon 3/4, 2016 at 19:45

Link (as question has been closed): gist.github.com/czipperz/ca36868273d193b48ec7edcc84051e6e – Fallon 3/4, 2016 at 20:22

#include <boost/variant.hpp>
#include <string>
struct empty_type {};
using cell_type = boost::variant<std::string, double, empty_type>;

Then you would do something with the cell with:

boost::apply_visitor(some_visitor(), cell);
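A minimal sketch of what such a visitor might look like. To avoid an external dependency it is written against C++17's std::variant and std::visit, the standard counterparts of boost::variant and boost::apply_visitor. The names describe_visitor and describe are hypothetical stand-ins for some_visitor.

```cpp
#include <cassert>  // assert, for checks like the one in the note below
#include <string>
#include <variant>

struct empty_type {};
using cell_type = std::variant<std::string, double, empty_type>;

// Hypothetical visitor: one overload per alternative.
// Leaving one out is a compile-time error at the std::visit call.
struct describe_visitor {
    std::string operator()(const std::string& s) const { return "string: " + s; }
    std::string operator()(double d) const { return "number: " + std::to_string(d); }
    std::string operator()(empty_type) const { return "empty"; }
};

// Dispatches to the overload matching whatever the cell currently holds.
std::string describe(const cell_type& cell) {
    return std::visit(describe_visitor{}, cell);
}
```

For instance, assert(describe(cell_type{empty_type{}}) == "empty") should hold. With boost::variant the visitor would typically derive from boost::static_visitor<std::string>, and the call would be boost::apply_visitor(describe_visitor(), cell).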

Diakinesis answered 29/3, 2016 at 12:25 Comment(16)

Also note that there is a proposal for standardising std::variant (original proposal here) – Sad 29/3, 2016 at 12:30

@Sad in my view the proposal is flawed since it seeks to mandate allowing a variant to be empty. I sincerely hope it's rejected in favour of one that models the boost variant more closely. – Diakinesis 29/3, 2016 at 12:33

The latest version does not mandate that. To use an empty state, you explicitly add the type monostate to the type list. It is true though that a variant can become invalid (not empty) under exceptional conditions. – Sad 29/3, 2016 at 12:41
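As a sketch of the opt-in empty state described above, using std::variant as it was eventually standardised in C++17: the alias cell_type and the helpers is_empty and clear are hypothetical names for illustration.

```cpp
#include <cassert>
#include <string>
#include <variant>

// Listing std::monostate first makes a default-constructed cell "empty".
using cell_type = std::variant<std::monostate, std::string, double>;

bool is_empty(const cell_type& cell) {
    return std::holds_alternative<std::monostate>(cell);
}

void clear(cell_type& cell) {
    cell = std::monostate{};
    // "Empty" is an ordinary, valid alternative; it is not the invalid state.
    assert(!cell.valueless_by_exception());
}
```

The invalid state is a separate thing: it is reported by valueless_by_exception() and can only arise when an exception is thrown while the variant is changing its held type.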

@Sad it seems to me that they're mixing concerns. optional is one concern, variant is another. If the proposer wants an optional variant, he can use optional<variant<...>>. A variant should never be allowed to be invalid, even after a move - it should simply contain a moved-from T. – Diakinesis 29/3, 2016 at 12:44

"LEWG opted against introducing an explicit additional variant state, representing its invalid (and possibly empty, default constructed) state." – Sad 29/3, 2016 at 12:46

@Sad I read your blog article, and boost::variant does seem to be the best fit to the problem. Compile-time checking is good. I have resisted using Boost in the past. I think Boost will be a difficult sell for an open source project that I am interested in contributing to, though. Hmmm, something for me to think about, though. – Quick 29/3, 2016 at 13:40

@RichardHodges How would you deal with the case of assignments to the variant in the case where the copy constructor for T throws an exception? – Goidelic 29/3, 2016 at 14:38

@CortAmmon don't we already have the copy/swap idiom for types that can throw? – Diakinesis 29/3, 2016 at 14:45

@RichardHodges Yes, using extra allocations or space. From my understanding of variant, much of its value is in its performance because it doesn't need to allocate new memory and doesn't waste any more space than it has to. That's what I thought separated it from a trivial-to-implement visitor pattern. – Goidelic 29/3, 2016 at 14:51

@CortAmmon copy/swap does not need to allocate any space. The copy is an auto variable and the data is moved/swapped. Since c++11 it's extremely efficient. The tiny cost of the redundant move is outweighed by the guarantee of logical correctness baked in at compile time... or have we learned nothing in the last 20 years? – Diakinesis 29/3, 2016 at 15:48

@RichardHodges I think, in the last 20 years, we've learned that some problems are interesting enough to call for new idioms. Copy/swap cannot work here because variant is intended to operate like a union container, not a struct. The variant has no member of type T to swap with. The existing value of the variant must be deconstructed and a new value emplaced in the same memory space (via a copy or move constructor). Thus, by the time the exception is thrown, the old value is destroyed. – Goidelic 29/3, 2016 at 16:8

@RichardHodges Remember that, in principle, even a move constructor might throw, but, otherwise, you can get the strong exception guarantee via void safe_assign(auto& y, auto x) { y = move(x); }. – Sad 29/3, 2016 at 17:9

@CortAmmon I see the problem. how do you swap an X with a Y? This is solvable with 2 temporaries and by making the swap a 2-phase process: copy A to temp1, move B to temp2, destruct B, move-construct B with temp1, delete temp1 and temp2. If B's move-constructor throws, catch and move temp2 back to A. Still no need for a zombie state. – Diakinesis 29/3, 2016 at 19:21

@RichardHodges and if the move of temp2 back to A throws? – Goidelic 29/3, 2016 at 20:7

@CortAmmon fair enough. you've got me :) – Diakinesis 29/3, 2016 at 20:10

+1; it's worth noting that std::variant has been implemented for C++17. – Errand 4/12, 2016 at 9:13

Inheritance?

I have to say that I do not really like this method and would not consider it modern, but it still seems to be standard.

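// Common base class; the virtual destructor allows deletion through a Cell pointer.
class Cell {
    public:
    virtual ~Cell() = default;
};

class DoubleCell : public Cell {
    double value;

    public:
    explicit DoubleCell( double v ) : value(v) {}
    double DoubleValue() const { return value; }
    ...
};

class StringCell : public Cell {
    std::string value;

    public:
    explicit StringCell( std::string v ) : value(std::move(v)) {}
    const std::string& StringValue() const { return value; }
    ...
};

class EmptyCell : public Cell {
    ...
};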

Some of the drawbacks are:

When getting the actual value, you need to use different functions. This will usually involve a run-time type check and a cast (dynamic_cast in C++, which has no instanceof).
Different objects cannot directly be put into a container, only as pointers.
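To make both drawbacks concrete, here is a small sketch that stores cells as smart pointers and uses dynamic_cast to get at the numeric values. The class layout is simplified from the answer (public value members instead of accessors), and sum_numbers is a hypothetical helper.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

struct Cell { virtual ~Cell() = default; };  // assumed polymorphic base

struct DoubleCell : Cell {
    explicit DoubleCell(double v) : value(v) {}
    double value;
};
struct StringCell : Cell {
    explicit StringCell(std::string v) : value(std::move(v)) {}
    std::string value;
};
struct EmptyCell : Cell {};

// Adds up the numeric cells; string and empty cells are skipped.
double sum_numbers(const std::vector<std::unique_ptr<Cell>>& cells) {
    double total = 0.0;
    for (const auto& cell : cells) {
        assert(cell != nullptr);
        // dynamic_cast yields nullptr when the cell is not a DoubleCell.
        if (const auto* d = dynamic_cast<const DoubleCell*>(cell.get())) {
            total += d->value;
        }
    }
    return total;
}
```

Every cell costs a heap allocation, and every access to a value costs a run-time type check.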

Tampon answered 29/3, 2016 at 11:56 Comment(13)

This only partly answers the question. How would you get a value from such a cell ??? getValue(){return value;} – Photocathode 29/3, 2016 at 11:58

I don't think that will work, because you couldn't, for example, have an array of cells. – Quick 29/3, 2016 at 11:59

@blippy: Yes, but you can have an array of (smart) pointers to cells. – Tampon 29/3, 2016 at 12:0

Pointer semantics, dynamic memory allocations and virtual function calls for every single cell don't seem like a good idea to me. – Vetavetch 29/3, 2016 at 12:1

With templates you cannot have an array or vector of cells since they will be separate types. – Pratte 29/3, 2016 at 12:10

Templates won't really work (at least when implemented like in the example) because the type of each cell would need to be known at compile-time. I guess you can combine both approaches by making the template derived from a common base class, but it would have the same performance overhead as the first method then. – Tantalic 29/3, 2016 at 12:15

@NathanOliver: Right, have removed the template example. – Tampon 29/3, 2016 at 12:24

@FrankPuffer I don't want to sound rude but you might want to consider just deleting the whole answer. – Pratte 29/3, 2016 at 12:29

@NathanOliver: Yes I will probably do so, but still it is the only answer so far that does not use additional libraries. That's what made me hesitate so far. – Tampon 29/3, 2016 at 12:37

@FrankPuffer The use of the library is just so they do not have to roll their own. In your example how would you get the value from the cell? Until you get that this really is only half an answer. – Pratte 29/3, 2016 at 12:39

I don't think the answer should be deleted. It is a sensible design solution, and is worthy of discussion, even if it is only to say that there are better solutions. – Quick 29/3, 2016 at 12:42

Relevant article: akrzemi1.wordpress.com/2016/02/27/another-polymorphism – Sad 29/3, 2016 at 12:57

Is it legal to construct a union of base & derived class objects, and use runtime polymorphism on the base class member? You'd get a variant-style object but with the implementation-hiding of normal runtime polymorphism. Sadly I suspect the answer is "no" because having virtual functions means the types aren't standard-layout. – Ultramundane 29/3, 2016 at 16:6
