Why does std::get for variant throw on failure instead of being undefined behaviour?
E

4

26

According to cppreference, std::get for variant throws std::bad_variant_access if the variant does not currently hold the requested type. This means that the standard library has to perform a check on every access (as libc++ does).

What was the rationale for this decision? Why is it not undefined behavior, like everywhere else in C++? Can I work around it?

Exostosis answered 15/2, 2018 at 22:7 Comment(4)
@Justin I do not think it is a true duplicate. There is no answer 'why'. Second of all, there is actually no answer for 'can I work around it'. I am nominating the question for reopening.Karlee
Because that's what std::variant is for: 'a type-safe union'. If you don't want it type-safe, or want UB, don't use it: use a union.Chandigarh
In this thread, some people give some motivation behind why std::variant might not have a std::unchecked_get. I don't know if that's really what was discussed in the standards meetings, but there is logic behind the reasoningTipper
@MarquisofLorne Then half of the STL should be removed too, because there is UB everywhere and you can always implement it yourself.Dronski
E
1

I think I found it!

It seems the reason can be found under "Differences to revision 5" in the proposal:

The Kona compromise: if !v.valid(), make get<...>(v) and visit(v) throw.

Meaning that the variant has to throw in the "valueless_by_exception" state. Using that same check, it can throw on any wrong access.

Even knowing this rationale, I personally would like to avoid this check. The *get_if workaround from Justin's answer seems good enough for me (at least for library code).

Exostosis answered 16/2, 2018 at 23:10 Comment(0)
T
16

The current API for std::variant has no unchecked version of std::get. I don't know why it was standardized that way; anything I say would just be guessing.

However, you can get close to the desired behavior by writing *std::get_if<T>(&variant). If variant doesn't hold T at that time, std::get_if<T> returns nullptr, so dereferencing it is undefined behavior. The compiler can thus assume that the variant holds T.


In practice, this isn't the easiest optimization for the compiler to do. Compared to a simple tagged union, the code it emits may not be as good. The following code:

#include <string>
#include <variant>

int const& get_int(std::variant<int, std::string> const& variant)
{
    return *std::get_if<int>(&variant);
}

Emits this with clang 5.0.0:

get_int(std::variant<int, std::string> const&):
  xor eax, eax
  cmp dword ptr [rdi + 24], 0
  cmove rax, rdi
  ret

It is comparing the variant's index and conditionally moving the return value when the index is correct. Even though it would be UB for the index to be incorrect, clang is currently unable to optimize the comparison away.

Interestingly, returning an int instead of a reference optimizes the check away:

int get_int(std::variant<int, std::string> const& variant)
{
    return *std::get_if<int>(&variant);
}

Emits:

get_int(std::variant<int, std::string> const&):
  mov eax, dword ptr [rdi]
  ret

You could help the compiler by using __builtin_unreachable() or __assume, but gcc is currently the only compiler capable of removing the checks when you do so.
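
A minimal sketch of such a hint, assuming GCC/Clang's __builtin_unreachable() or MSVC's __assume. The names ASSUME and unchecked_get are hypothetical and not part of the standard library, and whether the check actually disappears depends on the compiler and version:

```cpp
#include <cassert>
#include <string>
#include <variant>

// Hypothetical macro: promises the optimizer that cond is true.
// If cond is false at run time, the behavior is undefined.
#if defined(_MSC_VER) && !defined(__clang__)
#  define ASSUME(cond) __assume(cond)
#else
#  define ASSUME(cond) do { if (!(cond)) __builtin_unreachable(); } while (0)
#endif

// Hypothetical unchecked accessor: undefined behavior if v does not hold T.
template <class T, class... Ts>
T const& unchecked_get(std::variant<Ts...> const& v)
{
    assert(std::holds_alternative<T>(v)); // debug-only check, removed by NDEBUG
    ASSUME(std::holds_alternative<T>(v)); // hint that the tag comparison is redundant
    return *std::get_if<T>(&v);           // dereferencing nullptr would be UB anyway
}
```

Called as unchecked_get<int>(v), this returns the stored int when v holds one. With the wrong type it is a contract violation rather than an exception.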

Tipper answered 15/2, 2018 at 23:20 Comment(5)
Are you looking at the other decompiled overload of get_int() in your own example? The variant-based one at disassembly line 1 does emit a tag check. The tagged union one does not. Also note Praetorian had a very good point about non-trivially-constructible types in the comments to the other answer.Fons
@kkm Yes. I put the tagged union there for comparison. It's clear that the tagged union emits better code, but in theory, this std::get_if method should work. The compilers have all the information they need.Tipper
@kkm Are you referring to my second assembly block? That's not in my posted example, but you can easily get it by changing the int const& to int. I'll add a linkTipper
Indeed the compiler does, and I also agree that the compiler assuming some invariants that make the program's behavior not undefined would not be unhelpful (and, as a side note, the general problem of inferring such invariants is likely undecidable :) ). Yes, adding a link would improve the readability and clarity of your answer IMO!Fons
Yes, this works and I like the *get_if trick, thanks. Real shame that we have to do it though.Exostosis
D
9

Why is it not undefined behavior, like everywhere else in C++? Can I work around it?

Yes, there is a direct workaround. If you do not want type safety, use a plain union instead of a std::variant. As it says in the reference you cited:

The class template std::variant represents a type-safe union.

The purpose of union was to have a single object that could take values from one of multiple different types. Only one type of the union was 'valid' at any given time depending on which member variables had been assigned:

union example {
   int i;
   float f;
};

// code block later...
example e;
e.i = 10;
std::cout << e.f << std::endl; // compiles, but reading the inactive member f is undefined behavior

std::variant generalized a union while adding type safety to help make sure you are only accessing the right data type. If you do not want this safety, you can always use a union instead.
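
For comparison, here is a rough sketch of the same mistake made with std::variant. The helper get_throws and the function variant_example are hypothetical names used only for illustration:

```cpp
#include <cassert>
#include <variant>

// Hypothetical helper: returns true when std::get<T> rejects the access by throwing.
template <class T, class... Ts>
bool get_throws(std::variant<Ts...> const& v)
{
    try {
        (void)std::get<T>(v);
        return false;
    } catch (std::bad_variant_access const&) {
        return true;
    }
}

// Mirrors the union example: store an int, then try to read a float.
inline void variant_example()
{
    std::variant<int, float> e = 10;
    assert(!get_throws<int>(e));  // the active alternative is readable
    assert(get_throws<float>(e)); // the wrong alternative throws instead of being UB
}
```

The wrong read is detected at run time, which is exactly the check the question wants to avoid.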

What was the rationale for this decision?

I do not know personally what the rationale was for this decision, but you can always take a look at the papers from the C++ standardization committee to get some insight into the process.

Demars answered 15/2, 2018 at 22:52 Comment(10)
Are you serious? If it were that simple to replace a variant with a union, why would the former exist?Nassi
I thought that was clear -- to create a type-safe version of union. When writing unions in C, I remember always pairing a union object with an int or enum so that I could store information about which type in the union was set. Otherwise, I would risk UB if I read from the wrong data member. Here, std::variant provides the error handling and type recall for a union-type object, so you don't have to implement that yourself (other than checking that the type is correct).Demars
I think there were some other interesting use cases for union, depending on the data types contained, which std::variant still does not handle, but I am not sure those parts about unions in the C standard were kept in the C++ standard.Demars
I do not understand the downvotes. variant is a union plus type safety; naturally then, variant minus type safety is a union. I must upvote this answer, but @DanielDay please fix the language a bit. You mean to say "if you do not want type safety, use plain union instead of variant", but you actually say the reverse in the first sentence.Fons
@kkm If all you care about is storing ints, floats etc. then this is an easy replacement to make. But what if one of the alternatives is std::string, or any other non-trivial type? You end up with something like this. You also lose other nice features like visitation. Yes, you could write all of it yourself, but then that can be the answer to any question that asks for an alternate way of doing something.Nassi
@kkm: It is not that black and white; by switching to union, you give up a lot besides just that type checking. However I agree with Daniel that this is at least a valid workaround if you're really desperate for it.Stirpiculture
@Praetorian: yes, I indeed agree about the non trivially constructible type part. Entirely missed this point, thanks!Fons
@Nassi this is true, although you could get around this limitation with non-trivial types by using a pointer to the non-trivial type. But you would then have to dynamically allocate memory for that type to use it within a union, whereas std::variant can handle non-trivial types without additional allocation. I agree that the functionality is not a one-to-one match, but a union would be a workaround for the OP's original question as stated.Demars
@DanielDay, managing pointers in a variant type is not the same as using types or even smart pointers. Too much non-trivial cleanup would be involved; this is certainly not salvaging the proposed solution. Since the reasoning goes this far, I should be resetting the upvote.Fons
I'm sorry, I must have not been clear enough. It's really hard to do variant with union if you really need tags and stuff (consider, for example, computing the size of the required index type, or visitation). What I meant was: I don't care much for exceptions; for me throw <=> terminate. I'm OK with UB instead of terminate if I messed up. I don't want an extra check when I'm out. I want get<...> with a wrong type to be a contract violation. Consider: vector is a type-safe dynamic array and optional is a type-safe union, yet this doesn't mean that [] out of range or operator* cannot produce UB.Exostosis
D
3

What was the rationale for this decision?

This kind of question is always difficult to answer, but I'll give it a shot.

A lot of the inspiration for the behavior of std::variant came from the behavior of std::optional, as stated in the proposal for std::variant, P0088:

This proposal attempts to apply the lessons learned from optional...

And you can see the parallels between the two types:

  • You're not sure what's currently being held
    • in optional it's either a type or nothing (nullopt_t)
    • in variant it's either one of many types, or nothing (see valueless_by_exception)
  • All functions to operate on the type are marked constexpr
    • This may seem coincidental or just good design practices, but it was very clearly intended that variant follow optional's lead on this (see the linked proposal above)
  • They each provide a way to check for emptiness
    • std::optional has an implicit conversion to bool, or alternatively the has_value function
    • std::variant has valueless_by_exception which tells you if the variant is empty because constructing the active type threw an exception
  • They each provide a way for a throwing and non-throwing access
    • Potentially-throwing access for std::optional is value and it may throw bad_optional_access
    • Potentially-throwing access for std::variant is get and it may throw bad_variant_access
    • Non-throwing (I use the term a bit loosely) access for std::optional is value_or which may return you an alternative (that you pass in) if the optional is empty
    • Non-throwing access for std::variant is get_if which returns a nullptr if the index or type provided is bad.
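
The access parallels in the list above can be sketched roughly as follows. IntOrString, value_throws, get_int_throws, int_or and accessors_example are hypothetical names for this illustration only:

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <variant>

using IntOrString = std::variant<int, std::string>; // hypothetical alias for this sketch

// Throwing access: optional::value() throws std::bad_optional_access when empty.
inline bool value_throws(std::optional<int> const& o)
{
    try { (void)o.value(); return false; }
    catch (std::bad_optional_access const&) { return true; }
}

// Throwing access: std::get<int> throws std::bad_variant_access when int is not active.
inline bool get_int_throws(IntOrString const& v)
{
    try { (void)std::get<int>(v); return false; }
    catch (std::bad_variant_access const&) { return true; }
}

// Non-throwing access: get_if returns nullptr instead of throwing.
inline int int_or(IntOrString const& v, int fallback)
{
    int const* p = std::get_if<int>(&v);
    return p ? *p : fallback;
}

inline void accessors_example()
{
    std::optional<int> empty;
    IntOrString text = std::string("hello");
    assert(value_throws(empty));
    assert(empty.value_or(-1) == -1); // optional's built-in fallback accessor
    assert(get_int_throws(text));
    assert(int_or(text, -1) == -1);   // hand-written equivalent for variant
}
```

Note that variant has no built-in counterpart to value_or; int_or above is just one way to build it on top of get_if.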

Indeed, the similarities were so intentional that an inconsistency in the base classes used for optional and variant was cause for complaint (see this Google Groups discussion).

So to answer your question, it throws because optional throws. Bear in mind that the throwing behavior should be rarely encountered; you should use a visitor pattern with a variant, and even if you do call get, it only throws if the index or type you request is not the currently active alternative. All other misuses (such as an out-of-range index or a type that is not in the list) are considered ill-formed and should issue a compiler error.
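
A small sketch of the visitor approach, which never names the wrong alternative and so never hits bad_variant_access for a wrong type. The functions describe and describe_example are hypothetical names for illustration:

```cpp
#include <cassert>
#include <string>
#include <type_traits>
#include <variant>

// Hypothetical function: handles every alternative, so no std::get call is needed.
inline std::string describe(std::variant<int, std::string> const& v)
{
    return std::visit([](auto const& value) -> std::string {
        using T = std::decay_t<decltype(value)>;
        if constexpr (std::is_same_v<T, int>)
            return "int: " + std::to_string(value);
        else
            return "string: " + value;
    }, v);
}

inline void describe_example()
{
    std::variant<int, std::string> v = 7;
    assert(describe(v) == "int: 7");
    v = std::string("hi");
    assert(describe(v) == "string: hi");
}
```

std::visit still dispatches on the stored index, and it throws std::bad_variant_access if the variant is valueless_by_exception, so this avoids the wrong-type error rather than the check itself.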


As for why std::optional throws, if you check its proposal, N3793, having a throwing accessor was advertised as an improvement over Boost.Optional, from which std::optional was born. I haven't yet found any discussion about why this is an improvement, so for now I'll speculate: it was easy to provide both throwing and non-throwing accessors that satisfy both error-handling camps (sentinel values vs exceptions), and it additionally helps take some undefined behavior out of the language so you don't needlessly shoot yourself in the foot if you choose to go the potentially-throwing route.

Dupaix answered 16/2, 2018 at 0:3 Comment(6)
valueless_by_exception is not at all comparable to an empty optional. An optional being empty is a legitimate state; the object is completely functional (within its contract). A variant being valueless is not in a legitimate state. It can only get into that state via throwing an exception, and you pretty much can't do anything with a valueless variant. No visitation, no get, nothing.Interbrain
Also, the whole argument fails, because while optional::value throws, optional::operator* does not.Interbrain
@NicolBolas point taken about valueless_by_exception, but isn't "whole argument fails" a little extreme? I feel that the proposal makes it clear that the design was inspired by optional. variant has no comparable operator*, so it's hard to say that the behavior diverges thereDupaix
@Nicol the non throwing operator* is rather an optimization similar to vector::at vs vector::operator[]. The fact that no such optimization exists for variant could come from the fact that variant is supposed to protect you from invalid access (by throwing) in the first place, no?Iseult
@rubenvb: Which is my point. The choice not to have a non-checking get has nothing to do with optional; it purely has to do with the design of variant.Interbrain
@andyg - thank you! You are almost there, but as other commenters mentioned - I'd really like to know why no "operator*" like method.Exostosis
