What's the purpose of using a union with only one member?

2

101

While reading the Seastar source code, I noticed a union called tx_side that has only one member. Is this a hack to work around a particular problem?

For reference, here is the tx_side structure:

union tx_side {
    tx_side() {}
    ~tx_side() {}
    void init() { new (&a) aa; }
    struct aa {
        std::deque<work_item*> pending_fifo;
    } a;
} _tx;
Sussna answered 27/11, 2019 at 9:26 Comment(10)
Potential duplicate of #26572932.Shults
@MaxLanghof That question and its answers don't mention the purpose of using such a union.Sussna
Do you have an example of how this member is used?Estafette
That's why I didn't actually use my binding close vote. But I'm not sure what exactly you expect from answers to your question that doesn't follow directly from the answers over there. Presumably the purpose of using union instead of struct is one or more of the differences between the two. It's a pretty obscure technique so unless the original author of that code comes along I'm not sure somebody can give you an authoritative answer which problem they're hoping to solve with this (if any).Shults
@Estafette The seastar source code link in the question is the example.Sussna
@daoliker In the linked file the member _tx is only defined, not used. It would be helpful to see an example of actual usage of this variable.Estafette
@Estafette You can find its usage in smp.cc or reactor.ccSussna
My best guess is that the union is used either to delay construction (which is somewhat pointless in this case) or to prevent destruction (which leads to a memory leak) of pending_fifo. But it's hard to say without an example of usage.Goings
Two-phase initialization, C++11 editionForemost
It's not clear where or how the deque is destroyed.Foremost
S
100

Because tx_side is a union, tx_side() doesn't automatically initialize/construct a, and ~tx_side() doesn't automatically destruct it. This allows fine-grained control over the lifetime of a and pending_fifo, via placement-new and manual destructor calls (a poor man's std::optional).

Here's an example:

#include <iostream>

struct A
{
    A() {std::cout << "A()\n";}
    ~A() {std::cout << "~A()\n";}
};

union B
{
    A a;
    B() {}
    ~B() {}
};

int main()
{
    B b;
}

Here, B b; prints nothing, because a is neither constructed nor destructed.

If B were a struct, B() would call A(), and ~B() would call ~A(), and you wouldn't be able to prevent that.

Shrug answered 27/11, 2019 at 9:52 Comment(18)
Is the memory of object b filled with random bytes before I call constructor A()?Sussna
@daoliker not necessarily random, but unpredictable by you. Same as any other uninitialized variable. You can't assume it's random; for all you know it could hold the user's password that you previously asked them to type in.Makeup
@daoliker: The previous comment is too optimistic. Random bytes would have values in the range 0-255, but if you read an uninitialized byte into an int you may get 0xCCCCCCCC. Reading uninitialized data is Undefined Behavior, and what might happen is that the compiler simply discards the attempt. This is not just theory. Debian made this exact mistake, and it broke their OpenSSL implementation. They had some real random bytes, added an uninitialized variable, and the compiler said "well the result is undefined, so it might as well be zero". Zero obviously isn't random anymore.Lyndialyndon
@MSalters: Do you have a source for this claim? Because what I can find suggests that is not what happened: it wasn't the compiler that removed it, but the developers. Honestly, I'd be amazed if any compiler writer made such an incredibly bad decision. (see #45395935 )Teresita
@JackAidley: Which precise claim? You do have a good link, seems I got the story inverted. OpenSSL got the logic wrong, and used an uninitialized variable in such a way that a compiler could legally assume any result. Debian correctly spotted that, but broke the fix. As for "compilers making such bad decisions"; they don't make that decision. The Undefined Behavior is the bad decision. Optimizers are designed to run on correct code. GCC for instance actively assumes no signed overflow. Assuming "no uninitialized data" is equally reasonable; it can be used to eliminate impossible code paths.Lyndialyndon
@JackAidley I've encountered similar issues to what @Lyndialyndon mentions in my own code; I erroneously assumed an uninitialized variable would be empty, and was baffled when a subsequent != 0 comparison yielded true. I've since added compiler flags to treat uninitialized variables as errors to make sure I won't fall into that trap again.Ophthalmitis
@MSalters: Undefined behaviour only means that it is not defined by the Standards Committee; compiler writers are typically more pragmatic than the committee and don't behave egregiously just because the committee says it's UB.Teresita
@JackAidley This would be an example of clang assuming uninitialized contains zero: godbolt.org/z/s-pFbe (and unfortunately does not warn about it ...) -- correction: it actually sees it is undefined and does not even care what foo returns: godbolt.org/z/EQZmJrHurtful
@Lyndialyndon The Linux people were for a while not happy about the anal standard exegesis by the gcc team; another gotcha is aliasing.Metage
@MSalters: What term does the Standard use for non-portable but correct actions upon which it imposes no requirements?Franci
@JackAidley: People wishing to sell compilers to people who would need to write code for them behave sanely. Compiler writers who are exempt from market pressures, however, often exploit the Standard as an excuse to deride as "broken" constructs which would be non-portable but correct if written for any commercial compiler.Franci
@supercat: Provided it's correct, that would be Unspecified Behavior as opposed to Undefined Behavior.Lyndialyndon
@MSalters: The Standard uses the phrase "unspecified behavior" to indicate that an implementation may choose in arbitrary means from among a set of alternatives that is either explicitly given (e.g. f()+g() may either call f() and then g(), or call g() and then f(), but those are the only two choices) or implied (an "unspecified value" must be chosen from among the set of bit patterns a type could hold, however large or small that might be). Why would the Standard use the phrase "non-portable or erroneous" if they simply meant "erroneous"?Franci
@MSalters: Even in cases where 99% of implementations can and do process a construct identically, the Standard may still give implementations unlimited license to deviate from such behavior if the benefits of doing so would substantially exceed the benefits of following precedent, on the presumption that people seeking to sell compilers would only do so in cases that would genuinely benefit their customers. On a two's-complement platform where neither int nor unsigned has padding bits, the behavior of -1<<1 was unambiguously defined under C89. C99 recharacterized it as UB because...Franci
...on other platforms it might make sense to process such shifts in ways that might raise a signal at a time other than when the shift is performed (e.g. if the compiler holds off on performing a shift until it knows whether the result will be needed). Because any action that would allow the effects of a potential useful optimization to be observable must be classified as UB, what had been fully defined behavior on C89 was reclassified as UB without a peep in the Rationale. Was that intended to forbid the construct on two's-complement platforms without padding bits? If so, why?Franci
@supercat: While I was active in WG21 around that time, I never participated in WG14, so I can't answer that.Lyndialyndon
@MSalters: Why does the Standard specify that actions characterized as Undefined Behavior may be processed "In a documented fashion characteristic of the environment" if it does not intend that implementations extend the language with such semantics when doing so would be useful? And what do you make of N1570 5.1.2.3 paragraph 9, "An implementation might define a one-to-one correspondence between abstract and actual semantics: at every sequence point, the values of the actual objects would agree with those specified by the abstract semantics." if not intended to invite such extensions?Franci
@MSalters: Also, I'm curious what you perceive as the range of situations in which any parts of the Standard would actually have any normative authority with respect to non-trivial programs for freestanding implementations, or for freestanding implementations themselves. IMHO, the Standard's definitions of "conformance" are severely lacking, and this weakness is at the heart of most controversies regarding the Standard.Franci
E
-1

In simple terms, a single-member union does not construct its member or initialize the allocated memory until you explicitly do so. Similar functionality is available through std::optional in C++17.

Ejaculation answered 30/11, 2019 at 8:11 Comment(2)
That's a misleading answer. Union with only one member will have the same size as the member. This memory simply won't be initialized till the member is initialized.Puppetry
std::optional will be a bit bigger as it needs to record if it is in an engaged state.Rutharuthann
