std::start_lifetime_as and UB in C++23 multithreaded application

Assuming X and Y are suitable types for such usage, is it UB to use std::start_lifetime_as<X> on an area of memory in one thread as one type and use std::start_lifetime_as<Y> on the exact same memory in another thread? Does the standard say anything about this? If it doesn't, what is the correct interpretation?

Payroll answered 20/12, 2022 at 4:48 Comment(18)
Why do threads matter? Can you do it without threads?Calumnious
@n.m. I guess OP is asking if unsynchronized start_lifetime_as is a data race.Paramaribo
@Paramaribo maybe, maybe not, but you cannot do anything with the resulting objects anyway.Calumnious
Here's a better question: is unsynchronized placement new on the same memory a data race? Because I can't find the answer to that either. [new.delete.dataraces] suggests that it's not a data race somehow, but that cannot possibly make sense, as placement new shouldn't do anything.Urina
@NicolBolas Perhaps a necessary preliminary question is what start_lifetime_as does differently from placement new.Calumnious
@n.m.: The functional difference is that it guarantees the retention of the contents of the storage. It's specified to basically work as if you initialized an object in that storage by doing a bit_cast on the data in that memory, except that no accesses are performed. New expressions do not promise to preserve the bytes in the storage.Urina
@n.m.it's my understanding (someone correct me if I'm wrong here) that where placement new always calls a constructor, start_lifetime_as may not.Payroll
@markt1964: Neither of those is true. new(memory) T; does not call a constructor of T if it is trivially default constructible. The created object is left uninitialized. And start_lifetime_as never calls a constructor. It initializes the object with the data in the storage.Urina
@NicolBolas If you cannot guarantee that no memory is modified, then you cannot guarantee that no data race exists. [new.delete.dataraces] seems to talk about memory allocation functions and not operators.Calumnious
so then what is the difference between placement new of an object with a default trivial constructor and start_lifetime_as?Payroll
@markt1964: It's what I said earlier: the bytes already in the storage are preserved with start_lifetime_as, while they are not with placement-new. When I said "uninitialized", that doesn't mean unchanged. When an object is not initialized, its value is unspecified. start_lifetime_as specifies the value of the object.Urina
Placement new expression may modify the memory just out of spite (or for debug purposes) before calling the (do-nothing) constructor.Calumnious
@n.m. "Placement new expression may modify the memory just out of spite (or for debug purposes)": this has nothing to do with data races as described in C++.Merriweather
@LanguageLawyer data race exists when two unsynchronised operations conflict, which happens when one of them writes to a memory location which the other one accesses. If new expression is an operation that is allowed to write to a memory location, then it potentially conflicts with another such operation.Calumnious
@n.m. memory location is, ignoring bit fields, a scalar object. Which memory location a placement new is allowed to write, for example?Merriweather
@LanguageLawyer any location that is a subobject of an object it creates for example. Or any byte at that memory region.Calumnious
@n.m. "any location that is a subobject of an object it creates for example": and how does it race with the placement new in another thread? It would need to access the same subobject. "Or any byte at that memory region": a «byte» (of storage, I assume) is not considered an object.Merriweather
@LanguageLawyer sorry I've had my morning dose of language lawyering already.Calumnious

Object lifetime is actually one of the more underspecified parts of the standard, especially when it comes to concurrency (and in some places the wording is outright defective IMO), but I think this specific question is answerable with what's there.

First, let's get data races out of the way.

[intro.races]/21:

The execution of a program contains a data race if it contains two potentially concurrent conflicting actions [...]

[intro.races]/2:

Two expression evaluations conflict if one of them modifies a memory location and the other one reads or modifies the same memory location.

[intro.memory]/3:

A memory location is either an object of scalar type that is not a bit-field or a maximal sequence of adjacent bit-fields all having nonzero width.

Two unrelated objects are definitely not the same 'memory location', so [intro.races]/21 doesn't apply.

However, [intro.object]/9 says:

Two objects with overlapping lifetimes that are not bit-fields may have the same address if one is nested within the other, or if at least one is a subobject of zero size and they are of different types; otherwise, they have distinct addresses and occupy disjoint bytes of storage.

This means that out of any two (unrelated) objects with overlapping storage, at most one can be within lifetime at any given point. [basic.life]/1.5 ensures this:

The lifetime of an object o of type T ends when: [...]

  • the storage which the object occupies is released, or is reused by an object that is not nested within o.

Accessing (reading or writing) an object outside its lifetime is not allowed ([basic.life]/4), and we've just established that X and Y can't both be within lifetime at the same time. So, if both threads proceed to access the created objects, the behavior is undefined: at least one will be accessing an object whose lifetime has ended.
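
If both threads only need to read the bytes, one way to sidestep the lifetime problem is to never create an object in the shared storage at all. The following is a minimal sketch, not something from the answer above: each thread copies the bytes into its own local object with `std::memcpy`. The types `X` and `Y`, the global `storage`, and the helpers `fill_storage`, `copy_out` and `read_concurrently` are all hypothetical names chosen for illustration, and the sketch assumes the storage is fully written before the threads start.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <thread>
#include <type_traits>
#include <utility>

// Hypothetical example types: trivially copyable and of equal size.
struct X { std::uint32_t a; std::uint32_t b; };
struct Y { std::uint16_t h[4]; };
static_assert(sizeof(X) == sizeof(Y));
static_assert(std::is_trivially_copyable_v<X> && std::is_trivially_copyable_v<Y>);

// Shared raw bytes. No X or Y object is ever created inside this array.
std::array<std::byte, sizeof(X)> storage{};

// Must be called before any reader thread is started.
void fill_storage(std::byte value) { storage.fill(value); }

// Copies the shared bytes into a thread-local object of type T.
// Only std::byte objects in `storage` are read, so concurrent calls
// are concurrent reads and do not conflict.
template <class T>
T copy_out() {
    T out;
    std::memcpy(&out, storage.data(), sizeof(T));
    return out;
}

// Two threads view the same bytes as different types at the same time.
std::pair<X, Y> read_concurrently() {
    X x{};
    Y y{};
    std::thread tx([&x] { x = copy_out<X>(); });
    std::thread ty([&y] { y = copy_out<Y>(); });
    tx.join();
    ty.join();
    return {x, y};
}
```

The cost is a copy per read, but no object lifetime in the shared storage is started or ended, so the storage-reuse rule in [basic.life]/1.5 never comes into play.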

Phylloxera answered 20/12, 2022 at 17:5 Comment(11)
What does it mean to end the lifetime of an object from another thread whose lifetime was started in this one when no memory has been altered and therefore would not have been UB if this thread were the only one running?Payroll
It seems to me that the notion of lifetime here is referring only to the bookkeeping that the compiler does, and as far as the compiler is concerned, the lifetime of the object is still active, so why would it be UB?Payroll
@Payroll How do you know the program (in the form of the compiler) does not manage a great directory of which object type is stored at which location?Strontianite
If both objects were within lifetime, you'd be able to write to both concurrently without incurring a data race or other forms of UB. Given that they are not, if you stick to just reading, the implementation will probably behave as you'd expect, but per the standard rules, it's still UB. The C++ object model is simply not equipped to handle the case of two objects existing concurrently in the same storage.Phylloxera
If thread A starts the lifetime of an object at a region of memory, and thread B does not modify that region of memory in any way, how can the fact that thread B might try to start the lifetime of a different object at that location (again without changing memory) possibly exhibit UB in A? As you say, it will probably do what I expect, but how could it not? Again, we are talking about two threads reading here, no writing is happening. Writing can produce a data race even if two threads are working with the exact same type.Payroll
Well, you asked what the standard had to say about this situation, and from the standard's POV, whether any writing happens does not make a difference. Perhaps the authors did not consider this specific scenario important enough to carve out an exception in the rules for it. (Remember, if any writing did happen, it wouldn't be a 'data race' under the current definition, instead it's the lifetime rules that would make it UB. And those rules apply uniformly to reads and writes.)Phylloxera
But when no writing occurs, isn't object lifetime just a concept at the same scope as that which began the lifetime, and therefore isolated to that thread? Or does lifetime inherently transcend local scope to reflect the entire machine's state? Either way, you've answered my original question.... I'm just curious.Payroll
@markt1964: If they were both in lifetime, what would happen if the threads shared pointers to those objects? We don’t want them to alias.Ayn
I would imagine that nothing would or ever even could happen, as long as neither thread makes any changes to the area of memory it occupies, and that either thread would be able to read the object that exists there without any UB, as long as that memory contains an arrangement of bits that represents an otherwise valid constructed instance of either thread's type.Payroll
@Payroll It could be UB just for the reason the standard says it is. Sometimes otherwise valid programs are kept UB by oversight or to keep design space open for future standards. Perhaps in a future C++, threads will execute less independently than they do now. C++ is also run on devices like GPUs or even sometimes transformed to HDLs (hardware description languages). One should be able to reason about objects in memory, which is more difficult for different objects at the same memory location. If neither memory location is written to, then the program is not very useful.Strontianite
Presumably, the memory location could have been written to already, and two different threads running at a later time would be trying to access the same data that was there, each in their own and possibly different ways. I had previously thought that start_lifetime_as was basically the same thing as reinterpret_cast, but differs in that the former explicitly tells the compiler that the lifetime of the object has started and is therefore less likely to result in UB.Payroll

There is no data race from such calls, since none of them access any memory locations, but since (without synchronization) neither thread can know that the other has not ended the lifetime of its desired object by reusing its storage for an object of the other type, the objects created cannot be used. (There are not “even odds” that one thread can use them because it “went last”: there is an execution where it didn’t, so relying on that would have undefined behavior.)
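
For contrast, here is a rough sketch of what the synchronized case might look like: thread B starts the lifetime of `Y` only after thread A has published (release/acquire) that it is done with `X`, so each thread knows its object is within lifetime when it uses it. Everything named here is an assumption for illustration: the types `X` and `Y`, the `Result` struct, and the helpers `begin_lifetime_as` and `handoff`. Because few standard libraries ship `std::start_lifetime_as` yet, the helper falls back to the commonly cited `std::memmove` plus `std::launder` emulation when the feature-test macro is absent; treat that fallback as an approximation, not as equivalent wording.

```cpp
#include <atomic>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <memory>
#include <new>
#include <thread>
#include <type_traits>
#include <version>

// Hypothetical example types: implicit-lifetime, trivially copyable, same size.
struct X { std::uint32_t a; std::uint32_t b; };
struct Y { std::uint16_t h[4]; };
static_assert(sizeof(X) == sizeof(Y));

// Hypothetical wrapper: uses std::start_lifetime_as where the library has it,
// otherwise an emulation that relies on memmove implicitly creating objects.
template <class T>
T* begin_lifetime_as(void* p) {
    static_assert(std::is_trivially_copyable_v<T>);
#if defined(__cpp_lib_start_lifetime_as)
    return std::start_lifetime_as<T>(p);
#else
    return std::launder(static_cast<T*>(std::memmove(p, p, sizeof(T))));
#endif
}

struct Result { std::uint32_t x_sum; std::uint32_t y_sum; };

Result handoff() {
    alignas(std::max_align_t) std::byte storage[sizeof(X)];
    std::memset(storage, 0x01, sizeof storage);  // written before the threads start

    std::atomic<bool> x_done{false};
    Result r{};

    std::thread a([&] {
        X* x = begin_lifetime_as<X>(storage);
        r.x_sum = x->a + x->b;                          // last use of the X object
        x_done.store(true, std::memory_order_release);  // publish "done with X"
    });
    std::thread b([&] {
        while (!x_done.load(std::memory_order_acquire)) // wait for A's last use
            std::this_thread::yield();
        Y* y = begin_lifetime_as<Y>(storage);           // reuses storage, ends X
        r.y_sum = static_cast<std::uint32_t>(y->h[0]) + y->h[1] + y->h[2] + y->h[3];
    });

    a.join();
    b.join();
    return r;
}
```

The point of the flag is ordering, not protection of the bytes: it guarantees that A's use of `X` happens before B reuses the storage for `Y`, which is exactly what the unsynchronized version cannot establish.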

Ayn answered 20/12, 2022 at 8:19 Comment(4)
Could you please clarify? I would like to understand how the lifetime of something can end just because the lifetime of something else started in another thread at the same memory area when no memory has actually been changed? Why would it be UB to use either object when neither thread mutates the bits in the memory, and it would not be UB if the other thread had not been running?Payroll
@markt1964: [basic.life]/1.5 describes ending lifetimes via storage reuse, and /4 prohibits most use of the objects that might be out of lifetime. With just one thread, the lifetime doesn’t end, so it’s fine.Ayn
I don't think there are even odds that one of the objects is alive in the first place. In hardware, you might get lucky and avoid data races by coincidence, but from a C++ standard perspective, any two writes to the same memory location are a data race if neither happens before the other. This is the case even if in real time, the writes are 10 hours apart. I assume that the same principle should apply to object lifetimes, where if two object lifetimes begin in the same storage and neither happens before the other, this is in itself UB.Matlock
CWG1953 presents a very similar problem. Unfortunately we don't have robust wording to explain what happens in these weird "simultaneous lifetime" cases, except that [intro.object] p9 claims that they cannot occur (though this is contradicted by other, defective wording).Matlock
